ChineseHat / colorsofneworleans


hashtag typo #1

Closed invisibleink closed 11 years ago

invisibleink commented 11 years ago

I am poking around the silex branch and learning a lot. Thanks, Doug! I found one small typo in the tables.sql. Line 88 has mardigrad instead of mardigras, so those tags are not getting picked up correctly. The tweet_hashtags table has some NULLs as a result.

invisibleink commented 11 years ago

I just pushed a tiny fix.

dmiller-iseatz commented 11 years ago

Good catch. There may be some other typos too. There were many instances of hashtag typos at one point.

Your fix is mostly correct, but the md5 sum needs to be generated for 'mardigras' too. The md5 sum is used as a unique key on the table to prevent duplicate hashtag entries.

I can't do it now, but the best fix would be to update the record via mysql, then re-export the database:

UPDATE hashtags SET hashtag = 'mardigras', hashtag_md5 = MD5('mardigras') WHERE id = 4
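
For anyone following along, here is a minimal sketch of why the hashtag and its md5 have to change together. It is not the project's code: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the table layout (`id`, `hashtag`, `hashtag_md5` with a UNIQUE constraint) and the helper names `md5_hex` and `fix_hashtag` are assumptions made for illustration.

```python
import hashlib
import sqlite3


def md5_hex(tag: str) -> str:
    """Return the 32-character hex MD5 of a hashtag (the same form MySQL's MD5() returns)."""
    return hashlib.md5(tag.encode("utf-8")).hexdigest()


def fix_hashtag(conn: sqlite3.Connection, row_id: int, corrected: str) -> None:
    """Update the hashtag text and its md5 in one statement so they stay in sync."""
    conn.execute(
        "UPDATE hashtags SET hashtag = ?, hashtag_md5 = ? WHERE id = ?",
        (corrected, md5_hex(corrected), row_id),
    )
    conn.commit()


# Hypothetical schema: the UNIQUE constraint on hashtag_md5 is what blocks duplicates.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hashtags ("
    " id INTEGER PRIMARY KEY,"
    " hashtag TEXT NOT NULL,"
    " hashtag_md5 TEXT NOT NULL UNIQUE)"
)

# Seed the row with the typo, then repair it.
conn.execute(
    "INSERT INTO hashtags (id, hashtag, hashtag_md5) VALUES (?, ?, ?)",
    (4, "mardigrad", md5_hex("mardigrad")),
)
fix_hashtag(conn, 4, "mardigras")
```

If only the `hashtag` column were corrected, the stored md5 would still belong to the misspelled tag, so a later insert of 'mardigras' would not collide with the unique key and a duplicate row could slip in.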

invisibleink commented 11 years ago

Thanks Doug. I actually did generate the md5 with a mysql select statement, used that md5 to reinstall, and the md5 was truncated. I then updated the md5 to the new truncated one. I will redo it as you say and commit shortly if it comes out differently.

Thanks for the explanation! Looking forward to helping in whatever way I can.

(Sent by email in reply to dmiller-iseatz's comment of Jul 15, 2013, 1:46 PM, quoted above.)

invisibleink commented 11 years ago

Update: The typo was only in the hashtag, not in the md5 hash.
But I have this question: Would there be any advantage/disadvantage to using a longer database field? The MD5s are coming out to 32 characters, but the column is currently a binary(16) field.
Pardon the noob question, and thanks again!
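
As background for the field-length question, a small sketch of the two sizes involved. An MD5 digest is 128 bits, which is 16 raw bytes or 32 hexadecimal characters. In MySQL, MD5() returns the 32-character hex string and UNHEX() converts hex to raw bytes, so a 16-byte column fits the raw form but not the hex form. The snippet below uses Python's hashlib rather than MySQL, and the variable names are illustrative only.

```python
import hashlib

tag = "mardigras"

# Hex form: what MySQL's MD5() returns, 32 ASCII characters.
hex_digest = hashlib.md5(tag.encode("utf-8")).hexdigest()

# Raw form: the same 128 bits as 16 bytes, comparable to UNHEX(MD5(...)).
raw_digest = hashlib.md5(tag.encode("utf-8")).digest()

# What a 16-byte column would hold if the hex string were cut to fit:
# only the first 16 hex characters, i.e. half of the digest.
truncated = hex_digest.encode("ascii")[:16]

print(len(hex_digest), len(raw_digest), len(truncated))
```

Under these assumptions, a truncated hex value still fills the 16 bytes but carries only 64 bits of the hash, which would be consistent with the truncation mentioned earlier in the thread. Storing the raw 16 bytes keeps the full digest in the existing column width, while a 32-character field would keep the readable hex form at twice the size.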