ptwobrussell / Mining-the-Social-Web

The official online compendium for Mining the Social Web (O'Reilly, 2011)
http://bit.ly/135dHfs

uppercase letters causing issues when used as couchdb table names (the_tweet__harvest_timeline.py) #8

Closed rasper121 closed 13 years ago

rasper121 commented 13 years ago

When running the_tweet__harvest_timeline.py, I got errors at the point where the database is created whenever a Twitter screen name contained uppercase letters. I found that the CouchDB install on my Windows machine does not, by default, allow uppercase letters in database names, which is a problem because Twitter screen names often contain uppercase letters. If anyone else's CouchDB installation has this limitation, I recommend pre-processing the screen names to lowercase them before using them as part of a CouchDB database name. Instead of line 66, which reads DB = '%s-%s' % (DB, USER), my code uses the following lines:

    screennamelower = USER.lower()  # added to make names lowercase
    DB = '%s-%s' % (DB, screennamelower)
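For anyone who wants to check the name before CouchDB rejects it, here is a minimal sketch of the same workaround wrapped in a helper. It assumes the naming rule from the CouchDB documentation (a database name starts with a lowercase letter and otherwise contains only lowercase letters, digits, and the characters _ $ ( ) + - /). The function name make_db_name and the prefix used in the example are hypothetical and are not taken from the script.

```python
import re

# Assumed rule, per the CouchDB docs: a database name starts with a lowercase
# letter, followed by lowercase letters, digits, or any of _ $ ( ) + - /
VALID_DB_NAME = re.compile(r'^[a-z][a-z0-9_$()+/-]*$')


def make_db_name(prefix, screen_name):
    """Build '<prefix>-<screen_name>' with the screen name lowercased.

    make_db_name is a hypothetical helper, not part of the original script.
    Raises ValueError if the result still breaks the assumed naming rule.
    """
    name = '%s-%s' % (prefix, screen_name.lower())
    if not VALID_DB_NAME.match(name):
        raise ValueError('not a valid CouchDB database name: %r' % name)
    return name


if __name__ == '__main__':
    # 'tweets-user-timeline' is an illustrative prefix only.
    print(make_db_name('tweets-user-timeline', 'ptwoBrussell'))
```

Only the screen name is lowercased here, as in the two lines above; a prefix that itself contains uppercase letters would still be rejected by the check rather than silently changed.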

ptwobrussell commented 13 years ago

Sorry for the delay in responding to you. I didn't realize that this would be an issue. Thanks for pointing it out. Have you discovered any other instances where screen names should be normalized to lowercase? If you could send over a pull request with this change, and any others you have found, I'd appreciate it. Thanks again.