taspinar / twitterscraper

Scrape Twitter for Tweets
MIT License

CSV output, slow, timestamp issues #192

Open steeley opened 5 years ago

steeley commented 5 years ago

Running with Python 3.7.3 on OSX Sierra, installed from a git download. The app seems to run without errors, although it is very slow, even on simple queries over a short period where I know there are only 20-30 tweets. The CSV output was better once I used tabs as the delimiter. The timestamp field is often empty, or contains hashtags or URLs. It also needs an option to overwrite the output file; at the moment it reports 'file exists' and then quits.

All this makes it difficult to get usable data.

taspinar commented 5 years ago

The new version will have ";" as a default separator for the CSV output file and the command line argument "-ow / --overwrite" to indicate you want existing output files to be overwritten.

steeley commented 5 years ago

Just tried the latest version, but there seem to be extra return characters that are messing up the CSV file, so it is still not working.

taspinar commented 5 years ago

Do you have any suggestions? One option could be to save it to CSV with pandas, but I don't want to add pandas as a requirement just for saving the tweets to CSV.
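For reference, a minimal sketch of a pandas-free approach using only the standard library `csv` module. It assumes the stray return characters come from line breaks inside the tweet text, which this thread has not confirmed. The column names and the `write_rows` / `read_rows` helpers are hypothetical and are not part of twitterscraper. Quoting every field and using `newline=""` keeps an embedded line break inside one CSV field instead of starting a new record.

```python
import csv
import io

# Hypothetical column names, chosen only for illustration.
FIELDS = ["timestamp", "user", "text"]


def write_rows(fileobj, rows, delimiter=";"):
    """Write dict rows as CSV, quoting every field so embedded
    delimiters, quotes, and line breaks stay inside their field."""
    writer = csv.DictWriter(
        fileobj,
        fieldnames=FIELDS,
        delimiter=delimiter,
        quoting=csv.QUOTE_ALL,
    )
    writer.writeheader()
    writer.writerows(rows)


def read_rows(fileobj, delimiter=";"):
    """Read the CSV back into a list of dict-like rows."""
    return list(csv.DictReader(fileobj, delimiter=delimiter))


# Made-up example row with a line break, a delimiter, and quotes in the text.
rows = [
    {
        "timestamp": "2019-05-01 12:00:00",
        "user": "example",
        "text": 'line one\nline two; with "quotes" #tag',
    }
]

# newline="" disables newline translation, as the csv docs recommend
# for files; with a real file use open(path, "w", newline="").
buf = io.StringIO(newline="")
write_rows(buf, rows)
buf.seek(0)
round_tripped = read_rows(buf)
```

A file written this way still contains physical line breaks inside quoted fields, so it has to be read with a CSV-aware parser rather than split line by line.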

steeley commented 5 years ago

Not sure... the script works well apart from this, although it is a bit slow. Maybe it is a text encoding issue. I don't know where these extra return characters are coming from; perhaps the double quotes (" ") are getting messed up? Maybe CSV export in Python 3.7 has issues. CSV read/write is normally easy.
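If the extra return characters do turn out to come from line breaks inside the tweet text (an assumption, not something established here), one possible workaround is to flatten the text before it is written, so that each record occupies exactly one physical line. A minimal sketch; `flatten_text` is a hypothetical helper, not an existing twitterscraper function:

```python
import re

# Runs of carriage returns, line feeds, and the Unicode line and
# paragraph separators, which some editors also treat as line breaks.
_LINE_BREAKS = re.compile(r"[\r\n\u2028\u2029]+")


def flatten_text(text):
    """Replace each run of line-break characters with a single space
    and trim leading and trailing whitespace."""
    return _LINE_BREAKS.sub(" ", text).strip()


flattened = flatten_text("first line\r\nsecond line\n\nthird line\n")
```

This loses the original line structure of the tweet, so it is a trade-off: simpler files for spreadsheet tools and line-based scripts, at the cost of not preserving the text exactly.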