Closed gskarp closed 4 years ago
What does the first line of your file corona_tweets_01.txt look like?
Hmmm, where did you get that from? It looks like the id file has been opened with Excel or something that truncated all the IDs (see how they all end in zero?).
Are you sure there isn't something else on line 1? That error gets thrown when it finds a line without a number on it: https://github.com/DocNow/hydrator/blob/master/app/utils/twitter.js#L15
If you are able to upload the id file here, or send it to me by email at ehs@pobox.com, I can try to debug further.
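In the meantime, here is a rough way to check the id file yourself before loading it. This is only a sketch and not the Hydrator's own code: it assumes a valid line contains nothing but ASCII digits, and the function names `find_bad_lines` and `looks_truncated` are made up for illustration. The trailing-zeros check is a heuristic for the Excel damage described above, not a definitive test.

```python
import re

# Assumption: a valid line holds nothing but ASCII digits (one tweet id per line).
ID_PATTERN = re.compile(r"[0-9]+")


def find_bad_lines(lines):
    """Return (line_number, text) pairs for lines that are not a bare numeric id."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not ID_PATTERN.fullmatch(text):
            bad.append((number, text))
    return bad


def looks_truncated(lines, zeros=3):
    """Heuristic: True if every numeric line ends in at least `zeros` zeros."""
    ids = [line.strip() for line in lines if ID_PATTERN.fullmatch(line.strip())]
    return bool(ids) and all(i.endswith("0" * zeros) for i in ids)
```

To use it, read the file with `open("corona_tweets_01.txt", encoding="utf-8").readlines()` and pass the result to both functions. Note that blank lines are reported as bad lines too, so a stray empty line in the file will show up in the output.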
Yes, I actually opened it in Excel to erase a second column, which produced the same error. I originally downloaded the CSV files from here: https://ieee-dataport.org/open-access/corona-virus-covid-19-tweets-dataset
I realize now that the mistake was in the way I used Excel. With the plain CSV, things are completely different.
Yes, unfortunately Excel is known to mangle numbers and dates. Definitely be wary of it. I would install csvkit and then cut out the column you want into a new file.
csvcut -c tweet_id corona_tweets_01.csv > corona_tweet_ids_01.csv
We've talked about adding functionality to allow people to load tweet ids from arbitrary CSV files by having people select the column. But until that's available I'm afraid you will need to select out the data in some other way.
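If installing csvkit is not an option, one other way is a few lines of Python using only the standard library. This is a minimal sketch, assuming the CSV has a header row with a `tweet_id` column (the column name used in the csvcut command above); the function name `extract_ids` and the file names in the usage note are illustrative. The values are read and written as text and never converted to numbers, so the ids cannot be rounded the way Excel rounds them.

```python
import csv


def extract_ids(csv_path, out_path, column="tweet_id"):
    """Copy one column of a CSV file to a text file, one value per line.

    Values are handled as strings throughout, so long ids are never rounded.
    No header line is written to the output. Returns the number of ids written.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        # Fail early if the expected column is not in the header row.
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {csv_path}")
        for row in reader:
            value = (row[column] or "").strip()
            if value:  # skip rows where the column is empty
                dst.write(value + "\n")
                count += 1
    return count
```

For example, `extract_ids("corona_tweets_01.csv", "corona_tweet_ids_01.txt")` would write one id per line to the text file and return how many were written.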
I'm going to close this, but please reopen if it doesn't seem resolved.
While trying to start hydrating a txt file with only the tweet ids, I get the following message