What do you see on line 1 of the file?
Hi, after going through previous issues, I followed the advice to remove every column that is not the ID. The problem I am facing now is that certain lines still come up with errors, and those IDs seem to end in 0. Is there a fix? Once I remove one row in Sheets and re-download the file, I get another error on a line further down, so I would like to avoid going through the data manually and removing rows one at a time.
@PujanWho I was able to bring tweets_06 into Google Sheets, remove columns B-D, and export the resulting CSV back to my desktop. Be sure to UNcheck the option to convert text to numbers, or Sheets will change some of the tweet IDs to scientific notation. Pretty sure you also want to include .json when you're providing the output filename (@edsu why doesn't that default?)
Hydrator is now slowly chugging through 1,771,295 ids. I don't think it'll be done by the end of my work day, but I'll try to remember to post how many it finished with, just for comparison's sake. :-)
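For anyone who would rather skip the spreadsheet round trip, the same column removal can be sketched in Python. This is a minimal sketch, not part of Hydrator: the function name `extract_tweet_ids` and the file names are hypothetical, and it assumes the tweet ID is the first column of the CSV. IDs are handled as text the whole way through, so nothing gets converted to scientific notation.

```python
import csv


def extract_tweet_ids(src_path, dst_path):
    """Copy the first column of a CSV to dst_path, one tweet ID per line.

    Assumption: the tweet ID is the first column. Values are kept as
    strings and never parsed as numbers. Returns a list of
    (line_number, value) pairs for rows whose first cell is not a
    plain run of digits, so they can be inspected instead of hunted
    down one error at a time.
    """
    skipped = []
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line_number, row in enumerate(csv.reader(src), start=1):
            if not row:
                continue  # blank line
            candidate = row[0].strip()
            if candidate.isascii() and candidate.isdigit():
                dst.write(candidate + "\n")
            else:
                skipped.append((line_number, candidate))
    return skipped
```

Calling `extract_tweet_ids("tweets_06.csv", "tweets_06_ids.txt")` (both file names are placeholders) writes an ID-only file and returns every rejected row at once, including any ID that was already turned into scientific notation by an earlier spreadsheet export.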
@PujanWho beware, Excel will invalidate the Tweet IDs unfortunately -- the numbers overflow :-(
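To illustrate why this happens: tweet IDs are 19-digit integers, which is more precision than a 64-bit floating-point number can hold, so any tool that parses them as numbers may silently change the trailing digits. The sketch below is only a demonstration of that precision loss in Python, not a description of Excel's internals; the function name and the sample ID are made up for illustration.

```python
def survives_float_round_trip(id_text):
    """Return True if the ID is unchanged after being parsed as a float.

    A 64-bit float has a 53-bit mantissa, so most 19-digit integers
    cannot be represented exactly and come back with different digits.
    """
    original = int(id_text)
    return int(float(id_text)) == original


# Hypothetical 19-digit ID, used only to show the effect.
sample_id = "1254022770746372097"
print(sample_id, "survives:", survives_float_round_trip(sample_id))
```

This is why it is safer to treat tweet IDs as text from start to finish.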
@ppival the output filename has no effect on the behavior of Hydrator, other than where it writes the data.
Oh I know, @edsu, it has just always seemed weird: if it's going to output .json, why do I have to explicitly tell it to do so?
This was the perfect fix, thank you so much. In the meantime I had started using another dataset that contains only tweet IDs, "https://github.com/echen102/COVID-19-TweetIDs", in case anyone wants to skip the Sheets steps and just have tweet IDs off the bat. But your solution has allowed me to use the original dataset(s) that I wanted to, so thank you very much.
I am having an issue hydrating some Twitter data from https://ieee-dataport.org/open-access/coronavirus-covid-19-tweets-dataset. I am specifically trying to hydrate file 06, but I keep getting the error "Invalid Tweet ID on Line 1".