Open mcolemann opened 3 years ago
Hydrator doesn't do that currently, but that would be a good option to add potentially. If you need to do it now you can use twarc's conversation command.
Thank you very much! Is twarc compatible with the .csv files from the Hydrator?
It depends on what you are doing. Do you want to get all the conversation threads in the CSV file you generated?
Actually I am trying to identify some good case studies for the conversation threads. I have 33 .csv files (each around 300-500MB) and I would like to reconstruct all the threads (to better identify the case studies). Do you know how I can do this?
Do you have access to the Academic Research product track, which allows searching the historical archive?
In theory it ought to be possible if you extract the tweet ids from your CSVs into a file, e.g. ids.txt, and then run twarc conversations ids.txt --archive conversations.json to collect all the threads. It could take a while depending on the sizes of the threads you encounter. But these are all questions for the twarc issue tracker I guess.
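For the id extraction step, here is a minimal sketch in Python. It assumes each Hydrator CSV has a header row with a column named id holding the tweet id; the function name extract_ids and the id_column parameter are made up for illustration, so check your own files' headers first. Files are read row by row, so the 300-500MB exports are never loaded whole, though the set of unique ids is kept in memory.

```python
import csv


def extract_ids(csv_paths, out_path, id_column="id"):
    """Write the unique tweet ids found in csv_paths to out_path, one per line.

    id_column="id" is an assumption about the CSV header; adjust if needed.
    Returns the number of unique ids written.
    """
    seen = set()
    with open(out_path, "w", encoding="utf-8") as out:
        for path in csv_paths:
            with open(path, newline="", encoding="utf-8") as f:
                # DictReader streams one row at a time.
                for row in csv.DictReader(f):
                    tweet_id = (row.get(id_column) or "").strip()
                    # Skip blank cells and ids already written.
                    if tweet_id and tweet_id not in seen:
                        seen.add(tweet_id)
                        out.write(tweet_id + "\n")
    return len(seen)
```

Calling extract_ids(sorted(pathlib.Path("hydrated").glob("*.csv")), "ids.txt"), where hydrated is a hypothetical folder holding the exports, would produce the ids.txt file mentioned above.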
Unfortunately I don't think I have access to the Academic Research product track... How can I get access to it?
Thanks a lot! Then I will open an issue for twarc.
If you are studying or working at a university you can apply. The main difference is that you can access 10 million tweets a month from Twitter's V2 API (usually limited to 100,000/month). The V2 API includes things like reply_count for tweets, as well as the conversation_id for a tweet, which lets you easily collect all the tweets in a thread. And most importantly, the Academic Research track lets you search the full archive of tweets rather than just the last week.
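To illustrate how conversation_id makes thread reconstruction straightforward once the tweets are collected, here is a rough sketch in Python. It assumes each tweet is a dict shaped like a V2 API tweet object, with string id and conversation_id fields and an optional referenced_tweets list; the function names group_threads and reply_parent are hypothetical, and loading the tweets from your collected JSON is left out.

```python
from collections import defaultdict


def group_threads(tweets):
    """Group tweet dicts by conversation_id.

    Within each thread, tweets are sorted by numeric id, which follows
    posting order because tweet ids increase over time.
    """
    threads = defaultdict(list)
    for tweet in tweets:
        threads[tweet["conversation_id"]].append(tweet)
    for thread in threads.values():
        thread.sort(key=lambda t: int(t["id"]))
    return dict(threads)


def reply_parent(tweet):
    """Return the id of the tweet this one replies to, or None."""
    for ref in tweet.get("referenced_tweets") or []:
        if ref.get("type") == "replied_to":
            return ref["id"]
    return None
```

group_threads gives one list per thread, and reply_parent can then be used to link each reply to its parent if you need the tree structure rather than a flat, time-ordered list.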
Hi everyone!
Does anyone have code to reconstruct the threads?
Thanks a lot!