DocNow / catalog

A simple catalog of Twitter ID Datasets
http://catalog.docnow.io/

Add AOC replies dataset #57

Closed (mapmeld closed this 5 years ago)

mapmeld commented 5 years ago

Hi, I collected these Tweets by scanning replies, which is generally not easy to do using the Twitter API. Someone suggested that this might be a dataset for DocumentingTheNow, so I created a file of just Tweet IDs and am submitting a pull request to add it to the datasets YAML.

edsu commented 5 years ago

I'm sorry I missed this @mapmeld! I can merge as is, but would you be willing to add a little bit to the description about how you performed the data collection? Particularly important are what tools you used to collect the data and what API endpoints you used.

mapmeld commented 5 years ago

@edsu Should I add to the description here or in a follow-up PR?

The Tweet IDs and content were captured with a TamperMonkey userscript (link: https://github.com/mapmeld/aoc_reply_dataset/blob/master/scan.js).

There is no API endpoint to collect replies to a given Tweet, especially if it is not a recent Tweet. I would view a Tweet manually and use that script to download the thread as a JSON record.
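
A minimal sketch of how such JSON thread records could be reduced to a file of just Tweet IDs. The layout of the records written by the userscript is not described in this thread, so everything specific here is an assumption: that each record is a `.json` file holding a list of tweet objects, that each object carries its ID under an `id_str` key, and the function names `extract_tweet_ids` and `write_id_file`, which are purely illustrative.

```python
import json
from pathlib import Path


def extract_tweet_ids(records_dir, id_key="id_str"):
    """Collect unique Tweet IDs from every *.json thread record in records_dir.

    Assumption: each file contains a JSON list of tweet objects, and each
    object stores its ID under `id_key`. Adjust to the real record layout.
    """
    seen = set()
    ids = []
    for path in sorted(Path(records_dir).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            thread = json.load(f)
        for tweet in thread:
            # Keep IDs as strings: Tweet IDs exceed the safe integer range
            # of some JSON consumers (e.g. JavaScript numbers).
            tweet_id = str(tweet[id_key])
            if tweet_id not in seen:
                seen.add(tweet_id)
                ids.append(tweet_id)
    return ids


def write_id_file(ids, out_path):
    """Write one Tweet ID per line."""
    Path(out_path).write_text("".join(f"{i}\n" for i in ids), encoding="utf-8")
```

Calling `write_id_file(extract_tweet_ids("threads/"), "ids.txt")` would then produce a plain ID list, where `threads/` and `ids.txt` are placeholder paths.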

edsu commented 5 years ago

If you don't mind, another PR would be ideal. Thanks!