FlxVctr closed 3 years ago
Do I understand correctly: do they want to get replies from a fixed set of comment IDs, or do they want to specify a user? I don't really understand where they want to start. In theory, a scraper would probably be almost as time-intensive as doing this over the API. I would really rather go the "scrape way" (maybe because I'm getting more familiar with standard scraping).
I could probably code a scraper where you specify a list of usernames/user links, which the scraper then visits to scrape their comments and replies. It would then export those to whatever file format is wanted. Should be pretty solid with Selenium. But it would definitely take more time than an Instagram scraper (the source code of a Twitter page is a bit more complex, and finding the right spots seems to be time-consuming).
Getting replies to a specific user would be easy. The other thing is more challenging. That's why people are asking for a solution. We don't have to prioritise this now. It is just something I wanted to note for later.
I'd be more comfortable with the API approach, which is actually pretty easy. But we can also decide later to do both. In any case, this does not have to happen before our "Twitter Semester".
You would start, e.g., with a keyword search, and then you want to get all replies and replies to replies to the resulting Tweets.
So probably the best way is: hashtag search > fetch those comments > fetch their replies > option to extract into one or multiple files! At the moment I don't see the time for doing it, as we've got a lot to do with the wiki and more coming in 2020. But if I catch some spare time I will be very pleased to get into the issue and the Twitter sphere :)
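To make the "fetch their replies" and "extract into a file" steps a bit more concrete, here is a minimal sketch of the part that does not depend on whether the tweets come from the API or from a scraper. It assumes the tweets have already been fetched into a flat list of dicts using the `id`, `in_reply_to_status_id` and `text` field names of Twitter API v1.1 tweet objects; the function names `collect_reply_threads` and `replies_to_csv` and the added `depth` column are made up for illustration, and the fetching itself is not shown.

```python
import csv
import io
from collections import defaultdict, deque


def collect_reply_threads(tweets, seed_ids):
    """Return every tweet that replies, directly or indirectly, to a seed tweet.

    `tweets` is a flat list of dicts (assumed shape, see lead-in);
    `seed_ids` are the IDs of the tweets found by the initial search.
    """
    # Index replies by the ID of the tweet they answer.
    children = defaultdict(list)
    for tweet in tweets:
        parent_id = tweet.get("in_reply_to_status_id")
        if parent_id is not None:
            children[parent_id].append(tweet)

    collected = []
    seen = set(seed_ids)
    # Breadth-first walk: replies first, then replies to replies, and so on.
    queue = deque((seed_id, 0) for seed_id in seed_ids)
    while queue:
        parent_id, depth = queue.popleft()
        for reply in children.get(parent_id, []):
            if reply["id"] in seen:
                continue
            seen.add(reply["id"])
            collected.append({**reply, "depth": depth + 1})
            queue.append((reply["id"], depth + 1))
    return collected


def replies_to_csv(replies):
    """Serialise the collected replies to CSV text (one possible export format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["id", "in_reply_to_status_id", "depth", "text"],
        extrasaction="ignore",  # silently drop any other tweet fields
    )
    writer.writeheader()
    writer.writerows(replies)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"id": 1, "in_reply_to_status_id": None, "text": "seed tweet"},
        {"id": 2, "in_reply_to_status_id": 1, "text": "reply"},
        {"id": 3, "in_reply_to_status_id": 2, "text": "reply to reply"},
    ]
    print(replies_to_csv(collect_reply_threads(sample, [1])))
```

The breadth-first walk keeps track of how deep each reply sits in the thread, so the export could later be split into one file per seed tweet or per depth if that turns out to be what people want.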
e.g. based on this:
https://gist.github.com/edsu/54e6f7d63df3866a87a15aed17b51eaf
or simply via scraping.
Got asked for a solution to this quite a few times this year.