God-damnit-all closed 4 years ago
I've got this fixed upstream (how did I commit that!), but I haven't pushed because I haven't finished the twitter stuff (this weekend, hopefully. Assuming I don't get distracted with 3d printer crap again).
Let me see if I can pull out the relevant changes from my local repo.
I'm really looking forward to the twitter stuff, right now it is the biggest pain in the ass to keep track of.
I think I'm going to have something that grabs the web-accessible stuff first (read: I've written bits of it). The auth-ed stuff can come later. If that runs regularly, it shouldn't miss things.
I was trying to tell you before, all of it is easily web accessible if you use the advanced search filters to grab a user's tweets by quarter (3 months at a time), going back to the start of 2014. (You'll want the filters' date ranges to overlap a bit, because twitter isn't very exact about how it dates things.)
The only problem is that you have to be logged into an account that is configured to always view adult content for it to work correctly. But no API use is required.
Full list of advanced search filters is here: https://github.com/igorbrigadir/twitter-advanced-search
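For what it's worth, here is a rough sketch of how the quarterly, overlapping search windows could be generated. It only builds query strings using the `from:`, `since:` and `until:` search operators; the function name, the `someuser` handle and the 3-day overlap are assumptions for illustration, not anything from the actual scraper.

```python
from datetime import date, timedelta


def quarterly_queries(user, start=date(2014, 1, 1), end=None, overlap_days=3):
    """Build one search query per quarter, padded so adjacent windows overlap.

    Hypothetical helper: `start` is assumed to fall on the first day of a quarter.
    """
    end = end or date.today()
    pad = timedelta(days=overlap_days)
    queries = []
    year, month = start.year, start.month
    while date(year, month, 1) <= end:
        q_start = date(year, month, 1)
        # Advance three months, rolling over into the next year if needed.
        next_month = month + 3
        next_year = year + (next_month - 1) // 12
        next_month = (next_month - 1) % 12 + 1
        q_end = date(next_year, next_month, 1)
        # Widen both edges so tweets with fuzzy dates are not missed.
        queries.append(f"from:{user} since:{q_start - pad} until:{q_end + pad}")
        year, month = next_year, next_month
    return queries


if __name__ == "__main__":
    for query in quarterly_queries("someuser", end=date(2014, 12, 31)):
        print(query)
```

Because the windows overlap, the same tweet can show up in two adjacent quarters, so results would need to be de-duplicated by tweet ID.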
Actually, it might be even better to use max_id and since_id instead; I just now learned about their existence. That would be much more accurate.
Sorry for the spam but here, read this section. It even has an example on how to scrape using snowflake IDs using Python: https://github.com/igorbrigadir/twitter-advanced-search#snowflake-ids
Once again, the only issue is that the search only shows you adult content on an account configured not to filter it, but no phone number has to be attached to the account or anything.
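As a rough illustration of the snowflake approach (not the code from the linked page), a timestamp can be turned into an ID boundary by subtracting Twitter's snowflake epoch (1288834974657 ms) and shifting left by 22 bits. The helper names and the `someuser` handle below are made up for the example, and this only applies to tweets recent enough to have snowflake IDs.

```python
from datetime import datetime, timezone

# Twitter's snowflake epoch in milliseconds since the Unix epoch.
TWITTER_EPOCH_MS = 1288834974657


def snowflake_from_datetime(dt):
    """Return the smallest snowflake ID that could have been issued at `dt`."""
    ms = int(dt.timestamp() * 1000)
    # The low 22 bits hold machine and sequence numbers; leave them zeroed.
    return (ms - TWITTER_EPOCH_MS) << 22


def id_window_query(user, start, end):
    """Hypothetical helper: build a search query bounded by snowflake IDs."""
    since_id = snowflake_from_datetime(start)
    max_id = snowflake_from_datetime(end)
    return f"from:{user} since_id:{since_id} max_id:{max_id}"


if __name__ == "__main__":
    start = datetime(2014, 1, 1, tzinfo=timezone.utc)
    end = datetime(2014, 4, 1, tzinfo=timezone.utc)
    print(id_window_query("someuser", start, end))
```

Since the IDs encode the creation time to the millisecond, the windows shouldn't need the overlap padding that the date-based filters do.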
Currently these two lines only lead to commented code, so it expects loopCtr += 1 to be part of the for loop, causing an indentation error.
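A minimal reproduction of what I mean, assuming the loop looks roughly like this (the `items`, `process` and `log` names are placeholders, not the real code). A `for` body containing only comments is empty as far as the parser is concerned, so the next dedented statement triggers the error; adding `pass` is one way to keep the loop valid while the body is commented out.

```python
# Loop body is comments only, so the parser finds no indented block.
BROKEN = (
    "loopCtr = 0\n"
    "for item in items:\n"
    "    # process(item)\n"
    "    # log(item)\n"
    "loopCtr += 1\n"
)

# Same code with a `pass` placeholder giving the loop a real body.
FIXED = (
    "loopCtr = 0\n"
    "for item in items:\n"
    "    # process(item)\n"
    "    # log(item)\n"
    "    pass\n"
    "loopCtr += 1\n"
)


def compiles(source):
    """Return True if `source` parses; IndentationError subclasses SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("broken compiles:", compiles(BROKEN))
    print("fixed compiles:", compiles(FIXED))
```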