mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
10.77k stars, 889 forks

Twitter Pagination Control #3379

Open ReiFan49 opened 1 year ago

ReiFan49 commented 1 year ago

Recently I have been hitting the rate limit whenever I try to scrape all of my liked tweets in JSON mode (more than 50k tweets in total). Is it possible to limit Twitter pagination (in JSON mode) for pages like /likes or /media to N tweets, N pages, or tweets from the last N amount of time, directly from the CLI (via -o or something like that)? Thanks.

My current gallery-dl version is 1.22.0.

afterdelight commented 1 year ago

You could use a time delay as a temporary measure.

ReiFan49 commented 1 year ago

Now that you mention a temporary measure, I could add a cap in my own program (say, only allow adding and updating up to 1k tweets), but that would only take effect after gallery-dl -j has already run. So it still doesn't really solve the problem completely... 🤔
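
For illustration, here is roughly what that cap on my side could look like. This is only a sketch: it assumes the output of gallery-dl -j has already been parsed into a list of entries whose last element is a metadata dict carrying a "tweet_id" key, and `cap_tweets` is a made-up helper name, not anything from gallery-dl itself.

```python
def cap_tweets(entries, limit):
    """Keep only the entries that belong to the first `limit` distinct tweets.

    Assumption: each entry is a list whose last element is a metadata
    dict with a "tweet_id" key. Entries without one are skipped.
    """
    seen = set()
    kept = []
    for entry in entries:
        meta = entry[-1] if entry and isinstance(entry[-1], dict) else {}
        tweet_id = meta.get("tweet_id")
        if tweet_id is None:
            continue
        if tweet_id not in seen:
            if len(seen) >= limit:
                break  # cap reached: ignore everything after this point
            seen.add(tweet_id)
        kept.append(entry)
    return kept
```

The parsed list could come from something like `json.load(sys.stdin)` with the gallery-dl -j output piped in. Note that this only trims what my program processes; gallery-dl has still fetched every page by then, which is why it does not help with the rate limit.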

afterdelight commented 1 year ago

Use a long delay, like 20-30 seconds per download. I guess with a delay that long gallery-dl won't hit the API rate limit, and you can download all tweets without adding a limit.

ReiFan49 commented 1 year ago

Again, my issue is not the delay or the rate limit, but how many tweets get pulled in by the scraping in JSON mode. I saw the --range option, but I don't think it's something I can use for this, especially for Twitter JSON scraping... unless it is? I don't know how well it would work.

afterdelight commented 1 year ago

Do you want to scrape as many tweets as possible in one session?

ReiFan49 commented 1 year ago

I wouldn't say "as many as possible". I'd like it to allow conditional stopping, e.g. after a certain number of tweets, beyond a certain time period, on the first rate limit hit, etc.

As long as it's achievable on JSON mode as well.
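
To make the idea concrete, the stopping logic I have in mind is roughly the following. It is a sketch under assumptions, not how gallery-dl works internally: `take_until` is a hypothetical helper, tweets are assumed to arrive newest-first from a lazy iterator, and each one is assumed to be a dict with a "date" key holding a naive `datetime`.

```python
from datetime import datetime, timedelta


def take_until(tweets, max_tweets=None, max_age=None, now=None):
    """Yield tweets until a count limit or an age limit is reached.

    Assumptions: `tweets` is newest-first, and each tweet is a dict
    whose "date" value is a naive datetime comparable with `now`.
    """
    if now is None:
        now = datetime.now()
    for count, tweet in enumerate(tweets):
        if max_tweets is not None and count >= max_tweets:
            return  # stop condition 1: enough tweets collected
        if max_age is not None and now - tweet["date"] > max_age:
            return  # stop condition 2: tweet is older than the cutoff
        yield tweet
```

Because this is a generator, the source iterator is not consumed past the stopping point, so a paginated source would not be asked for further pages once a condition triggers.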

afterdelight commented 1 year ago

That's a lot of conditions. I'm not sure which one you want.