ripmeapp2 / ripme

Downloads albums in bulk
MIT License
290 stars 37 forks

bug/request functionality to bypass or work around rate limiting 429 errors #184

Open stubkan opened 3 months ago

stubkan commented 3 months ago

Expected Behavior

Ripping should take anti-ripping rate limiting into account: it should let you adjust the download rate to prevent 429 errors, notify you when files are not downloaded because of a rate-limit block (429 error), and let you re-download a URL to fetch the files that were missed because the rate limit was exceeded.

Actual Behavior

None of these 3 things seems to happen. Rate limiting and 429 errors are common when ripping sites, so I think a good ripping tool should have functions to prevent or work around 429 rate limiting.

image

Downloading URLs from 4chan thebarchive gets rate limited quickly, so on average only 2 out of 3 pictures are fetched when ripping a dozen threads.

The rate-limited files are declared 'unretrievable' even though a simple wait or retry works fine; I'm not sure why they are declared unretrievable. They are also tagged as completed in the final result list, although they were never downloaded.
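To illustrate the "simple wait or retry" idea, here is a minimal Python sketch. It is not ripme's code (ripme is written in Java), and `fetch_with_retry`, `parse_retry_after`, and `RateLimited` are hypothetical names. It assumes `fetch` is any callable that returns `(status, headers, body)`. When the server sends a numeric `Retry-After` header the sketch waits that long; otherwise it falls back to exponential backoff.

```python
import time


class RateLimited(Exception):
    """Raised when every attempt was answered with HTTP 429."""


def parse_retry_after(headers, default):
    """Return Retry-After in seconds, or `default` if absent or not a plain number.

    Retry-After may also be an HTTP date; this sketch does not parse that form
    and simply uses `default` instead.
    """
    try:
        return max(0.0, float(headers.get("Retry-After")))
    except (TypeError, ValueError):
        return default


def fetch_with_retry(fetch, url, max_retries=10, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url) until it stops returning 429 or the retries run out."""
    for attempt in range(max_retries + 1):
        status, headers, body = fetch(url)
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # no point sleeping after the final attempt
        backoff = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ... by default
        sleep(parse_retry_after(headers, backoff))
    raise RateLimited(url)
```

Raising a distinct exception once the retries run out lets the caller report the file as rate limited instead of treating it as finished.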

After the scrape is over, all the unretrievable 429-blocked files are placed in the 'completed' list of the history together with the files that really were downloaded, so it tells you 150 files succeeded. If you don't check the log or have debug mode on, you will think everything downloaded properly when it didn't.
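A rough Python sketch of the bookkeeping being asked for, with rate-limited files kept out of the completed list. The `RipResult` class and its fields are hypothetical and only illustrate the idea; they do not describe how ripme stores its history.

```python
from dataclasses import dataclass, field


@dataclass
class RipResult:
    """Keeps successful, rate-limited, and otherwise failed URLs apart."""
    completed: list = field(default_factory=list)
    rate_limited: list = field(default_factory=list)
    failed: list = field(default_factory=list)

    def record(self, url, status):
        """File a URL under the list that matches its final HTTP status."""
        if status == 200:
            self.completed.append(url)
        elif status == 429:
            self.rate_limited.append(url)
        else:
            self.failed.append(url)

    def pending(self):
        """URLs that still need another attempt."""
        return self.rate_limited + self.failed

    def summary(self):
        """One-line report that does not count 429s as successes."""
        return (f"{len(self.completed)} downloaded, "
                f"{len(self.rate_limited)} rate limited (429), "
                f"{len(self.failed)} failed")
```

With counts split like this, the final report can say how many files were blocked, and a later run only has to retry whatever `pending()` returns.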

If you try to fix it by check-marking the threads and clicking the redownload button, it ignores all the missed 429 files and gives you no option to re-download them, because the log says "Already downloaded" when they are not.
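One way a redownload could avoid the false "Already downloaded" result is to check what is actually on disk rather than what the history says. Here is a minimal Python sketch under that assumption. The helper names and the rule that the file name is the last segment of the URL path are hypothetical, not taken from ripme.

```python
import os
from urllib.parse import urlsplit


def local_path(url, target_dir):
    """Assumed naming rule: the file name is the last segment of the URL path."""
    name = os.path.basename(urlsplit(url).path)
    return os.path.join(target_dir, name)


def needs_download(url, target_dir):
    """True unless a non-empty file for this URL already exists on disk."""
    path = local_path(url, target_dir)
    return not (os.path.isfile(path) and os.path.getsize(path) > 0)


def urls_to_redownload(urls, target_dir):
    """Filter a thread's URLs down to the ones that are really missing."""
    return [u for u in urls if needs_download(u, target_dir)]
```

Treating empty files as missing also catches downloads that were cut off before any data was written.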

image

In the configuration, the only thing you can do to try to reduce 429 rate limit errors is to reduce the threads to 1, but this is not enough. I suggest adding a delay between downloads, similar to gallery-dl's --sleep or --sleep-request options; this would let users avoid 429 rate limit errors.
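The delay suggested here could look roughly like the following Python sketch. `Throttle` and `min_interval` are hypothetical names; this is neither ripme nor gallery-dl code, only an illustration of enforcing a minimum gap between consecutive downloads.

```python
import time


class Throttle:
    """Enforces at least `min_interval` seconds between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without real waiting
        self._sleep = sleep
        self._last = None      # time of the previous download, None before the first one

    def wait(self):
        """Block until the minimum interval since the previous call has passed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A downloader would call `throttle.wait()` right before each request. With a single download thread, a user-configurable `min_interval` of a second or two would act as the rate control described above.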

image

Also, the retry option is set to 10 retries, but it does not appear to retry at all, so I am unsure whether that option is working.

soloturn commented 3 months ago

Not that I'd expect a big change, but would you mind trying with the latest release, 2.1.9?

stubkan commented 3 months ago

Oh, my bad. I am using 2.1.9-7, the latest release; I misread it as 2.1.7. (You can confirm by looking at the green version text in my last screenshot.)