Closed shinji257 closed 3 months ago
They definitely changed their rate-limiting config (or even activated it in the first place).
Just configure sleep and sleep-request https://github.com/mikf/gallery-dl/blob/master/docs/options.md
I'm playing around, and [0, 2] is not enough.
sleep-request: 5, 10 seems to work fine. (it didn't after 20 minutes)
Trying 10, 30 seconds now.
I'm actually hoping more for "Hey, we hit a rate limit. Wait 30 seconds or longer, then try the last request again" than for a relatively static 10-30 second range for sleep-request. I think we currently do something like that with Twitter.
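The behaviour asked for here, waiting and then repeating the same request when a 429 comes back, could look roughly like the sketch below. This is not gallery-dl's code: `fetch_with_429_wait`, its parameters, and the `send` callable are hypothetical names, and the 30-second wait is simply the figure mentioned above.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, bytes]  # (HTTP status code, body); hypothetical shape


def fetch_with_429_wait(
    send: Callable[[], Response],
    wait: float = 30.0,
    max_tries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Repeat the same request while the server answers 429.

    Any other status is handed back to the caller unchanged.
    """
    for attempt in range(1, max_tries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_tries:
            # Rate limited: pause, then send the identical request again.
            sleep(wait)
    raise RuntimeError(f"still rate limited after {max_tries} tries")
```

Passing `sleep` as a parameter is only there so the wait can be replaced in a dry run; a real caller would leave it at `time.sleep`.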
Tried setting sleep-request to 10, 30 and no dice. Still got it eventually.
@shinji257 @Skaronator It is my understanding that, same as bunkr.su, they added (or strengthened) their DDoS-Guard protection, which now often requires solving a captcha before accessing the site or downloading files?
cyberdrop-dl has added cookies extraction for this I believe.
Doesn't look like it is that. It looks like they just seem to heavily throttle available connections per client. I get this even during normal browsing from time to time. I think usually if a captcha is hit then you get a 403 error or similar.
This happens with other sites that have rate limiting as well, like desuarchive. Gdl just skips any posts that give a 429.
Yes, except it ends up skipping all but maybe the first two artists, since it gets the 429 even when making requests to the API.
I'm trying to resume a kemono grab, but gallery-dl grabbing everything up to that point creates the 429 error.
Can we get a sleep-retry config? The retries seem to be happening instantly...
Why wouldn't sleep-request work for you here?
No idea, I have it set to 10.0, but the retries are less than a second apart.
Sounds like this setting isn't actually being used. Check with -E.
edit: Retries for extractor HTTP requests, i.e. not file downloads, do indeed wait for at least sleep-request seconds. (code)
You can pass kemono's o=… query parameter to gallery-dl to skip that number of posts, although this only worked in multiples of 50 last time I checked.
https://kemono.su/fanbox/user/12345?o=350
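As a rough illustration of the multiples-of-50 restriction, the sketch below rounds a desired skip count down to the nearest step and appends it as the o parameter. `resume_url` and `PAGE_SIZE` are hypothetical names, and the step of 50 is taken from the comment above rather than from any documented guarantee.

```python
from urllib.parse import urlencode

PAGE_SIZE = 50  # assumption: the offset step reported in this thread


def resume_url(base_url: str, posts_to_skip: int) -> str:
    """Build a URL whose o= offset is the largest multiple of PAGE_SIZE
    that does not exceed posts_to_skip."""
    offset = (posts_to_skip // PAGE_SIZE) * PAGE_SIZE
    if offset == 0:
        return base_url
    return f"{base_url}?{urlencode({'o': offset})}"


print(resume_url("https://kemono.su/fanbox/user/12345", 372))
# prints https://kemono.su/fanbox/user/12345?o=350
```

Rounding down means a few already-seen posts get requested again, which seems safer than rounding up and silently missing some.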
In theory that would work, but there are no "skippable" posts when downloading a discord server...
Discord channels also have an o parameter, although it currently always gets set to 0 with no way of changing it (other than changing the code itself).
Interesting, did not know that. Also, it appears just waiting until the next day managed to reset my "too many tries" for now.
So no way to put a pause between downloads? I think just a few seconds would be enough based on my experience with the site.
This has been working so far for me.
"sleep-request": [0.5, 1.5],
"sleep-extractor": [0.5, 1.5],
The problem I'm having is that even with a decent delay of 10 seconds, it will occasionally skip over one artist due to a 429, and with hundreds of artists and --abort 5, it will potentially skip hundreds of files.
In turn, I will then sometimes have to do a "full" run without --abort, which then takes roughly 2 full days with the delays.
I'd really just like for it to be able to wait X minutes instead of skipping when it encounters 429, and maybe then I can finally set a delay below 60 seconds...
We can have it retry on 429, but there is no way to set a delay for that. I know that (at least) DeviantArt is set up so that if a 429 is hit and the active sleep is lower than 30 seconds, it adds a second and tries again after that delay. Maybe we can duplicate that here to resolve the issue?
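The DeviantArt-style behaviour described above (add a second to the delay on each 429 while the delay is still under 30 seconds) might be sketched like this. It is written from that description only, not from gallery-dl's source; the class name `AdaptiveDelay` and its attributes are hypothetical.

```python
class AdaptiveDelay:
    """Delay that grows by a fixed step each time a 429 is seen,
    as long as it is still below a threshold."""

    def __init__(self, start: float = 0.0, step: float = 1.0,
                 threshold: float = 30.0) -> None:
        self.current = start
        self.step = step
        self.threshold = threshold

    def on_rate_limited(self) -> float:
        """Record a 429 and return how long to wait before retrying."""
        if self.current < self.threshold:
            self.current += self.step
        return self.current


delay = AdaptiveDelay(start=0.5)
for _ in range(3):
    wait = delay.on_rate_limited()
print(wait)  # prints 3.5
```

Because the delay is never lowered again in this sketch, a run that hits a burst of 429s early on stays slow for the rest of the session.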
EDIT: I added "retry-codes": [429], to the list in the kemono section of my config. This should make it retry. I don't know exactly how the code works, but it should (I hope) wait another random 0.5-1.5 seconds before doing so, which hopefully allows a subsequent attempt to succeed. I did notice some 429 errors pop up, but each was followed by a success right after.
So it has been running for a day or two. Has hit 429 every now and again but thanks to me adding it as a retry code it seems to be merged into the "5 tries" bit of the application. In the end it manages to get it thanks to the delays I added in. Example logging...
(NSFW Warning)
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_22_ea340e11-fef5-4073-b30f-9ba354f92dd6.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_23_3fe4dc89-9a10-4c54-8f3b-c2373c3388b6.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_24_e19f95c2-266c-4484-b3a3-5707169e82df.jpg
[downloader.http][warning] HTTPSConnectionPool(host='c6.kemono.su', port=443): Read timed out. (read timeout=30) (1/5)
[downloader.http][warning] HTTPSConnectionPool(host='c6.kemono.su', port=443): Read timed out. (read timeout=30) (2/5)
[downloader.http][warning] '429 Too Many Requests' for 'https://kemono.su/data/05/f7/05f7c103d97fb33566d84cac8436d2531a522cd6ad1cfbbfcc632286e3501596.jpg' (3/5)
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_25_c42bd277-2e18-4bad-b658-d795cc0335bc.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_26_4ce212f7-037c-47eb-bb60-14dba1f0f827.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_27_46969f1f-8b66-4ff7-8a52-319659195104.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_28_b25bfb6a-2075-4e5b-99f2-054d7f71f60d.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_29_c71eb867-039c-45e4-9aed-084397b76261.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_30_e2f0dca8-1e42-469e-8dca-8f90b7fdf11b.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_31_f8bccef6-2256-4d78-9db6-66b32b16eebe.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_32_ca7f1fa4-a309-4cf2-8432-3a9e3851be08.jpg
[downloader.http][warning] HTTPSConnectionPool(host='c6.kemono.su', port=443): Read timed out. (read timeout=30) (1/5)
[downloader.http][warning] HTTPSConnectionPool(host='c6.kemono.su', port=443): Read timed out. (read timeout=30) (2/5)
[downloader.http][warning] '429 Too Many Requests' for 'https://kemono.su/data/8a/3a/8a3a7d8b5c24eb59ac4f92540a92ab0f3223eba2c651e8734f650f2f9f6b9f6b.jpg' (3/5)
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_33_8f91936f-ec7b-4c5c-8bc0-2566bd1639a5.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_34_93df6664-0bc6-4ea0-b953-5e4b67bd0d60.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_35_5070e6dd-cbd4-4f85-a6c2-0c45d99f8220.jpg
* .\gallery-dl\kemonoparty\fanbox\14149030\3372601_36_ff85ad54-b5d5-4592-b8e9-d8ab946d7745.jpg
Added a sleep-429 option that lets you set a custom sleep time for 429 Too Many Requests responses (defaults to 60 seconds for now): 566472f080c675d25a3c0785ce0884029a7cd3a5
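Pulling together the options mentioned in this thread, a config fragment might look like the one generated below. The values are the ones posted above; the section name "kemonoparty" is an assumption based on the output directory in the log excerpt and may differ between versions, so check it against your own setup (for example with -E).

```python
import json

# Assumption: options sit under extractor -> kemonoparty; the section name
# is inferred from the download paths in this thread, not confirmed.
config = {
    "extractor": {
        "kemonoparty": {
            "sleep-request": [0.5, 1.5],    # random wait between HTTP requests
            "sleep-extractor": [0.5, 1.5],  # random wait before each extractor run
            "retry-codes": [429],           # treat 429 as retryable
            "sleep-429": 60,                # seconds to wait after a 429
        }
    }
}

# Print the fragment as JSON so it can be merged into an existing config file.
print(json.dumps(config, indent=4))
```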
After a short period of time I get a 429 error. It lasts for a bit then goes away (I'd say 30 seconds?) however I was hoping someone had a way to mitigate the problem in the configuration so I can actually do a run without hitting this?
As of right now when 429 is hit for Kemono it doesn't even slow down. It just continues to spam skipping over artists until it gets to the end of the list.