AlphaSlayer1964 / kemono-dl

A simple kemono.party downloader using python.
508 stars, 81 forks

Downloader encounters 403 error when the cookie is valid #65

Open bea831333 opened 2 years ago

bea831333 commented 2 years ago

Version

Version: 2022.02.10

Your Command

kemono-dl.py --cookies "kemono-cookies.txt" --from-file "users.txt" --post-timeout 120 --retry-download 10

Description of bug

A valid cookie fails intermittently, and repeating the same command does not reproduce the same error. On the very next post, however, the cookie works. I assume the cookie is valid because it was obtained 10 minutes before running the script.

[image: cookie error]

How To Reproduce

Run the above command for user https://kemono.party/fanbox/user/39123643

Error messages and tracebacks

See image above

AlphaSlayer1964 commented 2 years ago

Did you get your cookie file while logged into a kemono account? Did you get the cookie file from the Chrome or Firefox extension?

bea831333 commented 2 years ago

I exported the cookie while I was logged in. I got it using the Firefox extension and removed the appropriate HttpOnly prefixes.

[image]

Format of my cookies file
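For readers following along, here is a minimal sketch of the prefix removal described above. It assumes the exporter marks HttpOnly cookies by writing `#HttpOnly_` in front of the domain field, which parsers that treat `#` lines as comments would skip. The function name and the sample cookie line are made up for illustration and are not part of kemono-dl.

```python
HTTPONLY_PREFIX = "#HttpOnly_"  # marker some exporters put before the domain field


def strip_httponly_prefix(cookie_text: str) -> str:
    """Return cookies.txt content with the '#HttpOnly_' marker removed from each line."""
    cleaned = []
    for line in cookie_text.splitlines():
        if line.startswith(HTTPONLY_PREFIX):
            # Keep the cookie itself; drop only the marker so the line
            # is no longer mistaken for a comment.
            line = line[len(HTTPONLY_PREFIX):]
        cleaned.append(line)
    return "\n".join(cleaned) + "\n"
```

Ordinary comment lines such as the `# Netscape HTTP Cookie File` header are left untouched, since only lines starting with the exact marker are rewritten.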

AlphaSlayer1964 commented 2 years ago

OK, let me try the same setup and get back to you.

bea831333 commented 2 years ago

[image]

I commented out the 403 handling code in main.py and added some extra delay for the retry. I got the above behaviour.

I suspect kemono.party is sending 403 errors incorrectly as a form of rate limiting (not actually incorrect, just my misunderstanding; see the comment below).

bea831333 commented 2 years ago

Perhaps you could inspect the substatus codes: https://docs.microsoft.com/en-us/troubleshoot/developer/webapps/iis/www-administration-management/http-status-code

Do you get "403.502 - Forbidden: Too many requests from the same client IP; Dynamic IP Restriction Maximum request rate limit reached." for the error code and message?

AlphaSlayer1964 commented 2 years ago

That seems like the cause. Will have to look into getting the sub-status code for the response.
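A rough sketch of what inspecting a failed response could look like. Two assumptions to flag: sub-status codes such as 403.502 are an IIS convention, and whether kemono.party runs IIS is unknown; and as far as I know the sub-status is not part of the HTTP status line, so a client would at best find it in the error page body. The helper name `summarize_error` and the returned keys are hypothetical.

```python
import re

# IIS-style sub-status such as "403.502"; only meaningful if the server
# is IIS and echoes it in the error body (both are assumptions).
SUBSTATUS_RE = re.compile(r"\b(403\.\d{1,3})\b")


def summarize_error(status: int, headers: dict, body: str) -> dict:
    """Collect the parts of an error response that hint at rate limiting."""
    match = SUBSTATUS_RE.search(body)
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "status": status,
        "substatus": match.group(1) if match else None,
        "retry_after": lowered.get("retry-after"),
        "server": lowered.get("server"),
        "body_start": body[:200],  # enough to recognise an error page
    }
```

Logging this summary for every 403 would show whether the responses carry any rate-limit hint at all, or whether they are indistinguishable from a genuine bad-cookie rejection.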

0o0miku0o0 commented 2 years ago

Sometimes a single file raises the 403 bad cookies error, but the next file downloads fine.

Since the issue most likely will not occur for the next few files in the download queue, it would be nice to add a function that retries the files that raised this error, either after the download queue has finished or immediately after the error is raised.

Currently, the only solution may be to go through the logs to find these files and download them manually from the kemono website.
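The "retry after the queue is finished" idea could be sketched roughly like this. It is not how kemono-dl works today; `fetch` is a hypothetical callable that downloads one URL and returns the HTTP status code, and the pass count and pause length are arbitrary placeholders.

```python
import time
from typing import Callable, Iterable


def download_with_deferred_retry(
    urls: Iterable[str],
    fetch: Callable[[str], int],
    retry_passes: int = 2,
    pause_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Try every URL once, then re-run only the 403 failures.

    Returns the URLs that still returned 403 after the last pass.
    """
    pending = list(urls)
    for attempt in range(retry_passes + 1):
        failed = []
        for url in pending:
            # Only 403 is deferred here; other errors are out of scope
            # for this sketch.
            if fetch(url) == 403:
                failed.append(url)
        if not failed:
            return []
        pending = failed
        if attempt < retry_passes:
            sleep(pause_seconds)  # give the server time before the next pass
    return pending
```

Whatever comes back from the function is the list that would otherwise have to be dug out of the logs by hand.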

[image: screenshot]

Xyn0gen commented 2 years ago

Try adding the arg --user-agent curl/7.66.0. That fixed it for me. I found it in https://github.com/mikf/gallery-dl/issues/2683#issuecomment-1156771370
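For anyone testing this outside kemono-dl, the equivalent in plain Python is just a custom User-Agent header on the request. This sketch uses the standard library's urllib rather than whatever HTTP client kemono-dl uses internally, and `build_request` is a made-up helper name.

```python
import urllib.request


def build_request(url: str, user_agent: str = "curl/7.66.0") -> urllib.request.Request:
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

Passing the result to `urllib.request.urlopen` would send the request with that header, which makes it easy to compare responses for different user agents.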

JoGaTo commented 1 year ago

Tried that method and it still gives me 403 errors.

ILogOutOnTheToilet commented 1 year ago

I get this too. I suspect Kemono limits each session to a certain number of downloads per time window and then refuses to serve that session for as many resources as possible. As a result, you start getting 403 errors on seemingly random download requests, and those requests keep returning 403 until some unknown amount of time has passed. The 403 errors for those files usually disappear after several hours; I usually try again the next day and it works. I suspect a session issue because the same files that return 403 download fine when I try them in the browser. Regenerating the cookie file did not get rid of the 403 errors, so Kemono may be looking at other factors as well.

I made a PR to make the 403 errors more bearable by adding a download time-out and an option to stop downloading on any request error (new parameters --download-timeout and --stop-on-failure). Without these, a 403 error causes the code to loop rapidly, and if you are downloading a lot of files the rapid stream of GET requests triggers a 429 Too Many Requests error. Once that happens, you can't even visit Kemono's website in your browser for several hours. The download timeout reduces the chance of submitting requests fast enough to cause the 429 errors. Here is the PR: https://github.com/AlphaSlayer1964/kemono-dl/pull/177

Example usage: python kemono-dl.py --cookies "kemono.party_cookies.txt" --links "https://kemono.party/patreon/user/<user id>" --ratelimit-sleep 200 --retry 0 --dirname-pattern "<download directory>\{service}\{username} [{user_id}]" --download-timeout 5 --stop-on-failure
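The behaviour these two options aim for can be illustrated with a small loop. This is not the code from the PR. It is a sketch that reads --download-timeout as a pause between requests (an assumption), with `fetch` standing in as a hypothetical callable that downloads one URL and returns its HTTP status code.

```python
import time
from typing import Callable, Iterable


def download_throttled(
    urls: Iterable[str],
    fetch: Callable[[str], int],
    delay_seconds: float = 5.0,
    stop_on_failure: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Fetch URLs one at a time with a pause between requests.

    Returns the URLs that completed with status 200.
    """
    done = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # space requests out to stay under the rate limit
        if fetch(url) == 200:
            done.append(url)
        elif stop_on_failure:
            # Bail out instead of hammering the server with more requests.
            break
    return done
```

Stopping at the first failure trades completeness for safety: nothing after the failed file is attempted, but the session is far less likely to escalate from 403 to 429.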