Jules-WinnfieldX / CyberDropDownloader

Bulk Gallery Downloader for Cyberdrop.me and Other Sites
GNU General Public License v3.0

[FEATURE] Dealing with 429's? #688

Closed baccccccc closed 8 months ago

baccccccc commented 8 months ago

I apologize for marking it as a "feature" because it's more like a discussion topic. Or maybe it sounds like a proposal for a wiki page.

HTTP code 429 is basically DDoS-Guard, right?

What are the typical ways to deal with it?

I have a feeling that it worked slightly better in v4 than in v5. But I have no data to back up this claim with.

I set my rate limiting options to one concurrent download, and I am still hitting a bunch of 429s on bunkr.

Rate_Limiting_Options:
  connection_timeout: 15
  download_attempts: 10
  download_delay: 1
  max_simultaneous_downloads: 1
  max_simultaneous_downloads_per_domain: 1
  rate_limit: 50
  read_timeout: 300

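For readers unfamiliar with what options like `max_simultaneous_downloads_per_domain` and `download_delay` usually control, here is a minimal sketch of per-domain throttling in Python. It is not CDL's actual code: the `DomainLimiter` class, its parameters, and the mapping from the config keys above to those parameters are assumptions made for illustration only.

```python
import asyncio
from urllib.parse import urlparse


class DomainLimiter:
    """Hypothetical helper: caps concurrent downloads per domain and pauses after each one."""

    def __init__(self, max_per_domain: int = 1, download_delay: float = 1.0):
        self._max = max_per_domain
        self._delay = download_delay
        self._semaphores = {}

    def _semaphore_for(self, url: str) -> asyncio.Semaphore:
        # One semaphore per host, created lazily the first time the host is seen.
        domain = urlparse(url).netloc
        if domain not in self._semaphores:
            self._semaphores[domain] = asyncio.Semaphore(self._max)
        return self._semaphores[domain]

    async def run(self, url: str, download):
        # Hold the domain's slot for the download plus the delay, so the
        # next request to the same host starts only after the pause.
        async with self._semaphore_for(url):
            result = await download(url)
            await asyncio.sleep(self._delay)
            return result
```

With `max_per_domain=1`, requests to the same host are serialized while different hosts can still proceed in parallel. Whether that is enough to avoid 429s depends on the server's own limits, which the client cannot see.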
Every time I run CDL to download a significantly large forum thread, there are still a bunch of failed downloads with 429. Previously, in v4, all it took was relaunching CDL (and maybe once more) to download the remaining files. But not anymore. I tried launching CDL four or five times in a row, and there are still about a dozen files being blocked.

I confirmed that the same links start downloading in a browser. So it's not like these are actually 404s or 451s disguised as 429s, as sometimes happens with other downloads.

So maybe I'm just getting unlucky, or v5 is somehow more "efficient" (aggressive?) in trying to download stuff from bunkr? Any further tuning recommendations?

Jules-WinnfieldX commented 8 months ago

So coincidentally 429's got a lot worse right as I released V5. In testing I basically didn't run into them, now they are rampant. I feel like it is variable server side depending on server load, and I'll never be able to tune the program for it.

baccccccc commented 8 months ago

Just thinking out loud... what if you built in a random delay (say, 1 to 5 seconds) between retry attempts? Could it, at least in theory, help to some degree?
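A minimal sketch of that idea in Python, assuming a hypothetical `fetch` callable that returns an HTTP status code. The function name, its parameters, and the default values are illustrative and not part of CDL.

```python
import random
import time
from typing import Callable


def retry_with_jitter(
    fetch: Callable[[], int],
    attempts: int = 10,
    min_delay: float = 1.0,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch() until it returns something other than 429 or attempts run out."""
    status = fetch()
    for _ in range(attempts - 1):
        if status != 429:
            break
        # Wait a random amount of time so retries are not evenly spaced.
        sleep(random.uniform(min_delay, max_delay))
        status = fetch()
    return status
```

The `sleep` parameter exists only so the delay can be replaced in tests. Randomizing the wait spreads retries out, but it cannot help if the server's limit is lower than the overall request rate.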

Jules-WinnfieldX commented 8 months ago

I doubt it. There already is a 1 second delay between downloads.

This will likely be something I have to play with, but unfortunately there is no getting rid of it.

Jules-WinnfieldX commented 8 months ago

1 second for bunkr*

baccccccc commented 8 months ago

is it hardcoded? What if I want to set it to two or three seconds and see if it helps in my case?

Jules-WinnfieldX commented 8 months ago

oh. I think I found it.

Jules-WinnfieldX commented 8 months ago

So. The arg I was going to suggest you use, I never actually use in the code. Love that. But I also found a typo that meant bunkrr downloads / requests weren't following the rate limits I set for them at all.

Jules-WinnfieldX commented 8 months ago

Give me a second to fix some shit surrounding this that I'm finding

Jules-WinnfieldX commented 8 months ago

5.0.49 is going up now, see if it helps

kurfuu commented 8 months ago

Will this update be put into the Cyberdrop-DL V5 Start Files at some point? I'm no good at navigating GitHub, and can't seem to figure out how to add the updated files into the Start Files folder I currently have. <3

Jules-WinnfieldX commented 8 months ago

The start files will automatically install updates such as this one I just put out, yes. You don't need to do anything on your part besides maybe running the start file once or twice for the update to install.

baccccccc commented 8 months ago

wait, what are the start files? 👀

Jules-WinnfieldX commented 8 months ago

https://github.com/Jules-WinnfieldX/CyberDropDownloader/releases/tag/Release

The 5.1 zip contains them. You'd have to heavily edit them for them to work with the Windows Store Python install.

baccccccc commented 8 months ago

I see! Ok, doesn't look like I need that. I run upgrade manually every now and then, and have my own “start file” in PowerShell that runs CDL with a bunch of command line parameters.

(And I'm still in the process of figuring out how to adapt this to v5. E.g., I'll probably have to move the logs into a subfolder after every run. I can deal with it, just trying to see what works and what does not for now.)

Anyway, running .49 now. Still running, but there's been ten 429's already :( Is there something I can adjust in the config now?

baccccccc commented 8 months ago

OK, wait, I just found that I had ignore history enabled in my config. Let me try without that. In v4, the history was the key to subsequent download attempts.

baccccccc commented 8 months ago

Yup, all good now. Sorry for the red herring. I completely forgot that I had set ignore history in the config file. And because it's one of the very few command-line options that remained in v5, I somehow got the impression it's not in the config :)

Jules-WinnfieldX commented 8 months ago

It's all good. Glad what I changed helps at the very least with 429's.

baccccccc commented 8 months ago

here are some more random observations.

  1. I've been hitting quite a lot of 403's recently. Changed the user agent in CDL to match that of my browser. It seemed to help. So maybe consider an idea for enhancement: import the user-agent string from the browser instead of typing/pasting it manually.
  2. I am still getting occasional 429's. Relaunching CDL for the 2nd time usually helps. But maybe something like a random delay between retries could help it look more like a "human" rather than "automation."

baccccccc commented 8 months ago

or maybe consider something like this

  1. try to download everything as usual.
  2. wait ~10 seconds
  3. retry only those downloads that returned 429's before
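The three steps above could be sketched roughly like this in Python. The `download` callable (returning an HTTP status code) and the function name are hypothetical, and this is only an illustration of the proposal, not something CDL implements.

```python
import time
from typing import Callable, Dict, Iterable


def download_with_second_pass(
    urls: Iterable[str],
    download: Callable[[str], int],
    wait_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Download everything once, pause, then retry only the URLs that got a 429."""
    # Step 1: try to download everything as usual.
    results = {url: download(url) for url in urls}
    rate_limited = [url for url, status in results.items() if status == 429]
    if rate_limited:
        # Step 2: wait before touching the server again.
        sleep(wait_seconds)
        # Step 3: retry only the downloads that returned 429.
        for url in rate_limited:
            results[url] = download(url)
    return results
```

Other failures such as 404 are left alone, since retrying them would only add load without any chance of success.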