ArchiveTeam / ArchiveBot

ArchiveBot, an IRC bot for archiving websites
http://www.archiveteam.org/index.php?title=ArchiveBot
MIT License

Apply delay settings immediately at the start of a job #515

Open JustAnotherArchivist opened 3 years ago

JustAnotherArchivist commented 3 years ago

As I understand it, jobs are currently started without concurrency or delay settings; the settings monitor only applies those later. This means that a job always starts at a concurrency of 1 and a delay of 0. That can be a problem when a target has heavy rate limiting and even two or three requests in quick succession trigger it.

While setting the concurrency via the command line is broken (https://github.com/ArchiveTeam/wpull/issues/339), delays work fine (--wait), so the pipeline should pass the initial delay that way. However, the command line options do not allow specifying a custom range of delays, so the average of delay_min and delay_max should be used instead.
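
To illustrate the idea, here is a minimal sketch of how the pipeline could derive the initial --wait value. The helper name initial_wait_args is hypothetical and not part of ArchiveBot, and the sketch assumes that delay_min and delay_max are stored in milliseconds while --wait expects seconds; both assumptions would need to be checked against the actual pipeline code.

```python
def initial_wait_args(delay_min_ms, delay_max_ms):
    """Build wpull arguments for a fixed initial delay.

    Hypothetical helper: the name and the millisecond unit are
    assumptions made for this sketch, not ArchiveBot's actual API.
    """
    if delay_min_ms < 0 or delay_max_ms < delay_min_ms:
        raise ValueError('expected 0 <= delay_min <= delay_max')

    # --wait takes a single value, so fall back to the midpoint of the range.
    average_seconds = (delay_min_ms + delay_max_ms) / 2 / 1000

    return ['--wait', str(average_seconds)]


if __name__ == '__main__':
    # Example: a job configured with a 250-375 ms delay range.
    print(initial_wait_args(250, 375))
```

The returned list would be appended to the wpull argument list when the job is launched; the settings monitor could then take over with the real delay range as it does today.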