JakeYallop / WaybackDownloader

A simple utility for downloading a website from the wayback machine.
GNU Lesser General Public License v2.1

Allow creating additional workers for snapshot page downloading to achieve a specific download speed #7

Open JakeYallop opened 5 months ago

JakeYallop commented 5 months ago

Web page downloads have a variable rate limit, and new workers are created until the desired rate is reached. Downloading a list of CDX records (a snapshot page) can take several seconds, so with a high enough web page rate limit, snapshot page downloading could become the limiting factor.

We should try to rewrite the snapshot downloading so that it can also be parallelised.
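
As a rough illustration of the idea (not the project's actual code), here is a minimal Python sketch of fetching snapshot pages with a fixed pool of workers. It assumes the total number of snapshot pages is known up front. `fetch_page`, `page_count`, and `worker_count` are hypothetical names, and `fetch_page` stands in for whatever performs the real CDX request.

```python
import asyncio
from typing import Awaitable, Callable, List


async def download_snapshot_pages(
    fetch_page: Callable[[int], Awaitable[List[str]]],  # hypothetical: fetches one page of CDX records
    page_count: int,
    worker_count: int,
) -> List[List[str]]:
    """Fetch pages 0..page_count-1 with at most worker_count requests in flight."""
    # Queue every page index up front; workers pull from it until it is empty.
    queue = asyncio.Queue()
    for page in range(page_count):
        queue.put_nowait(page)

    # Results are stored by page index so the output order matches the page order,
    # regardless of which worker finishes first.
    results = [None] * page_count

    async def worker() -> None:
        while True:
            try:
                page = queue.get_nowait()
            except asyncio.QueueEmpty:
                return  # no pages left, so this worker is done
            results[page] = await fetch_page(page)

    await asyncio.gather(*(worker() for _ in range(worker_count)))
    return results
```

Raising `worker_count` would let snapshot pages arrive faster when the web page downloaders are waiting on them. The sketch leaves out retries and rate limiting, both of which a real implementation would probably need.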