Chunk is a download tool for slow and unstable servers.
Install it with:

```console
$ go install github.com/cuducos/chunk/cmd/chunk@latest
```

Then:

```console
$ chunk <URLs>
```

Use `--help` for detailed instructions.
The `Download` method returns a channel of `DownloadStatus` values. The channel is closed once all downloads are finished, but the user is in charge of handling errors.
Using the default downloader:

```go
d := chunk.DefaultDownloader()
ch := d.Download(urls)
```
Tweaking the default downloader, for example, to change the maximum number of retries:

```go
d := chunk.DefaultDownloader()
d.MaxRetries = 42
ch := d.Download(urls)
```
Creating a `Downloader` from scratch:

```go
d := chunk.Downloader{...}
ch := d.Download(urls)
```
It:

- uses HTTP range requests
- retries per HTTP request (not per file)
- prevents re-downloading the same content range
- supports a wait time to give servers time to recover
In order to complete downloads from slow and unstable servers, the download is done in “chunks” using HTTP range requests. This avoids relying on long-standing HTTP connections and makes it predictable how long is too long to wait for a response.
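An HTTP range request just sets the `Range` header on a regular GET. A minimal sketch with the standard library (illustrative only; this is not chunk's actual code):

```go
package main

import (
	"fmt"
	"net/http"
)

// newRangeRequest builds a GET request for bytes [start, end] of url.
func newRangeRequest(url string, start, end int64) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
	return req, nil
}

func main() {
	req, err := newRangeRequest("https://example.com/file", 0, 1023)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("Range")) // bytes=0-1023
}
```

A server that supports range requests replies with `206 Partial Content` and only that byte slice, so a per-chunk timeout bounds how long any single non-response can take.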
In order to be quicker and avoid rework, the primary way to handle failures is to retry the failed “chunk” (content range), not the whole file.
In order to avoid restarting from the beginning in case of unhandled errors, chunk knows which ranges of each file were already downloaded; when restarted, it only downloads what is still needed to complete the downloads.
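That bookkeeping boils down to knowing which content ranges are already on disk and requesting only the rest. A self-contained sketch of the computation (this is not chunk's actual persistence format):

```go
package main

import "fmt"

// missingRanges splits a file of the given size into fixed-size chunks and
// returns the [start, end] byte ranges whose chunk index is not yet done.
func missingRanges(size, chunkSize int64, done map[int64]bool) [][2]int64 {
	var out [][2]int64
	for i := int64(0); i*chunkSize < size; i++ {
		if done[i] {
			continue // already downloaded, skip on restart
		}
		start := i * chunkSize
		end := start + chunkSize - 1
		if end > size-1 {
			end = size - 1 // the last chunk may be shorter
		}
		out = append(out, [2]int64{start, end})
	}
	return out
}

func main() {
	// A 10-byte file in 4-byte chunks, with chunk 0 already downloaded:
	fmt.Println(missingRanges(10, 4, map[int64]bool{0: true})) // [[4 7] [8 9]]
}
```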
In order to avoid putting unnecessary stress on the server, chunk relies not only on HTTP responses but also on other signs that the connection is stale, recovering from that and giving the server some time to recover.
The idea for the project emerged when it was difficult for Minha Receita to handle the download of 37 files that add up to approximately 5 GB. Most existing download solutions (e.g. got) seem designed for downloading large files, not for downloading from slow and unstable servers, which is the case at hand.