rgreinho / trauma

Tokio Rust Asynchronous Universal download MAnager
MIT License
131 stars 10 forks source link

Completed File Errors w/ "416 Range Not Satisfiable" Instead of Skipping #84

Closed RoganMatrivski closed 8 months ago

RoganMatrivski commented 8 months ago

Bug Report

NGINX servers (and possibly others) will return a 416 Range Not Satisfiable error if the file is already fully downloaded.

Current Behavior

When a file completed, the download task fails because of 416 error code.

Expected Behavior

When a file completed, it should instead skip the task instead of failing.

Possible Solution

As a stopgap measure, i've added a function called get_serverside_filesize in Download struct and added a size check before the actual GET request. This is a bit janky because of additional HEAD request, but it works for now.

pub async fn get_serverside_filesize(
    &self,
    client: &ClientWithMiddleware,
) -> Result<Option<u64>, reqwest_middleware::Error> {
    let res = client.head(self.url.clone()).send().await?;
    Ok(res.content_length())
}
let filesize = match download.get_serverside_filesize(client).await {
    Ok(size) => match size {
        Some(size) => size,
        None => return summary.fail("Failed to get server side file size"),
    },
    Err(e) => return summary.fail(e),
};

if size_on_disk > 0 && size_on_disk == filesize {
    main.inc(1);

    return summary.with_status(Status::Skipped(
        "the file was already fully downloaded".into(),
    ));
}

Steps to Reproduce

  1. Run curl -vI http://ipv4.download.thinkbroadband.com/5MB.zip, save the Content-Length (52428780)
  2. Run curl -vLJ -r 52428780- http://ipv4.download.thinkbroadband.com/5MB.zip, script will fail
  3. Run curl -vLJ -r 52428779- http://ipv4.download.thinkbroadband.com/5MB.zip or any range below Content-Length, script will run.

Context

The library first will check if the file is resumable and also check if it exist on the disk, store the file length on the disk, and then do a GET request to the file server to check for any errors and if the file length on the disk is the same as the length on the file server. The problem is, when you set Content-Range request header to be the same as the length of the completed file, some server will return error 416, thus failing the download task.

Your Environment

Last commit:
  0d07afe Update tracing-opentelemetry requirement from 0.21 to 0.22 (#80)  (HEAD -> main, origin/main, origin/HEAD)
Browser(s):

System Version: Ubuntu 22.04.1 LTS (Jammy Jellyfish)
rgreinho commented 8 months ago

Thanks a lot for reporting the issue!

I believe it is addressed with PR #86. Trauma will now check the size of the existing file on disk against the size provided by the server to determine whether or not it should proceed with the download.

If that does not solve your problem, feel free to re-open this issue.

However if the server does not provide a "content-length" header, the download will still fail. It does not really make sense to me that the server would check the range without provided the content-length, that's why I decided to implement it this way.

RoganMatrivski commented 8 months ago

Sorry for the late reply. I think the PR fixes the issue, so we can consider this issue to be solved. Thanks!

Although i have a question.

The main progress bar doesn't increment when a file is skipped. Is this normal behavior? I believe the progress bar should advance even when a file is skipped. Although the file is excluded from downloading, we can still consider it a completed task. Therefore, the progress bar should increment. On the other hand, a failed task represents an error, so the current implementation is correct in not incrementing the progress bar for such cases.

rgreinho commented 8 months ago

Oh, you are right! I will definitely review this behavior, and make the adjustments accordingly.