durch / rust-s3

Rust library for interfacing with S3 API compatible services
MIT License
519 stars 198 forks source link

Corrupt downloads (sync) #304

Closed andrewbaxter closed 1 year ago

andrewbaxter commented 1 year ago

Describe the bug I was downloading a large (1gb) file, and the sha256 sums weren't matching up. Post zstd extraction the file size was less than expected.

To Reproduce Sorry, not quite minimal but

    bucket.get_object_to_writer(
        &new.internal.object_path,
        &mut Writer::new(
            &mut BufWriter::new(&mut File::create(&other_path)?),
            Decoder::new()?,
        ),
    )?;

(Writer/Decoder are zstd here for decoding a zstd payload).

When doing an sha256sum on the output file I got the wrong sha256sum, and it was the wrong size.

Expected behavior The file size should be the expected size and produce the expected sha256 sum.

Environment

Additional context I tested by downloading the file manually and then doing unzstd followed by sha256sum which worked. Additionally, the new aws s3 sdk worked fine (wrapped in an ZstdDecoder Read wrapper before copying to the file).

The download has worked for me with a different file once, previously. The same file produces the same output every time I try to download, so it doesn't seem to be random.

durch commented 1 year ago

@andrewbaxter very interesting, could you test if the same thing occurs without zstd decoder, ie just stream the bytes as they are, the only thing I can think of is the io::copy doing something weird but that seems pretty unlikely... Maybe try increasing the timeout on the bucket, in case it gets cut off

~Looking at the implementation, I'm surprised it even gets to io::copy, since the response() will try and get and allocate bytes, so the to_writer is only sugar here, it does not really stream anything. Fixing that might fix this problem as well~

Its late :)

durch commented 1 year ago

@andrewbaxter can you try testing the fix-sync-stream branch, it uses nativeattohttpc write_to method, instead of io::copy

andrewbaxter commented 1 year ago

Sorry, no you were absolutely write - zstd was the issue here, it apparently doesn't flush on drop. This was all me being stupid.

Sorry again, and thanks!