durch / rust-s3

Rust library for interfacing with S3 API compatible services
MIT License
498 stars 195 forks source link

Async code may block #305

Open Janrupf opened 1 year ago

Janrupf commented 1 year ago

Describe the bug Some functions recently changed to only have fake async wrappers. For example put_object_stream is async, but takes a std::io::Read. This is a problem as it may cause blocking I/O in an async context (and generally API's are expected to take async I/O traits when being async).

The latest stream API's also don't really conform to the async API (streams returned by the API don't actually implement the async Stream trait, which makes it impossible to pass them to for example actix). The &mut Pin returned by the bytes accessor can't be used to hand of a stream. Additionally the new API introduces allocations which are undesired for high speed data streaming as well as masking errors (the errors are silently discarded by a filter_map call).

To Reproduce Use latest rust-s3 (0.33.0-beta3 at time of writing this).

Expected behavior API's such as put_object_stream should be fully async.

Environment

Additional context Commit in question breaking the Stream API: https://github.com/durch/rust-s3/commit/89925874e69dd33d721972656991e065db89c297

filter_map call in question: https://github.com/durch/rust-s3/blob/d83cb67cca528bdda7acf1094b80ec48330b3566/s3/src/request/tokio_backend.rs#L159

This silently discards read errors, causing potential data corruption.

durch commented 1 year ago

@Janrupf thank you for the detailed report, exactly what I was hoping to get from this beta, will keep you posted

Janrupf commented 1 year ago

The PR I created addresses the concerns with uploading by stream potentially blocking. However, I'm not entirely sure how to fix the download streams. In my fork I just reverted the stream rework and implemented access to content length using stream size hints, but this falls short because it doesn't expose other headers.

I have a suggest though: For the async API make a new trait, S3ResponseStream, which has the futures Stream trait as a super trait. This would allow extending and using the standard Stream trait, which in returns means compatibility with other futures based Stream API's while allowing extending the trait.