Nugine / s3s

S3 Service Adapter
Apache License 2.0

Discussion / Feature request: GetObject with Transfer-Encoding? #80

Closed: St4NNi closed this issue 1 year ago

St4NNi commented 1 year ago

Hi, I know this crate is focused on S3 compatibility, but we have run into a problem where a minor, opt-in, out-of-spec addition would greatly improve our workflow.

What are we trying to do?

The application I am working on has a feature where users can dynamically select a set of data objects for download. This is implemented via a special reserved bucket with dedicated bundle_ids as keys. When a user requests such a bundle, the data is streamed from storage more or less directly to the user as a tar.gz or zip archive. Ideally we would love to do this via regular S3(s) requests rather than through a dedicated, largely duplicated solution.

Problem

In theory this is quite straightforward: check for the special bucket, gather all the storage information needed, and stream the data to the user. The only major problem we encountered was the requirement for a Content-Length in the GetObject response. Since we bundle the data on user request as a compressed archive, the final size of the data is not known in advance.

In theory this could easily be solved via Transfer-Encoding: chunked, but with s3s this is impossible. While it is possible to set the appropriate header in S3Response, it is not possible to remove or suppress the Content-Length header. As a result, h1 warns that conflicting headers exist and ignores the Transfer-Encoding header.
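For readers unfamiliar with the framing, here is a minimal sketch of HTTP/1.1 chunked transfer coding using only the Rust standard library. It illustrates why the total size does not need to be known up front: each piece carries its own length. The function names `write_chunk`, `finish_chunked`, and `encode_chunked` are made up for this illustration and are not part of s3s or hyper; a real server would let its HTTP library do the framing.

```rust
use std::io::{self, Write};

/// Writes one chunk: hex length, CRLF, payload, CRLF.
fn write_chunk<W: Write>(out: &mut W, data: &[u8]) -> io::Result<()> {
    if data.is_empty() {
        // A zero-length chunk would end the body, so empty pieces are skipped.
        return Ok(());
    }
    write!(out, "{:x}\r\n", data.len())?;
    out.write_all(data)?;
    out.write_all(b"\r\n")
}

/// Writes the terminating zero-length chunk (no trailer fields).
fn finish_chunked<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"0\r\n\r\n")
}

/// Frames a sequence of pieces whose total size is not known in advance.
fn encode_chunked<I>(pieces: I) -> io::Result<Vec<u8>>
where
    I: IntoIterator,
    I::Item: AsRef<[u8]>,
{
    let mut out = Vec::new();
    for piece in pieces {
        write_chunk(&mut out, piece.as_ref())?;
    }
    finish_chunked(&mut out)?;
    Ok(out)
}
```

Calling `encode_chunked(vec!["hello", ", world"])` frames the two pieces as a 5-byte and a 7-byte chunk followed by the terminating `0` chunk.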

From a technical standpoint there is no reason to enforce the Content-Length header other than the S3 spec. One suggestion: adding a Transfer-Encoding header could explicitly remove the Content-Length header. This could be done quite easily somewhere here, without any breaking change and without changing behavior when this special header is not set.
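To make the suggestion concrete, here is a rough sketch of what such opt-in suppression could look like when custom headers are merged into the generated ones. This is not s3s code: the `Headers` alias (a plain `BTreeMap`) and the function `merge_custom_headers` are assumptions made for illustration, and a real implementation would operate on the crate's actual header map type.

```rust
use std::collections::BTreeMap;

/// Stand-in for a real HTTP header map (assumption for this sketch).
type Headers = BTreeMap<String, String>;

/// Merges user-supplied headers over generated ones, with names lowercased.
/// If the user explicitly set Transfer-Encoding, Content-Length is dropped;
/// otherwise the generated headers are left as they were.
fn merge_custom_headers(generated: &Headers, custom: &Headers) -> Headers {
    let mut merged = Headers::new();
    // Custom entries come second, so they override generated ones.
    for (name, value) in generated.iter().chain(custom.iter()) {
        merged.insert(name.to_ascii_lowercase(), value.clone());
    }
    let opted_in = custom
        .keys()
        .any(|name| name.eq_ignore_ascii_case("transfer-encoding"));
    if opted_in {
        merged.remove("content-length");
    }
    merged
}
```

Because the removal only happens when the custom headers contain Transfer-Encoding, responses that never set it keep their Content-Length unchanged.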

@Nugine what is your take on more or less trivial, opt-in, non-spec features?

Thanks in advance!

Nugine commented 1 year ago

It's an interesting problem, but this feature may break in the future if AWS changes something about GetObject.

S3Service is a hyper service. This means you can modify the request and response by wrapping it in middleware, as you would with a regular web service. You can pass custom data in the extensions field. So there is a way to "fix" the response headers after calling the S3 service. It seems dirty but is relatively reliable.
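The wrapping idea can be sketched as a toy model in plain Rust. Everything here is hypothetical: the `Handler` trait is a synchronous stand-in for a real (asynchronous) service trait, `FakeS3` stands in for the inner S3 service, and the `unknown_length` flag stands in for custom data passed through the extensions field. It only shows the shape of the approach, not hyper's or s3s's actual API.

```rust
use std::collections::BTreeMap;

/// Toy response: lowercase header names plus a marker that stands in
/// for custom data carried in the response extensions.
struct Response {
    headers: BTreeMap<String, String>,
    unknown_length: bool,
}

/// Hypothetical, synchronous stand-in for a service trait.
trait Handler {
    fn call(&self, bucket: &str) -> Response;
}

/// Hypothetical inner service that always emits a Content-Length.
struct FakeS3;

impl Handler for FakeS3 {
    fn call(&self, bucket: &str) -> Response {
        // Only the reserved "bundles" bucket streams data of unknown size.
        let is_bundle = bucket == "bundles";
        let len = if is_bundle { "0" } else { "42" };
        let mut headers = BTreeMap::new();
        headers.insert("content-length".to_string(), len.to_string());
        Response { headers, unknown_length: is_bundle }
    }
}

/// Wrapper that fixes up the headers after the inner service has run.
struct StreamingFix<S> {
    inner: S,
}

impl<S: Handler> Handler for StreamingFix<S> {
    fn call(&self, bucket: &str) -> Response {
        let mut resp = self.inner.call(bucket);
        if resp.unknown_length {
            // Drop the bogus length and advertise chunked encoding instead.
            resp.headers.remove("content-length");
            resp.headers
                .insert("transfer-encoding".to_string(), "chunked".to_string());
        }
        resp
    }
}
```

Wrapping is then `StreamingFix { inner: FakeS3 }`: requests for the marked bucket lose their Content-Length, while all other responses pass through untouched.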

Anyway, I'll add some "suppressing" logic when merging custom headers into final headers, as you suggested. The next release will be v0.7.1 on 2023-09-03.

St4NNi commented 1 year ago

That's a good point. I hadn't really thought about a wrapping hyper service; I will keep that in mind for future ideas. Also, thanks for adding this suppression logic. I agree that this might break in the future if AWS changes something about GetObject (although I think that is highly unlikely without a proper V2).

Until then I will also give the wrapping service a try and will share my results here in case someone else has a similar problem!