aws / aws-sdk

Support non-overwriting put-object (add If-None-Match: * HTTP Header) #727

Closed drhpc closed 2 months ago

drhpc commented 2 months ago

Describe the feature

The feature request aws/aws-cli#2874 alludes to this for cp; my concern is with s3api put-object. I want to be able to ensure that I do not overwrite an existing object (also in the sense of creating a new version on top of a locked object). The use case is consumption via HTTP URLs, with the bucket in static website mode. I want to ensure that the data corresponding to a URL is not changed by a concurrent put-object action.

The sentiment is described here, too:

https://stackoverflow.com/questions/12654828/amazon-s3-avoid-overwriting-objects-with-the-same-name

The solution seems to be to simply add

If-None-Match: *

to the PUT request, so that the server returns an error if the object already exists. I do not know, though, whether this also holds for S3 implementations other than AWS itself. Maybe someone can confirm that first.
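To make the intended behaviour concrete, here is a toy sketch of how a server that honors the header would treat such a PUT. It models the generic HTTP conditional-request semantics (412 Precondition Failed when a representation already exists), not S3's actual implementation; `handle_put` and the in-memory `store` are hypothetical names used only for illustration.

```python
from http import HTTPStatus


def handle_put(store, key, body, headers):
    """Toy server-side PUT handler for an in-memory object store.

    Illustrates generic HTTP semantics of `If-None-Match: *`;
    this is NOT how S3 is implemented.
    """
    # With "If-None-Match: *" the precondition fails as soon as
    # any current representation of the target exists.
    if headers.get("If-None-Match") == "*" and key in store:
        return HTTPStatus.PRECONDITION_FAILED  # 412, object left untouched
    store[key] = body  # unconditional PUT, or the key was still free
    return HTTPStatus.OK
```

With this model, the first conditional PUT to a key succeeds and every later conditional PUT to the same key is rejected, while a PUT without the header still overwrites.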

This seems like a simple change, but I was unable to figure out how the aws-cli codebase calls into the botocore codebase, and so where to inject an additional header. Tracing is too noisy, and the Python debugger doesn't work on the aws command (on my test system) for some reason. I hope this is trivial to answer for someone familiar with the codebase.
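For readers who can use boto3 directly rather than the CLI, one way to inject a header is botocore's event system: a handler registered for `before-call.s3.PutObject` receives the request dictionary and can modify its `headers` before the request is signed and sent. The sketch below assumes boto3 is installed and that the service honors the header; it does not change the aws CLI itself, and the function names are illustrative.

```python
def add_if_none_match(params, **kwargs):
    """botocore 'before-call' handler: add If-None-Match: * to the request."""
    # `params` is the request dict botocore is about to sign and send.
    params["headers"]["If-None-Match"] = "*"


def make_non_overwriting_client():
    """Return an S3 client whose PutObject calls carry If-None-Match: *."""
    import boto3  # imported lazily so the handler alone needs no dependency

    client = boto3.client("s3")
    # Run the handler for every PutObject call made through this client.
    client.meta.events.register("before-call.s3.PutObject", add_if_none_match)
    return client
```

Whether the upload is then rejected for an existing key depends entirely on the server; the handler only guarantees that the header is sent.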

Can we have non-overwriting put-object? That would greatly help my application. Right now, I am on the fence, hoping that I can use simple shell calls to the aws CLI instead of diving into library API and re-building the simple inspect/upload I need for my application.

Use Case

I am building an archive of files as objects in S3 and want to ensure that files fetched from a bucket served as a static website remain unchanged, even if the archiving software attempts to upload to the same key again. I am generating names that are fairly unique, with hashes and timestamps in them, but I am explicitly including hash collisions in my test cases and want the software to be robust against them, detecting the collision instead of silently overwriting an object (version).
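The collision detection described here could look roughly like the following sketch. It assumes `client` is a boto3-style S3 client that already sends `If-None-Match: *` on PutObject, and that the server answers an existing key with HTTP 412; neither is confirmed in this thread. `put_if_absent` is a hypothetical helper name.

```python
def put_if_absent(client, bucket, key, body):
    """Upload `body`; return True if created, False if the key existed.

    Assumes the client sends If-None-Match: * and the server replies
    with HTTP 412 for an existing key (an assumption, see above).
    """
    try:
        client.put_object(Bucket=bucket, Key=key, Body=body)
    except Exception as exc:
        # botocore's ClientError exposes the parsed response as `.response`.
        response = getattr(exc, "response", None) or {}
        status = response.get("ResponseMetadata", {}).get("HTTPStatusCode")
        if status == 412:
            return False  # collision detected, nothing was overwritten
        raise  # any other failure is a real error
    return True
```

The archiving software can then treat a `False` return as a name collision and pick a new key, instead of discovering later that a published URL changed.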

Addressing locked object versions would complicate the application and make it difficult (impossible?) to serve in static website mode to simple HTTP(S) clients.

Proposed Solution

Add a command-line switch to put-object, either to insert arbitrary request headers or specifically to add the If-None-Match: * header.

Alternatively: Something else that achieves non-overwriting object putting.

Other Information

I may be able to do that if I could figure out the control flow before request headers are built / the call into botocore happens. I am not deep into python development and tooling, but can change code I see.

CLI version used

aws-cli/1.32.89 Python/3.10.12 Linux/5.15.0-100-generic botocore/1.34.89

Environment details (OS name and version, etc.)

Ubuntu 22.04.4 LTS / x86-64

tim-finnigan commented 2 months ago

Thanks @drhpc for the feature request. The put-object command involves a call to the underlying PutObject API maintained by the S3 team. Since service APIs like this are used across SDKs in addition to the CLI, I'm going to transfer this to our cross-SDK repository and reach out to the S3 team for feedback. (ref: P127348069)

tim-finnigan commented 2 months ago

We heard back from the S3 team regarding this feature request, and they have added it to their backlog for further review and tracking. Please feel free to check back in here for updates in the future or share any additional details on use cases. Thanks again for the feature request.

github-actions[bot] commented 2 months ago

This issue is now closed.

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

drhpc commented 2 months ago

Thanks!