aws / aws-sdk-ruby

The official AWS SDK for Ruby.
https://aws.amazon.com/sdk-for-ruby/
Apache License 2.0
3.58k stars 1.23k forks source link

S3: Object ACL gets reset to :private when updating metadata #180

Closed zwily closed 11 years ago

zwily commented 11 years ago

Updating metadata on an object does a copy in the background, and the copy command sets a :private ACL on the new object by default. That a fine behavior for copy, but when doing something like object.metadata['cache-control'] = 'max-age=315360000', the ACL getting reset is surprising.

zwily commented 11 years ago

This probably applies to other settings that are normally not copied with S3Object#copy_from as well, like :server_side_encryption and :reduced_redundancy.

trevorrowe commented 11 years ago

You are correct. This is a tricky issue, as there are quite a few of these attributes that we would need to collect and copy along. In additional to existing metadata, the ACL, :server_side_encryption and :reduced_redundancy one would also need to consider:

There are likely other attributes we would need to identify if we are going to do anything beyond merging existing metadata.

zwily commented 11 years ago

When I first looked at the #copy_from method, I thought that the part that sets the :metadata_directive was attempting to preserve those. But I read it backwards - if :metadata is set, then it will use REPLACE, so everything gets reset to defaults.

That part is confusing too - it seems like if I don't set :metadata, :content_disposition, :content_type, or :cache_control in a copy, they will be preserved. If I set one of them, the rest will be reset (assuming I understand the S3 docs on that correctly.)

trevorrowe commented 11 years ago

That is correct. CopyObject will preserve those attributes only if NONE of them are set. If you set any of them, you must set all. The object ACL is never preserved by S3 when using the CopyObject operation.

trevorrowe commented 11 years ago

At this point, I am inclined to add support for preserving those attributes in the original case (when updating metadata).

s3.buckets['foo'].objects['bar'].metadata['key'] = value # preserve additional object metadata AND acl

I am not inclined to make changes to the #copy_from method. This currently very closely maps to the S3 CopyObject operation, and should probably stay as such. Thoughts?

zwily commented 11 years ago

Agreed - #copy_from should maintain the semantics of CopyObject. (Although it would be nice if the #copy_from documentation made the point about "if you set one you must set them all" more clear.)

trevorrowe commented 11 years ago

Hmm... I started work on this and I'm running into another issue. It does not look like it will be possible to copy the ACL without 2 additional calls. If the object does have a custom ACL, it can not be set via #copy_object, as that operation only accepts canned ACLs and ACL headers, not XML ACL documents. The worst case is now 4 calls instead of 2:

  1. HeadObject (gets existing metadata)
  2. GetObjectAcl (to see if an ACL is set so we can preserve this)
  3. CopyObject (with the merged metadata)
  4. PutObjectAcl (only if step 2 returned an ACL)

This leaky abstraction is getting worse. I wonder if there is a better way to handle this.

zwily commented 11 years ago

Ick, yeah that's messy. If it's too bad, then just updating the documentation to say that it uses #copy_from semantics is likely enough for now. That at least would have kept me from getting bitten by it.

trevorrowe commented 11 years ago

Closing this issue. The public docs have a note on both #copy_to and #copy_from that indicates the ACL is not copied and must be re-specified.

trevorrowe commented 11 years ago

Doc link for reference: http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/S3/S3Object.html#copy_from-instance_method