aws / aws-cli

Universal Command Line Interface for Amazon Web Services

S3 sync does not allow specification of file metadata headers, e.g. content-encoding #319

Closed ratpik closed 10 years ago

ratpik commented 11 years ago

Uploading to S3 with aws s3 sync should have an option to specify headers for that request. Right now there doesn't seem to be any way to upload gzipped content and have it carry the appropriate metadata, for example:

--add-header='Content-Encoding: gzip'

onyxfish commented 11 years ago

+1, this makes the s3 tool unusable if you're using CloudFront and need to specify cache headers.

NV commented 11 years ago

What if I want to sync both gzipped and uncompressed files?

ratpik commented 11 years ago

The two should be synced in separate commands: one with the compression headers and one without them.
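
A sketch of how the two passes could look with the per-header options that were added later in this thread (the local path, bucket name, and file patterns are placeholders):

# Pass 1: pre-gzipped text assets, uploaded with the encoding header
aws s3 sync ./site s3://my-bucket \
    --exclude "*" --include "*.js" --include "*.css" --include "*.html" \
    --content-encoding gzip

# Pass 2: everything else, uploaded without it
aws s3 sync ./site s3://my-bucket \
    --exclude "*.js" --exclude "*.css" --exclude "*.html"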

lectroidmarc commented 11 years ago

+1 here too. Uploading gzip files and being able to set "Content-Encoding" and "Cache-Control" is important to us.

garnaat commented 10 years ago

We have added a number of new options to the s3 commands such as --content-disposition, --content-encoding, --content-language, --cache-control. Please check out the interactive help page for details.
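
For example (the bucket and path here are just placeholders), the new options can be passed straight to sync, and the "interactive help page" is the CLI's built-in help for each subcommand:

aws s3 sync ./assets s3://my-bucket \
    --content-encoding gzip \
    --cache-control "max-age=86400"

aws s3 sync help
aws s3 cp help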

robeson commented 10 years ago

--content-disposition is listed above but not in the pull request merge: https://github.com/aws/aws-cli/pull/352

And, from what I can tell, it doesn't seem to be working. I can perform copies, but ContentDisposition isn't set in the metadata. I'm doing the following:

aws s3 cp s3://bucket1/path/object s3://bucket1/path/object --content-disposition "attachment"

Am I missing something or is that functionality missing?

garnaat commented 10 years ago

The comment for #352 is incorrect. There is a --content-disposition option and it seems to be working correctly for me. How are you determining that it is not set in the metadata? Try doing this:

aws s3api head-object --bucket bucket1 --key path/object1

and see if the content disposition is returned for the object.

robeson commented 10 years ago

Thanks for your reply. Yes, that's how I'm checking and it's not there. I get this:

{
    "LastModified": "Tue, 01 Oct 2013 21:03:11 GMT", 
    "AcceptRanges": "bytes", 
    "ETag": "\"...\"", 
    "ContentType": "application/octet-stream", 
    "ContentLength": "15141142"
}

And I'm copying from one bucket to another, as follows:

aws s3 cp s3://bucket1/path/object s3://bucket2/path/object --content-disposition "attachment"

garnaat commented 10 years ago

Okay, thanks for the additional info. I'll try to reproduce the problem locally and update here with my results.

garnaat commented 10 years ago

I see what's happening.

If you do a cp or mv from a local file to S3, it is doing a PUT operation, basically creating a new object in S3. When creating a new object, you can specify a variety of metadata to be associated with that data. The content-disposition is one example and it seems to be working fine in this context.

When you do a cp or mv from S3 to S3, it is doing a COPY operation. This copies an existing object in S3 to another object in S3. When performing this COPY operation, you can use the x-amz-metadata-directive header to tell S3 whether it should copy the metadata from the original object or replace the metadata with new values provided in the operation.

We do not currently provide a way to set the value of the x-amz-metadata-directive header in the s3 command, so it always uses the default value, COPY. As a result, your new object in S3 has exactly the same metadata as the original object, and there is currently no way to override that.

We should create a separate issue to track this.
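
As a workaround in the meantime, the lower-level s3api copy-object command does expose the metadata directive (bucket names and key below are placeholders; note that with REPLACE you must re-specify any metadata you want to keep, including the content type):

aws s3api copy-object \
    --copy-source bucket1/path/object \
    --bucket bucket2 --key path/object \
    --metadata-directive REPLACE \
    --content-disposition "attachment" \
    --content-type "application/octet-stream"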

marianobntz commented 10 years ago

How about the Vary: Accept-Encoding header? It would be nice to be able to set that one too.

revolunet commented 8 years ago

Where is the "interactive help page" please ? Can someone confirm we can set Cache-Control headers using sync ? (i need to set Expiration headers)

revolunet commented 8 years ago

OK, using sync:

aws s3 sync --acl public-read --cache-control "max-age=3600" --expires 2100-01-01T00:00:00Z /path/to/images s3://bucket/images

makmanalp commented 8 years ago

Note for posterity, if you have issues with metadata headers not updating on sync, potentially see also #1145

bruno-rossi-movile commented 8 years ago

How can I download a file that has Content-Encoding = gzip?

cat test.json

(unreadable binary gzip output) The file is impossible to read because it has Content-Encoding = gzip. I need to sync from the bucket to local, but all the files in the bucket are gzipped. How can I download them in a readable form?

AlexeyPanda commented 8 years ago

Hi, how can I find an object's metadata with the AWS CLI? I get output like this:

{ "AcceptRanges": "bytes", "ContentType": "text/plain", "LastModified": "Tue, 15 Mar 2016 12:38:36 GMT", "ContentLength": 230139, "ETag": "\"3afd38518d72b0b83fa7102b37cc3c79\"", "Metadata": { "1": "1", string "metadata" keys "1","1"

ptsteadman commented 8 years ago

I have the same problem as Bruno: aws s3 cp s3://<bucket>/<file> --endpoint 'http://s3.amazonaws.com' . results in a gzipped file. Unzipping the file confirms that the file is not corrupt. I tried adding --content-encoding 'gzip' but it did not help.
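
One workaround sketch, not an official recommendation: as far as I can tell the CLI downloads the stored bytes as-is and does not decode Content-Encoding, so decompress locally after the download (assumes gzip/gunzip is installed):

aws s3 cp s3://<bucket>/<file> ./file.gz
gunzip ./file.gz    # produces the decompressed ./file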

monty241 commented 7 years ago

It is great that content-encoding can be set, but transparent on-the-fly zip/unzip would be even better. Right now you always have to pre-process the files, whereas in many scenarios it could be done during the sync (and with twice the threads, maybe even as fast).

yvele commented 7 years ago

That's why I created https://github.com/yvele/poosh, which allows a metadata configuration file based on glob patterns:

{
  plugins : ["s3"],
  baseDir : "./deploy",
  remote  : "s3-us-west-2.amazonaws.com/my-bucket",

  each: [{
    headers   : { "cache-control": { cacheable: "public" } }
  }, {
    match     : "**/*.{html,css,js}",
    gzip      : true,
    headers   : { "cache-control": { maxAge: "48 hours" } }
  }, {
    match     : "**/*.{jpg,png,gif,ico}",
    gzip      : false,
    headers   : { "cache-control": { maxAge: "1 year" } }
  }, {
    match     : "**/*.html",
    priority  : -1
  }]
}

I wish AWS added more control over headers when using aws s3 sync.

akotranza commented 7 years ago

One Whole Foods later, this is still an unmitigated disaster. When using the CLI to COPY or SYNC from an S3 source to an S3 destination, it should (with no extra parameters or caveats about multipart uploads) copy the metadata. How is this too much to ask?