fog / fog-aws

Module for the 'fog' gem to support Amazon Web Services http://aws.amazon.com/
MIT License

Make S3 Signature v4 Streaming Optional #523

Closed tsammut closed 5 years ago

tsammut commented 5 years ago

Fog leverages AWS S3's Signature version 4 Streaming as described at https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html. This causes compatibility issues with S3-compatible object storage services that support v4 sigs but not streaming; for example, minio and Oracle Cloud Infrastructure. Additionally, this can surface compatibility problems in third-party applications (example) that utilize Fog under the covers.

Please add a connection configuration option that would disable Sig v4 streaming for S3 compatible object stores. Thank you!

geemus commented 5 years ago

I'm open to this, but don't really have any compatible infrastructure to test against. Plus I presume we would need to do something instead of sigv4, rather than just doing nothing? If you would like to work on a PR though, I'd certainly be happy to work through it with you.

tsammut commented 5 years ago

Hi @geemus, thank you!

I lack the skills to contribute a code change, but I'd be happy to provide you with credentials to an S3 compatible object store that supports sig v4 but not sig v4 streaming. Please let me know if you're interested!

I presume we would need to do something instead of sigv4, rather than just doing nothing?

In my case, my object store does support sigv4 and multipart uploads, just not sigv4 streaming. I believe disabling sigv4 streaming and using sigv4 (not sigv2) would be sufficient.

Thanks again.

stanhu commented 5 years ago

I'm interested in this too. Is it a matter of playing with the block in the following?

https://github.com/fog/fog-aws/blob/def0af094b59f2509fa44dcafecdac26758b23fe/lib/fog/aws/storage.rb#L589-L596

If you have a test account or more documentation on how Oracle Cloud works, I might be able to help.

geemus commented 5 years ago

@stanhu yeah, I think it probably would amount to changing that to say something like if params[:body].respond_to?(:read) && !params[:disable_signature_v4_streaming], but I'm not certain (and don't have access to something to test against).
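For illustration, a minimal standalone sketch of the condition suggested above. It is not the fog-aws implementation: the method name payload_headers and the disable_signature_v4_streaming keyword are hypothetical, and only the header names and values come from this thread.

```ruby
require 'stringio'

# Hypothetical helper: choose the payload-signing headers for a PUT.
# Streaming signatures are used only when the body is IO-like and the
# caller has not opted out.
def payload_headers(body, content_length, disable_signature_v4_streaming: false)
  if body.respond_to?(:read) && !disable_signature_v4_streaming
    # Streaming sigv4: the original length moves to the decoded-length header.
    # (A real request would also need a Content-Length covering the framed
    # body; that is omitted in this sketch.)
    {
      'x-amz-content-sha256' => 'STREAMING-AWS4-HMAC-SHA256-PAYLOAD',
      'x-amz-decoded-content-length' => content_length.to_s
    }
  else
    # Plain sigv4 without aws-chunked framing.
    {
      'x-amz-content-sha256' => 'UNSIGNED-PAYLOAD',
      'Content-Length' => content_length.to_s
    }
  end
end

io = StringIO.new('hello')
streaming = payload_headers(io, io.size)
plain = payload_headers(io, io.size, disable_signature_v4_streaming: true)
```

With the opt-out set, an IO body gets the same headers as a plain String body, which is the behaviour the S3-compatible stores in this thread appear to need.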

tsammut commented 5 years ago

Great, thank you @stanhu. If you wouldn't mind emailing me at (tim.sammut@oracle.com) I can give you access to a test account.

stanhu commented 5 years ago

Great, I've got a test account and can reproduce the failure via this script:

require 'fog-aws'

# create a connection
connection = Fog::Storage.new(
  {
    provider: 'AWS',
    region: ENV['ORACLE_CLOUD_REGION'],
    aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
    endpoint: ENV['ORACLE_CLOUD_ENDPOINT'],
    path_style: true,
    connection_options: {
      ssl_verify_peer: false,
    }
})

dir = connection.directories.new(key: 'my-bucket')

file = dir.files.create(
  key: 'CHANGELOG.md',
  body: File.open("CHANGELOG.md")
)

This returns this exception:

Expected(200) <=> Actual(400 Bad Request) (Excon::Error::BadRequest)
excon.error.response
  :body          => "<?xml version='1.0' encoding='UTF-8'?><Error><Message>STREAMING-AWS4-HMAC-SHA256-PAYLOAD is not supported.</Message><Code>InvalidRequest</Code></Error>"

stanhu commented 5 years ago

Maybe it was just me, but https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html confused me because:

For Oracle Cloud, this is what works and doesn't work with fog-aws:

Upload method    | Chunked transfer? | Pass?
PUT Object       | No                |
PUT Object       | Yes               | :boom:
Multipart upload | No                |
Multipart upload | Yes               |

https://github.com/fog/fog-aws/pull/525 fixes the second case (PUT Object, chunked transfers), but it disables chunked transfers altogether. Ideally we'd be able to choose and not disable it outright.

What do you think @geemus?

geemus commented 5 years ago

@stanhu Yeah, it is weird in terms of naming, using aws-chunked instead of actual chunked.

Do you know what it would take to get PUT with chunked encoding working on Oracle? Could we look at falling back to v3 signing stuff instead?

tsammut commented 5 years ago

Thank you for working on this, @geemus and @stanhu! And for sure, the naming is making this more confusing!

Do you know what it would take to get PUT with chunked encoding working on Oracle? Could we look at falling back to v3 signing stuff instead?

Oracle Cloud doesn't support v2 signing (the predecessor to v4 signing) or chunked encoding.

Using PUT Object and Multipart Uploads with v4 (not v4-streaming) and without chunked encoding will work though.

stanhu commented 5 years ago

@tsammut By adding multipart_chunk_size to the test script config, I seem to be able to initiate multipart uploads fine, although the last multipart request seems to fail with:

The Content-Length must be greater than or equal to 1 (it was '0')

I was surprised I didn't see STREAMING-AWS4-HMAC-SHA256-PAYLOAD is not supported right off the bat the way I saw it with the PUT request, so I assumed chunked encoding was working?

stanhu commented 5 years ago

Ok, @tsammut tells me that AWS Multipart Upload does not use v4-streaming and uses v4. That explains it. I suppose I could confirm that by looking at a Wireshark trace with an unencrypted Minio instance.

I think https://github.com/fog/fog-aws/pull/525 should fix this issue then. It could use some tests.

stanhu commented 5 years ago

To add to my confusion, in https://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadInitiate.html I see the example:

HTTP/1.1 200 OK   
x-amz-id-2: 36HRCaIGp57F1FvWvVRrvd3hNn9WoBGfEaCVHTCt8QWf00qxdHazQUgfoXAbhFWD   
x-amz-request-id: 50FA1D691B62CA43   
Date: Wed, 28 May 2014 19:34:58 GMT   
x-amz-server-side-encryption-customer-algorithm: AES256   
x-amz-server-side-encryption-customer-key-MD5: ZjQrne1X/iTcskbY2m3tFg==   
Transfer-Encoding: chunked   

<?xml version="1.0" encoding="UTF-8"?>
<InitiateMultipartUploadResult
xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
   <Bucket>example-bucket</Bucket>
   <Key>example-object</Key>
   <UploadId>EXAMPLEJZ6e0YupT2h66iePQCc9IEbYbDUy4RTpMeoSMLPRp8Z5o1u8feSRonpvnWsKKG35tI2LB9VDPiCgTy.Gq2VxQLYjrue4Nq.NBdqI-</UploadId>
</InitiateMultipartUploadResult>  

Note the use of Transfer-Encoding: chunked. However, it looks like v4-streaming uses Content-Encoding: aws-chunked (https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html). Are these things really different?
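As a rough illustration of how the two differ: HTTP chunked transfer frames each chunk as a hex size, CRLF, data, CRLF, while the aws-chunked body described in the sigv4-streaming document linked above also carries a chunk-signature on every chunk, each chained to the previous signature. The sketch below follows that description; the signing key, timestamp, scope, and seed signature are placeholders rather than values derived from real credentials, and the method names are hypothetical.

```ruby
require 'openssl'
require 'digest'

# Signature for one chunk, chained to the previous chunk's signature.
def chunk_signature(signing_key, timestamp, scope, previous_signature, data)
  string_to_sign = [
    'AWS4-HMAC-SHA256-PAYLOAD',
    timestamp,
    scope,
    previous_signature,
    Digest::SHA256.hexdigest(''),   # hash of the empty string
    Digest::SHA256.hexdigest(data)  # hash of this chunk's data
  ].join("\n")
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('sha256'), signing_key, string_to_sign)
end

# Frame the chunks, ending with the required zero-length chunk.
def aws_chunked_body(chunks, signing_key:, timestamp:, scope:, seed_signature:)
  previous = seed_signature
  (chunks + ['']).map do |data|
    previous = chunk_signature(signing_key, timestamp, scope, previous, data)
    "#{data.bytesize.to_s(16)};chunk-signature=#{previous}\r\n#{data}\r\n"
  end.join
end

body = aws_chunked_body(
  ['a' * 16, 'b' * 4],
  signing_key: 'placeholder-signing-key',      # placeholder, not a derived key
  timestamp: '20130524T000000Z',               # placeholder
  scope: '20130524/us-east-1/s3/aws4_request', # placeholder
  seed_signature: '0' * 64                     # placeholder for the request signature
)
```

A server that only understands plain sigv4 has no reason to parse the chunk-signature framing, which is consistent with the "STREAMING-AWS4-HMAC-SHA256-PAYLOAD is not supported" error shown earlier.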

tsammut commented 5 years ago

Interesting, I am not sure. This post seems to have some good points.

https://stackoverflow.com/questions/51762334/what-is-the-difference-between-multipart-upload-and-chunked-upload-on-aws

FWIW though, that example above is in the server's HTTP response, not the client's request.

stanhu commented 5 years ago

Ok, https://github.com/aws/aws-sdk-go/issues/142#issuecomment-83128304 (and https://github.com/boto/botocore/issues/995) clarifies things a bit:

To go back to the chunked upload point: once you're using multipart uploads, you don't need to worry about chunked signing, since you're doing the chunking as part of the UploadPart() operation locally. Once you have an individual part to upload, you likely have that data in memory and can seek on it as desired.

It sounds like chunked uploading isn't really applicable to multipart uploads because multipart uploads are effectively chunking data already.
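A small sketch of what "doing the chunking locally" can look like: the input is read in fixed-size parts, and each part is a complete body of known length that can be sent as its own request. The helper name each_part and the tiny part size are hypothetical and chosen only for illustration; this is not how fog-aws is necessarily structured.

```ruby
require 'stringio'

# Hypothetical helper: yield [part_number, data] pairs read from an IO.
def each_part(io, part_size)
  return enum_for(:each_part, io, part_size) unless block_given?

  part_number = 1
  # IO#read(n) returns nil at end of input, so no empty part is yielded.
  while (data = io.read(part_size))
    yield part_number, data
    part_number += 1
  end
end

parts = each_part(StringIO.new('x' * 25), 10).to_a
sizes = parts.map { |_number, data| data.bytesize }
```

Because every part is fully in memory before it is sent, its length is known up front and there is nothing left for a streaming signature to solve.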

stanhu commented 5 years ago

I think what confused me is that fog-aws doesn't distinguish between multipart and simple PUT requests and always inserts the STREAMING-AWS4-HMAC-SHA256-PAYLOAD header in multipart requests.

geemus commented 5 years ago

It may have just been a simplifying assumption along the way (and since it worked for s3 we didn't worry about the fact that it might not be as granular or accurate as it could be). I'll pull in #525 then for now, as it sounds like it gets us what we want. Also, since it is optional, it should be easy enough for people to just not opt-in if they don't want that behavior. We can always revisit if we find a better approach in the future.

wuservices commented 2 years ago

🙏 enable_signature_v4_streaming: false also helped me use this with Google Cloud Storage. While a native integration is better, using S3 compatibility does make migration from S3 to GCS a bit simpler, especially when dependencies are using fog-aws. Just sharing in case this helps somebody else too.
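For anyone looking for where the option goes, here is a sketch of connection settings modelled on the reproduction script earlier in this thread, with the option above added. The S3_REGION and S3_ENDPOINT environment variable names are placeholders, not anything fog-aws defines.

```ruby
# Connection settings with streaming signatures switched off.
storage_options = {
  provider: 'AWS',
  region: ENV.fetch('S3_REGION', 'us-east-1'),  # placeholder variable name
  aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
  aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
  endpoint: ENV['S3_ENDPOINT'],                 # placeholder variable name
  path_style: true,
  enable_signature_v4_streaming: false
}

# With the fog-aws gem installed, the hash is passed to the constructor:
#   require 'fog-aws'
#   connection = Fog::Storage.new(storage_options)
```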

thoernle commented 1 year ago

🙏 enable_signature_v4_streaming: false also helped me use this with Google Cloud Storage. While a native integration is better, using S3 compatibility does make migration from S3 to GCS a bit simpler, especially when dependencies are using fog-aws. Just sharing in case this helps somebody else too.

That worked wonderfully when using Cloudflare R2. Maybe that could help in https://github.com/fog/fog/issues/3644

thooooooomas commented 1 month ago

Hi folks, related to https://github.com/fog/fog-aws/blob/def0af094b59f2509fa44dcafecdac26758b23fe/lib/fog/aws/storage.rb#L591

Isn't this very specific to AWS only? The spec says Content-Encoding should be set to aws-chunked. I'm guessing this is why signature v4 streaming is failing with other cloud providers (as shown above).

Is the only workaround to disable signature v4 streaming?

stanhu commented 1 month ago

Good question. I tried to make this change for NetApp ONTAP S3, but it didn't seem to help:

diff --git a/lib/fog/aws/storage.rb b/lib/fog/aws/storage.rb
index 9a6b18ab6..e4b0ec3bb 100644
--- a/lib/fog/aws/storage.rb
+++ b/lib/fog/aws/storage.rb
@@ -655,6 +655,7 @@ module Fog
                 # AWS have confirmed that s3 can infer that the content-encoding is aws-chunked from the x-amz-content-sha256 header
                 #
                 params[:headers]['x-amz-content-sha256'] = 'STREAMING-AWS4-HMAC-SHA256-PAYLOAD'
+                params[:headers]['Content-Encoding'] = 'aws-chunked'
                 params[:headers]['x-amz-decoded-content-length'] = params[:headers].delete 'Content-Length'
               else
                 params[:headers]['x-amz-content-sha256'] = 'UNSIGNED-PAYLOAD'

I wonder if S3 Signature v4 streaming should be disabled by default. When I use the AWS CLI to upload S3 files with --debug, I always see UNSIGNED-PAYLOAD, even with --expected-size: https://docs.aws.amazon.com/cli/latest/reference/s3/cp.html

geemus commented 1 month ago

I'm not sure, I haven't done anything related to this (on AWS or otherwise) in a very long time. I'm open to possibilities and suggestions, as long as we do our best not to break it for folks.

stanhu commented 1 month ago

Yeah, I'm inclined to think this should be disabled by default since, as far as I can tell, the AWS SDK doesn't support this on the client side:

Many third-party S3 servers don't support this either.

Note that STREAMING-AWS4-HMAC-SHA256-PAYLOAD only takes effect for a single PUT request under the multipart size limit.

geemus commented 1 month ago

@stanhu thanks for the details, defaulting to false seems reasonable as a next step. Hopefully that makes it more compatible and will still allow people that specifically want it to opt-in.