basho / riak_cs

Riak CS is simple, available cloud storage built on Riak.
http://docs.basho.com/riakcs/latest/
Apache License 2.0

Multipart upload fails (AccessDenied) when using Nodejs aws-sdk client with v2 signature [JIRA: RCS-380] #1327

Open SBRK opened 8 years ago

SBRK commented 8 years ago

I'm having trouble uploading files with aws-sdk (https://github.com/aws/aws-sdk-js). The SDK provides an upload method that handles uploads for you. Small files work fine, since it uses the putObject method. For bigger files it switches to multipart upload, and I get an AccessDenied response on the first part upload. Here are the access logs:

127.0.0.1 - - [23/Nov/2016:19:07:51 +0000] "POST /buckets/mybucket/objects/city.fbx/uploads HTTP/1.1" 200 236 "" "aws-sdk-nodejs/2.7.7 linux/v6.9.1"
127.0.0.1 - - [23/Nov/2016:19:07:51 +0000] "PUT /buckets/mybucket/objects/city.fbx/uploads/l19UW2ICSJGe2vZ6KoWE8A==?partNumber=1 HTTP/1.1" 403 171 "" "aws-sdk-nodejs/2.7.7 linux/v6.9.1"
127.0.0.1 - - [23/Nov/2016:19:07:51 +0000] "DELETE /buckets/mybucket/objects/city.fbx/uploads/l19UW2ICSJGe2vZ6KoWE8A== HTTP/1.1" 403 171 "" "aws-sdk-nodejs/2.7.7 linux/v6.9.1"

When uploading a file with s3cmd, I have no problem. The logs look more or less the same, except that the PUT requests succeed and return 200.

127.0.0.1 - - [23/Nov/2016:19:30:11 +0000] "POST /buckets/mybucket/objects/test2.obj/uploads HTTP/1.1" 200 237 "" ""
127.0.0.1 - - [23/Nov/2016:19:30:13 +0000] "PUT /buckets/mybucket/objects/test2.obj/uploads/xlYIwbH0TSyZDQQ1GlCLRQ==?partNumber=1 HTTP/1.1" 200 0 "" ""
127.0.0.1 - - [23/Nov/2016:19:30:13 +0000] "PUT /buckets/mybucket/objects/test2.obj/uploads/xlYIwbH0TSyZDQQ1GlCLRQ==?partNumber=2 HTTP/1.1" 200 0 "" ""
127.0.0.1 - - [23/Nov/2016:19:30:15 +0000] "PUT /buckets/mybucket/objects/test2.obj/uploads/xlYIwbH0TSyZDQQ1GlCLRQ==?partNumber=3 HTTP/1.1" 200 0 "" ""
127.0.0.1 - - [23/Nov/2016:19:30:17 +0000] "POST /buckets/mybucket/objects/test2.obj/uploads/xlYIwbH0TSyZDQQ1GlCLRQ== HTTP/1.1" 200 303 "" ""

The only difference I see is the "aws-sdk-nodejs/2.7.7 linux/v6.9.1" field in the requests made by aws-sdk, but I'm not sure what it corresponds to. Is there any way to get more information in the access logs, such as why access was denied?

SBRK commented 8 years ago

I managed to get the upload to work, sort of, by switching to AWS signature v4 on both the client and riak-cs. I no longer get AccessDenied errors, but the stored data is wrong. If I download the file again, I can see that it contains the HTTP request headers, for example:

PUT http://mybucket.s3.amazonaws.com//myfile?partNumber=1&uploadId=LV7CIfHgR8WUwHrJpAQ_Rg%3D%3D HTTP/1.1
User-Agent: aws-sdk-nodejs/2.7.7 linux/v6.9.1
Content-Length: 15728640
Content-Type: application/octet-stream
X-Amz-Content-Sha256: d9834d60be1558bf29031d86368b59306494dacf297f46d7ec719b9fc4b3ec6e
Host: source.s3.amazonaws.com
Expect: 100-continue
X-Amz-Date: 20161124T111905Z
Authorization: AWS4-HMAC-SHA256 Credential=T9X4CT-N5O_4AN6RHRGF/20161124/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=854ab2b53cb220cea37f83936d52722b4a8354ed5c4bd972814e9c78121ba8d2
Connection: close

I get the same behavior whether I use s3cmd or aws-sdk, so it's definitely a problem on riak-cs' end. What could cause this? Is there anything in the conf I should check?
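One way to confirm this kind of corruption on a downloaded object is to check whether it begins with a raw HTTP request head. The sketch below is only a diagnostic aid: `findLeakedRequestHead` is a hypothetical helper, not part of Riak CS, s3cmd, or aws-sdk, and it assumes the leaked headers sit at the very start of the data, as in the example above.

```javascript
// Hypothetical diagnostic helper (not part of any library).
// Returns null if the data does not start with an HTTP request line,
// otherwise the request line and the byte length of the leaked head.
function findLeakedRequestHead(data) {
  const buf = Buffer.isBuffer(data) ? data : Buffer.from(data);
  // latin1 maps each byte to one character, so string indexes are byte offsets.
  const head = buf.subarray(0, 8192).toString('latin1');
  // Matches a request line such as "PUT http://host/key?partNumber=1 HTTP/1.1".
  const line = /^(GET|PUT|POST|DELETE|HEAD) \S+ HTTP\/1\.[01]\r?\n/.exec(head);
  if (!line) return null;
  // The header block ends at the first empty line.
  const sep = /\r?\n\r?\n/.exec(head);
  return {
    requestLine: line[0].trim(),
    headerBytes: sep ? sep.index + sep[0].length : null,
  };
}
```

Running it over the first bytes of a re-downloaded file (for instance the result of `fs.readFileSync`) shows whether the payload was stored with the request head prepended, and `headerBytes` gives the offset at which the real content would start.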

SBRK commented 8 years ago

I was able to get the upload to work as expected by rolling back aws-sdk to v2.1.0 (from... December 2014!). v2.1.1 fails; here is its changelog: https://aws.amazon.com/releasenotes/9016743283037269

I'm guessing the part that could be problematic is the following:

AWS.S3 Expect 100-continue: The SDK will now add Expect headers for large (1MB+) payloads in AWS.S3 to reduce the chance of connections being closed due to throttling or other authentication errors. See GitHub issue #135.

shino commented 8 years ago

Riak CS' v4 auth support is half-baked and not mature, so I recommend using the v2 auth scheme. However, there is an open pull request in aws-sdk-js, https://github.com/aws/aws-sdk-js/pull/530, addressing an issue that may cause multipart requests to fail with AccessDenied under the v2 auth scheme. Sadly, it has not been merged, so it has to be applied manually.
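For reference, the signature scheme is chosen on the client through the `signatureVersion` option of the aws-sdk v2 `AWS.S3` constructor. A minimal sketch of such options follows. The endpoint and credentials are placeholders, and `riakCsClientOptions` is a hypothetical helper name, not something the SDK provides.

```javascript
// Hypothetical helper: builds options meant to be passed to
// `new AWS.S3(options)` from the aws-sdk v2 package.
function riakCsClientOptions(signatureVersion) {
  if (signatureVersion !== 'v2' && signatureVersion !== 'v4') {
    throw new Error('expected "v2" or "v4", got ' + signatureVersion);
  }
  return {
    endpoint: 'http://127.0.0.1:8080',    // assumed Riak CS address, adjust to your setup
    accessKeyId: 'YOUR-ACCESS-KEY',       // placeholder
    secretAccessKey: 'YOUR-SECRET-KEY',   // placeholder
    signatureVersion,                     // 'v2' is the scheme recommended above
  };
}

// Usage, with aws-sdk installed:
//   const AWS = require('aws-sdk');
//   const s3 = new AWS.S3(riakCsClientOptions('v2'));
```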

HTH.

shino commented 8 years ago

As for 100 Continue, that behavior at the HTTP protocol layer is handled by WebMachine and is well tested.

Thanks for your effort in narrowing the issue down to between 2.1.0 and 2.1.1. Several commits are not listed in the changelog. As a wild guess, commit [1] between these releases may be the cause of the AccessDenied (URL encoding of query params is very subtle in the v2 scheme).

[1] https://github.com/aws/aws-sdk-js/commit/4ec169808494a2d9eab9641c636dc4ce1886aab0
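To illustrate why query parameter encoding matters in the v2 scheme, here is a minimal sketch of a v2 signature computed with Node's `crypto` module. The signature is an HMAC-SHA1 over a string that includes the resource path and subresources such as `partNumber` and `uploadId`, so signing the upload ID as `==` versus `%3D%3D` yields different signatures. The secret key, date, and resource path are illustrative values, the function names are hypothetical, and the sketch omits `x-amz-*` headers. It does not establish which form the SDK or Riak CS actually uses.

```javascript
const crypto = require('crypto');

// String-to-sign for signature v2, assuming no x-amz-* headers are present.
function stringToSignV2({ method, contentMd5 = '', contentType = '', date, resource }) {
  return [method, contentMd5, contentType, date, resource].join('\n');
}

// Signature v2 is the Base64-encoded HMAC-SHA1 of the string-to-sign.
function signV2(secretKey, stringToSign) {
  return crypto.createHmac('sha1', secretKey).update(stringToSign, 'utf8').digest('base64');
}

const uploadId = 'l19UW2ICSJGe2vZ6KoWE8A==';
const base = {
  method: 'PUT',
  contentType: 'application/octet-stream',
  date: 'Wed, 23 Nov 2016 19:07:51 GMT', // illustrative
};

// Same request, with the upload ID signed raw versus URL-encoded.
const rawResource = `/mybucket/city.fbx?partNumber=1&uploadId=${uploadId}`;
const encodedResource = `/mybucket/city.fbx?partNumber=1&uploadId=${encodeURIComponent(uploadId)}`;

const sigRaw = signV2('placeholder-secret', stringToSignV2({ ...base, resource: rawResource }));
const sigEncoded = signV2('placeholder-secret', stringToSignV2({ ...base, resource: encodedResource }));

console.log(sigRaw === sigEncoded); // false: the two forms do not produce the same signature
```

If the client signs one form and the server rebuilds the string-to-sign from the other, the signatures cannot match, which would surface as a 403 on exactly the requests that carry an upload ID.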

SBRK commented 7 years ago

OK, thanks for the replies. I will try your fix once I have time. It seems to be working fine right now with the v4 signature, so I'm not in a rush anymore.

The problem between v2.1.0 and v2.1.1 is not the AccessDenied error. It's that the body content is saved to the file with the HTTP headers in it.

SBRK commented 7 years ago

Maybe I should open another issue for the body content problem, so that there is no mix-up?