brendanhay / amazonka

A comprehensive Amazon Web Services SDK for Haskell.
https://amazonka.brendanhay.nz

Uploading to S3 via Transfer-Encoding fails with 403 Forbidden #546

Closed newhoggy closed 3 years ago

newhoggy commented 5 years ago

When the chunk size is set to 1 MB, upload of a 1 MB file works:

cabal v2-exec hw-uri -- put-file --region Oregon -f 1m.bin -o s3://jky-mayhem/1m.bin --aws-log-level Debug
[Debug] [tid: ThreadId 11][Client Request] {
  host      = s3-us-west-2.amazonaws.com:443
  secure    = True
  method    = PUT
  target    = Nothing
  timeout   = ResponseTimeoutMicro 70000000
  redirects = 0
  path      = /jky-mayhem/1m.bin
  query     =
  headers   = host: s3-us-west-2.amazonaws.com; x-amz-date: 20190918T060611Z; x-amz-content-sha256: 30e14955ebf1352266dc2ff8067e68104607e750abb9d3b36582b8af909fcb58; expect: 100-continue; authorization: AWS4-HMAC-SHA256 Credential=AKIASCTLLGG47NV76Y7R/20190918/us-west-2/s3/aws4_request, SignedHeaders=expect;host;x-amz-content-sha256;x-amz-date, Signature=0f2cf3a7d3c074eca70ffb6636f27d0766b2c6a73c9c4261b5045a8937396767
  body      =  <stream:1048576>
}
[Debug] [tid: ThreadId 11][Client Response] {
  status  = 200 OK
  headers = x-amz-id-2: UWceBobYn/J2n6tYhaI/V7YZzzn42WOlLoWRGPBFYqGVlp5oK7uqLrnmfmgAxbArs41DZAmMCP0=; x-amz-request-id: 0E300B7161246F31; date: Wed, 18 Sep 2019 06:06:12 GMT; x-amz-version-id: ur.L6xj1Zf.Df_lkknfbQ94KE80y6vUc; etag: "b6d81b360a5672d80c27430f39153e2c"; content-length: 0; server: AmazonS3
}

However, uploading anything bigger fails:

$ cabal v2-exec hw-uri -- put-file --region Oregon -f 1025k.bin -o s3://jky-mayhem/1025k.bin --aws-log-level Debug
[Debug] [tid: ThreadId 11][Client Request] {
  host      = s3-us-west-2.amazonaws.com:443
  secure    = True
  method    = PUT
  target    = Nothing
  timeout   = ResponseTimeoutMicro 70000000
  redirects = 0
  path      = /jky-mayhem/1025k.bin
  query     =
  headers   = host: s3-us-west-2.amazonaws.com; x-amz-date: 20190918T060624Z; x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD; expect: 100-continue; content-encoding: aws-chunked; x-amz-decoded-content-length: 1049600; content-length: 1049865; authorization: AWS4-HMAC-SHA256 Credential=AKIASCTLLGG47NV76Y7R/20190918/us-west-2/s3/aws4_request, SignedHeaders=content-encoding;content-length;expect;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length, Signature=1702f7b34d88e000fa04f2f7a2b2d76e5500eccc5f322e2a72a7c1663f7e4d16
  body      =  <chunked>
}
[Debug] [tid: ThreadId 11][Client Response] {
  status  = 403 Forbidden
  headers = x-amz-request-id: 8138C4023A460AA2; x-amz-id-2: UiJAFq8VBTbVBXssmrvcvKiAnQF9iU2BIF4JgUTGz4kg0CkHuid9XcedP7VLKvNz2jo4FY2i5y4=; content-type: application/xml; transfer-encoding: chunked; date: Wed, 18 Sep 2019 06:06:25 GMT; connection: close; server: AmazonS3
}
[Error] [tid: ThreadId 11][ServiceError] {
  service    = S3
  status     = 403 Forbidden
  code       = SignatureDoesNotMatch
  message    = Just The request signature we calculated does not match the signature you provided. Check your key and signing method.
  request-id = Just 8138C4023A460AA2
}
hw-uri: ServiceError (ServiceError' {_serviceAbbrev = Abbrev "S3", _serviceStatus = Status {statusCode = 403, statusMessage = "Forbidden"}, _serviceHeaders = [("x-amz-request-id","8138C4023A460AA2"),("x-amz-id-2","UiJAFq8VBTbVBXssmrvcvKiAnQF9iU2BIF4JgUTGz4kg0CkHuid9XcedP7VLKvNz2jo4FY2i5y4="),("Content-Type","application/xml"),("Transfer-Encoding","chunked"),("Date","Wed, 18 Sep 2019 06:06:25 GMT"),("Connection","close"),("Server","AmazonS3")], _serviceCode = ErrorCode "SignatureDoesNotMatch", _serviceMessage = Just (ErrorMessage "The request signature we calculated does not match the signature you provided. Check your key and signing method."), _serviceRequestId = Just (RequestId "8138C4023A460AA2")})

To reproduce, see https://github.com/haskell-works/hw-uri/tree/new-put-file-command

karls commented 5 years ago

Seeing exactly the same behaviour. It looks like the failing request uses a different body type (chunked vs stream). Not sure if that's a red herring or not. I haven't been able to figure out what's going wrong yet.

karls commented 5 years ago

FWIW, I re-tried some of the failed uploads with boto3 (the Python SDK for AWS) and they worked OK. I then tried changing its transfer configuration to match amazonka-s3 (128 KB chunk size), and that still worked OK.

ystael commented 5 years ago

We just ran into this issue. The behavior we observe is that if the request body is a ChunkedBody then the initial request fails its signature verification, but if the request body is a HashedBody it succeeds. I don't think this is an issue with the chunked signature computation, because it's the first request that fails verification: AWS returns a 403 instead of the 100 Continue that would signal the client to start sending chunks.

This problem started appearing for us, in the us-east-1 region, at 2019-09-16 00:00 UTC, first with gradual failures and then after about 12 hours with consistent failures on every request. We didn't make any changes to our code. This tells me that AWS changed something about what the signature verification expects, but so far I have been unable to get a precise description from support.

Here's our reproduction case (built for us against Stackage LTS 13.27):

{-# LANGUAGE OverloadedStrings #-}

import ClassyPrelude
import Control.Lens ((.~), (&))
import Control.Monad.Trans.Resource (runResourceT)
import Data.Conduit.Binary (sourceLbs)
import Network.AWS (Credentials (Discover), envLogger, LogLevel (Trace), newEnv, newLogger, runAWS, send)
import Network.AWS.Data.Body (ChunkedBody (ChunkedBody), defaultChunkSize, toBody)
import Network.AWS.S3.PutObject (putObject)

main :: IO ()
main = do
  env <- newEnv Discover
  let bucketName = "bucket"
      putTgt = "path/to/object-key.txt"
      putContent = encodeUtf8 . pack . concat
                   $ (replicate 20000 "FAILURE " :: [String]) :: ByteString
      putSource = sourceLbs $ fromStrict putContent
      putBody = toBody $ ChunkedBody defaultChunkSize (toInteger $ length putContent) putSource
      putBody2 = toBody putContent
      -- if you use putBody, it will fail signature verification
      -- if you use putBody2, toBody generates a HashedBody and it succeeds
      putReq = putObject bucketName putTgt putBody
  logger <- newLogger Trace stdout
  putStrLn $ tshow putReq
  putResp <- runResourceT $ runAWS (env & envLogger .~ logger) $ send putReq
  putStrLn $ tshow putResp

ystael commented 5 years ago

@karls When I compared against boto3, what I observed (with debug logging turned on) was that boto3 does not sign its request payloads: the x-amz-content-sha256 it sends is UNSIGNED-PAYLOAD.

dfithian commented 5 years ago

Our workaround was to install the AWS CLI and shell out to it using Turtle and temporary files.
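
Roughly, the shell-out looks like this. It's only a sketch: the S3 URI and error handling are up to the caller, and it assumes the aws binary is on PATH.

{-# LANGUAGE OverloadedStrings #-}

import           Control.Applicative (empty)
import qualified Data.Text           as T
import           System.Exit         (ExitCode)
import           Turtle              (proc)

-- Shell out to the AWS CLI instead of signing the request in-process;
-- 'proc' runs "aws s3 cp <local> <s3 uri>" and returns its exit code.
uploadViaCli :: T.Text -> T.Text -> IO ExitCode
uploadViaCli localPath s3Uri = proc "aws" ["s3", "cp", localPath, s3Uri] empty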

newhoggy commented 5 years ago

Note that the AWS CLI does not use transfer-encoding, which explains why it still works.

$ aws s3 cp 7g.bin s3://jky-mayhem/7g.bin --endpoint-url http://localhost:9999
$ nc -l 9999
POST /jky-mayhem/7g.bin?uploads HTTP/1.1
Host: localhost:9999
Accept-Encoding: identity
X-Amz-Content-SHA256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Content-Length: 0
User-Agent: aws-cli/1.16.170 Python/2.7.16 Darwin/18.7.0 botocore/1.12.160
X-Amz-Date: 20190919T000419Z
Content-Type: application/octet-stream
Authorization: AWS4-HMAC-SHA256 Credential=AKIASCTLLGG47NV76Y7R/20190919/us-west-2/s3/aws4_request, SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date, Signature=45794a8351d6e50322598ce55c845d84469a26fdc6b23524e0327b508dbc919e

The AWS CLI manages to use the simpler method without using a lot of memory.

Trying the same with the amazonka library causes the entire file to be loaded into memory, which is really bad.

amazonka needs a fix so that the non-transfer-encoding method has good memory-use behaviour as well.
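
For what it's worth, something like the sketch below might avoid the chunked path without buffering the whole file. Untested, and it assumes hashedFile (exported from Network.AWS.Data.Body, as far as I can tell) hashes the file in one pass and streams it in a second rather than holding it all in memory; the bucket and key are placeholders.

{-# LANGUAGE OverloadedStrings #-}

import Control.Monad.Trans.Resource (runResourceT)
import Network.AWS                  (Credentials (Discover), newEnv, runAWS, send)
import Network.AWS.Data.Body        (hashedFile, toBody)
import Network.AWS.S3.PutObject     (putObject)

-- PUT a file as a HashedBody built straight from the file on disk, so the
-- request is signed with a plain SHA-256 instead of the streaming signature.
putFileHashed :: FilePath -> IO ()
putFileHashed path = do
  env  <- newEnv Discover
  body <- hashedFile path
  _    <- runResourceT . runAWS env . send $
            putObject "example-bucket" "example-key" (toBody body)
  pure ()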

dsturnbull commented 5 years ago

To clarify the above: boto3 and the AWS CLI use the multipart upload API. amazonka uses V4 streaming signatures and puts chunks directly to the object API, which is the thing that suddenly stopped working.

dsturnbull commented 5 years ago

@jchia noticed that the aws library doesn't include the content-* headers. I filtered content-length out and we're back to working.
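
The gist is filtering Content-Length out of a header list; whether that list is the set of signed headers or the request headers themselves, the filtering step looks the same. A minimal sketch of the idea (not the actual patch, just its shape):

import qualified Network.HTTP.Types.Header as H

-- Drop Content-Length from a header list before it goes into the SigV4
-- canonical request for a chunked upload.
excludeContentLength :: [H.Header] -> [H.Header]
excludeContentLength = filter ((/= H.hContentLength) . fst)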

ystael commented 5 years ago

Note that the documentation specifically says that Content-Length is required; see the table under "Calculating the Seed Signature" on https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html. I will call this out on our support ticket.

asheshambasta commented 5 years ago

We're seeing the same issue. An application that was working before mysteriously started failing when uploading files to an S3 bucket. No configuration or implementation changes were made on our side. It's a little concerning to see that AWS can just change crucial behaviour without any prior announcement.

For us the workaround has (unfortunately) been to avoid chunked uploads and just buffer the entire file in memory and upload from there.
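
Concretely, that means building the request body from a strict ByteString so that toBody produces a HashedBody rather than a ChunkedBody. A condensed sketch (mirroring the putBody2 case in the repro further up):

import qualified Data.ByteString       as BS
import           Network.AWS.Data.Body (RqBody, toBody)

-- Read the whole file strictly and let toBody pick the HashedBody instance.
-- Heavy on memory for large files, but it avoids the failing chunked path.
mkBufferedBody :: FilePath -> IO RqBody
mkBufferedBody path = toBody <$> BS.readFile path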

I'll try to get in touch with AWS and will update you guys here in case I receive any response as to why this was done.

dsturnbull commented 5 years ago

Amazon support have told us it's a change on their end and will not be reverted. I don't really understand why they are talking about Transfer-Encoding when we're not actually sending that. But I guess Content-Encoding: aws-chunked is similar(?)

"On September 16, 2019, S3 introduced changes to the REST.PUT.OBJECT API to increase compliance with HTTP standards. This included a change to enforce RFC 7230 where "A sender must not send a Content-Length header field in any message that contains a Transfer-Encoding header field."

asheshambasta commented 5 years ago

Another thing we're noticing when using amazonka-s3 to upload PNG files smaller than 3 kB in size:

[DEBUG] [08/Oct/2019:16:39:07 +0200] ["Fractal::AWS"] [Version 4 Metadata] {
  time              = 2019-10-08 14:39:07.279540792 UTC
  endpoint          = s3-eu-west-1.amazonaws.com
  credential        = REDACTED/20191008/eu-west-1/s3/aws4_request
  signed headers    = cache-control;content-type;expect;expires;host;x-amz-acl;x-amz-content-sha256;x-amz-date;x-amz-tagging
  signature         = REDACTED
  string to sign    = {
AWS4-HMAC-SHA256
20191008T143907Z
20191008/eu-west-1/s3/aws4_request
9505d56fcce633bf89384cb2af5c183ba799749c1b49f84cc43b77749a24b151
}
  canonical request = {
PUT
/ca-img-test/100x100_logo_d436845fe16745619209e337332dd79c.png

cache-control:max-age: 315360000, public
content-type:image/png
expect:100-continue
expires:Fri, 05 Oct 2029 14:39:07 GMT
host:s3-eu-west-1.amazonaws.com
x-amz-acl:public-read
x-amz-content-sha256:2abad4845fa618d830a01380da0c93ce4a55bd8ce5ea4f2eee06157a6c3e90c5
x-amz-date:20191008T143907Z
x-amz-tagging:optim=0

cache-control;content-type;expect;expires;host;x-amz-acl;x-amz-content-sha256;x-amz-date;x-amz-tagging
2abad4845fa618d830a01380da0c93ce4a55bd8ce5ea4f2eee06157a6c3e90c5
  }
}
Cannot decode byte '\x89': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream

Can this be related?

joshgodsiff commented 5 years ago

@asheshambasta - it's probably not related to this specific problem. This ended up being very specifically about a change Amazon made on their end to which headers they use to calculate the signature, which doesn't seem to be the same error you're getting.

jchia commented 4 years ago

Have we decided how to resolve this issue? Are we waiting for more information?

dustin commented 4 years ago

As long as the functionality is in the library, we have runtime bugs that could have been compile-time bugs. Filtering it out in the meantime would be an improvement.

joshgodsiff commented 4 years ago

Applying this patch has worked for us: https://github.com/brendanhay/amazonka/pull/547

Beyond that we're really waiting on @brendanhay on whether or not to merge it and create a release.
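
For anyone else applying it ahead of a release: with stack, one way to pick the patch up is to point extra-deps at a git commit that contains it. Sketch only; the commit hash below is a placeholder and the subdirs are assumptions about the repo layout (core = amazonka-core).

# stack.yaml
extra-deps:
  - github: brendanhay/amazonka
    commit: 0000000000000000000000000000000000000000  # placeholder
    subdirs:
      - core
      - amazonka
      - amazonka-s3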

jchia commented 4 years ago

Applying this patch has worked for us: #547

Beyond that we're really waiting on @brendanhay on whether or not to merge it and create a release.

Works for us, too. Until the next release.

endgame commented 3 years ago

Now that #547 is merged and we're looking to get a 2.0 RC together, I'll close this issue.

ysangkok commented 3 years ago

If anybody's builds started failing, it may be because the release/1.7.0 branch with commit ba2bfaab was deleted (posting here since that branch fixed this bug). By following that commit you can vendor the code into your own repo.