brendanhay / amazonka

A comprehensive Amazon Web Services SDK for Haskell.
https://amazonka.brendanhay.nz
Other
599 stars 227 forks source link

Fix presign urls that are not for S3 #767

Closed denar50 closed 2 years ago

denar50 commented 2 years ago

Fixes #766

endgame commented 2 years ago

(Please force-push your changes)

denar50 commented 2 years ago

@endgame I have addressed all your comments. Please let me know if you want me to do further changes. As for testing, I have tested that the generation of presigned URLs for RDS are working correctly.

The only thing left to do is the testing of S3 requests that you suggested (I will look into that tomorrow).

One more question. These fixes are based on Amazonka 2.0. What about patching 1.6?

denar50 commented 2 years ago

One more question. These fixes are based on Amazonka 2.0. What about patching 1.6?

endgame commented 2 years ago

One more question. These fixes are based on Amazonka 2.0. What about patching 1.6?

Don't worry about it. 1.6 has been stale for so long that the folklore has been "use latest git" for years, and 2.0 RC1 came out late last year. 2.0 has had so many new services, correctness fixes and ergonomics fixes that I recommend you use it regardless.

denar50 commented 2 years ago

@endgame I have tested PutObject and GetObject in that order. I put a file in a bucket and then download it locally. I noticed however that the V4 presign function is not called in those cases, so I also did a presignURL for a GetObject request and it worked fine (it hit the case when the digest should be UNSIGNED-PAYLOAD): it generated an URL that I could use somewhere else to download the file in question.

endgame commented 2 years ago

Yeah, the pre-signing capability I'd love to see tested is a pre-signed PutObject, which signs a URL so that someone else can upload an object into your bucket. (Amazon's new Selling Partner API returns such a URL, when you create a feed document in their API.)

Such uploads seem like a pretty central test case for presigned URLs with unsigned payloads, and if that works then I'm happy.

denar50 commented 2 years ago

@endgame Alright, I am failing to do the PutObject request from a presigned URL, so I would appreciate some assistance there because I am not sure if the code itself is broken or I am not using the presigned URL correctly. So what I do is load the file with:

body <- AWS.chunkedFile AWS.defaultChunkSize "/some/path/in/my/computer/test-put.txt"

and then I generate the presigned URL with:

runResourceT $ presignURL regionalEnv signingTime 900 (S3.newPutObject "some.s3.bucket" "some/key/test-put.txt" $ body)

This gives me back a presigned URL. I noticed that I am hitting the otherwise in the case statement inside the V4.hs file:

digest =
      case _requestBody rq of
        Chunked _ -> trace "CHUNKED unsigned" unsignedPayload
        Hashed (HashedStream h _ _) -> Base.Tag $ BAE.convertToBase BAE.Base16 h
        Hashed (HashedBytes h b)
          | BS.null b && (_serviceSigningName $ _requestService rq) == "s3" -> trace "HASHED BYTES unsigned" unsignedPayload
          | otherwise -> Base.Tag $ BAE.convertToBase BAE.Base16 h

    unsignedPayload = Base.Tag "UNSIGNED-PAYLOAD"

But when I tried uploading the file with cURL: curl --upload-file /some/path/in/my/computer/test-put.txt "PRESIGNED_URL_HERE"

AWS responds with:

The request signature we calculated does not match the signature you provided. Check your key and signing method.
endgame commented 2 years ago

I don't think that's right - I think you're hitting the HashedStream case: the file you're provided is probably small enough that chunkedFile is falling back to calling hashedFile, which returns a HashedStream.

For a presigned s3:PutObject, you probably want to pass "" in for the body, and rely on its IsString instance.

However, I am concerned that we're getting a signature mismatch in the HashedStream case, since it should read and hash the whole file for you.

denar50 commented 2 years ago

For a presigned s3:PutObject, you probably want to pass "" in for the body, and rely on its IsString instance.

I just ran this test and it was successful. It generated a URL that I then used through Insonmia to send a PUT request without a body. Then I confirmed on S3 and I saw a txt file created with 0kb.

denar50 commented 2 years ago

However, I am concerned that we're getting a signature mismatch in the HashedStream case, since it should read and hash the whole file for you.

@endgame After running the same test I saw that the last part of the response that I get from AWS is:


host
UNSIGNED-PAYLOAD</CanonicalRequest><CanonicalRequestBytes>50 55 54 0a 2f 6e 6f 6e 70 72 6f 64 2e 65 75 31 2e 63 6f 72 65 2d 62 61 6e 6b 69 6e 67 2e 6b 6c 61 72 6e 61 2e 6e 65 74 2f 2f 63 6f 72 65 2d 62 61 6e 6b 69 6e 67 2f 69 6c 74 78 6e 2f 73 74 61 67 69 6e 67 2d 70 72 6f 66 69 6c 69 6e 67 2d 64 61 74 61 2f 74 65 73 74 2d 70 75 74 2e 74 78 74 0a 58 2d 41 6d 7a 2d 41 6c 67 6f 72 69 74 68 6d 3d 41 57 53 34 2d 48 4d 41 43 2d 53 48 41 32 35 36 26 58 2d 41 6d 7a 2d 43 72 65 64 65 6e 74 69 61 6c 3d 41 53 49 41 36 43 44 34 36 32 47 47 59 41 51 47 4b 4d 4d 53 25 32 46 32 30 32 32 30 34 31 39 25 32 46 65 75 2d 77 65 73 74 2d 31 25 32 46 73 33 25 32 46 61 77 73 34 5f 72 65 71 75 65 73 74 26 58 2d 41 6d 7a 2d 44 61 74 65 3d 32 30 32 32 30 34 31 39 54 31 35 33 33 32 32 5a 26 58 2d 41 6d 7a 2d 45 78 70 69 72 65 73 3d 39 30 30 26 58 2d 41 6d 7a 2d 53 65 63 75 72 69 74 79 2d 54 6f 6b 65 6e 3d 46 77 6f 47 5a 58 49 76 59 58 64 7a 45 46 6b 61 44 49 6c 75 61 77 4d 69 4e 6e 75 56 74 38 4c 57 6a 43 4c 71 41 63 51 4a 38 52 54 33 47 64 52 78 55 46 70 30 6b 66 4f 39 34 55 4d 6d 53 71 4d 6d 51 74 32 55 77 79 39 4b 49 68 77 4f 43 38 69 42 68 44 55 4b 6d 49 65 56 6e 41 51 39 25 32 46 72 68 78 43 6c 30 6c 4b 53 34 30 4f 55 30 64 55 56 51 6c 52 54 59 66 70 25 32 42 57 32 78 59 52 51 38 43 50 66 4b 51 71 6a 34 58 25 32 46 25 32 46 5a 6d 63 6c 6f 70 7a 65 38 70 4a 66 48 33 4f 4c 71 78 74 61 47 25 32 46 48 77 33 70 6a 31 6d 68 7a 25 32 46 54 43 38 6d 44 73 6e 6e 6e 48 25 32 46 4b 77 6f 6f 59 79 6d 69 42 48 44 48 63 4f 6d 76 25 32 46 5a 41 70 65 6a 48 38 75 77 57 50 35 4b 79 31 6e 48 54 48 4c 57 70 33 68 78 78 56 4c 6d 63 50 35 34 55 25 32 46 70 43 55 38 38 45 4b 51 56 54 67 68 48 58 6a 4c 74 65 75 79 38 37 38 4d 6c 66 46 6a 4b 4a 38 43 45 47 66 52 46 25 32 42 43 30 63 53 6a 6f 7a 7a 4d 36 33 39 58 42 42 38 34 59 76 4d 32 6f 25 32 42 77 25 32 42 79 44 34 79 34 6d 65 33 77 73 77 5a 76 33 76 34 64 55 56 31 39 37 74 4e 51 6e 4a 56 63 31 25 32 46 31 77 53 7a 63 6e 42 4e 51 73 30 57 64 64 7a 6c 6a 72 48 31 35 78 73 43 53 6a 74 6f 76 75 53 42 6a 49 7a 53 77 79 66 71 35 6d 25 32 42 6e 53 47 4b 73 46 78 4c 4a 52 42 4c 59 56 6c 41 38 68 68 6f 38 5a 4b 46 77 34 38 68 78 4b 4e 52 4d 76 68 45 64 42 62 4a 41 65 38 4c 6c 62 62 49 44 77 46 54 66 47 76 62 58 25 32 46 74 34 26 58 2d 41 6d 7a 2d 53 69 67 6e 65 64 48 65 61 64 65 72 73 3d 68 6f 73 74 0a 68 6f 73 74 3a 73 33 2e 65 75 2d 77 65 73 74 2d 31 2e 61 6d 61 7a 6f 6e 61 77 73 2e 63 6f 6d 0a 0a 68 6f 73 74 0a 55 4e 53 49 47 4e 45 44 2d 50 41 59 4c 4f 41 44</CanonicalRequestBytes><RequestId>DW82DK46ZN77N602</RequestId><HostId>E9h054YMuRKDwXQhcuCF8oVvS6oblwTTZvLd0lXWagCE8uc2UygrBaJJYNU0ItX3q+ILXtY+F68=</HostId></Error>

There you see UNSIGNED PAYLOAD... so I changed Hashed (HashedStream h _ _) -> Base.Tag $ BAE.convertToBase BAE.Base16 h to Hashed (HashedStream _h _ _) -> Base.Tag "UNSIGNED-PAYLOAD" and it worked. I don't really know what to make of this. Should we add an additional case where we put UNSIGNED PAYLOAD if the service is S3? It would look like this:

digest =
      case _requestBody rq of
        Chunked _ -> unsignedPayload
        Hashed (HashedStream h _ _)
          | (_serviceSigningName $ _requestService rq) == "s3" -> unsignedPayload
          | otherwise -> Base.Tag $ BAE.convertToBase BAE.Base16 h
        Hashed (HashedBytes h b)
          | BS.null b && (_serviceSigningName $ _requestService rq) == "s3" -> unsignedPayload
          | otherwise -> Base.Tag $ BAE.convertToBase BAE.Base16 h

unsignedPayload = Base.Tag "UNSIGNED-PAYLOAD"

Moreover, looking at the presign documentation from the AWS cli, you never speficified a file there, which leads me to think that the body to presign the URL is always UNSIGNED-PAYLOAD.

endgame commented 2 years ago

2. I agree that we want to check for S3 in the HashedStream case and return UNSIGNED-PAYLOAD there. If you push that change, I will review again.

Actually, I am not convinced of this - we've gone to all the work of streaming this file and hashing it. We know the payload and it should match! I am now suspicious that something is actually wrong with how we calculate this, and that's why it's not working. Unless S3 has some weirdness where it's known to reject presigned PutObject requests with known bodies.

denar50 commented 2 years ago

Your test with the unsigned PUT which created a zero-byte file - can you confirm that it works if you actually upload some data?

Yeah, I am able to upload the file even if I create an unsigned PUT for a zero-byte file (just because that hits the case BS.null b && (_serviceSigningName $ _requestService rq) == "s3" -> unsignedPayload with unsignedPayload being Base.Tag "UNSIGNED-PAYLOAD", and it seems that sending UNSIGNED-PAYLOAD allows us to upload any file :shrug:

Your most recent comment sometimes writes UNSIGNED PAYLOAD (with space) and sometimes UNSIGNED-PAYLOAD (with hyphen). Did you meant UNSIGNED-PAYLOAD (with hyphen) everywhere? I cannot find UNSIGNED PAYLOAD (with space) anywhere in botocore. I agree that we want to check for S3 in the HashedStream case and return UNSIGNED-PAYLOAD there. If you push that change, I will review again.

Apologies for the confusion, I meant UNSIGNED-PAYLOAD (with hyphen). Given your last comment where you explain that you still have doubts about ignoring the hashed stream and using UNSIGNED-PAYLOAD, I will wait on this.

I am still worried that presigning with a known payload may still be busted. Can we test that by presigning a request for (it's silly but free) dynamodb:CreateTable?

I will try to do this test today and get back to you.

endgame commented 2 years ago

it seems that sending UNSIGNED-PAYLOAD allows us to upload any file shrug

Yeah, I believe that's one of the major purposes of this feature: allow the caller to upload something to a fixed location.

Did your have any luck presigning with a known payload?

denar50 commented 2 years ago

Did your have any luck presigning with a known payload?

Unfortunately not. I am able to generate the presigned URL by doing runResourceT $ presignURL regionalEnv signingTime 900 (newCreateTable "test-table-amazonka" (newKeySchemaElement "Artist" (KeyType' "HASH") :| [])), and I saw that I am going through the otherwise in:

Hashed (HashedBytes h b)
          | BS.null b && (_serviceSigningName $ _requestService rq) == "s3" -> unsignedPayload
          | otherwise -> Base.Tag $ BAE.convertToBase BAE.Base16 h

But no idea how to actually use the URL to send the request with curl. Where can I see the JSON payload before it gets hashed in Amazonka? Maybe I can copy that string and send it as a body?

endgame commented 2 years ago

The body will be assembled using instance ToJSON CreateTable:

https://github.com/brendanhay/amazonka/blob/b7d98d3ce163371b41cea38614c5637a3ee7d300/lib/services/amazonka-dynamodb/gen/Amazonka/DynamoDB/CreateTable.hs#L686-L709

This is because its AWSRequest instance uses postJSON:

https://github.com/brendanhay/amazonka/blob/b7d98d3ce163371b41cea38614c5637a3ee7d300/lib/services/amazonka-dynamodb/gen/Amazonka/DynamoDB/CreateTable.hs#L636

Which is defined in amazonka-core:Amazonka.Request:

https://github.com/brendanhay/amazonka/blob/b7d98d3ce163371b41cea38614c5637a3ee7d300/lib/amazonka-core/src/Amazonka/Request.hs#L76-L77

The easiest thing to do is probably to call Aeson.encode on the CreateTable value, or if you want to be absolutely sure of what's going into the signer, hack something from Debug.Trace into one of these request builder functions.

denar50 commented 2 years ago

So doing encode on the CreateTable result yields:

"{\"AttributeDefinitions\":[],\"KeySchema\":[{\"AttributeName\":\"Artist\",\"KeyType\":\"HASH\"}],\"TableName\":\"test-table-amazonka\"}"

Which I use to curl the URL returned in the presign:

curl -X POST "https://dynamodb.eu-west-1.amazonaws.com/?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA6QD462GKY22M9W4D%2F20220422%2Feu-west-1%2Fdynamodb%2Faws4_request&X-Amz-Date=20220422T064518Z&X-Amz-Expires=900&X-Amz-Security-Token=FwoGZXIvYXdzEJf%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDJ1ibfO4FMp1iN8t8iLqAbFuqPE0sWK%2Fehok9g2DM7pdst%2FnaNTshzppCk0nomhZQHlKgfWNNif2nOG%2BBF92TC2SamKggmHG0WIoAdclZgQ0ZSdibiUYU%2BRi%2BkKNPariesiVSPqnJFGdrpMl1DkyfdM7psm%2BpvMSU29iH%2FWBsld5vwxtYTSjX%2BoiTBjIzgOWSM1L0mwzk%2BjqykR6OIHbHxJQKmFubQv%2BG4%2B%2FCHX90WkNGaMCQVdEeVttRPzS2RJ1b&X-Amz-SignedHeaders=content-type%3Bhost%3Bx-amz-target&X-Amz-Signature=9a4bb4a5a9354c44eaa52b20a2b839bf11316830c6903b602b3486a3599068cb" -d "{\"AttributeDefinitions\":[],\"KeySchema\":[{\"AttributeName\":\"Artist\",\"KeyType\":\"HASH\"}],\"TableName\":\"test-table-amazonka\"}"

And now I am getting HTTP/1.1 404 Not Found :dizzy:

I get the same result even if I send an empty payload or if I do PUT instead of POST.

endgame commented 2 years ago

I won't have time to read this weekend, but on Tuesday I intend to read the AWS Sigv4 stuff closely and will try and figure out what we're doing wrong. I'm starting to think we're in "did this ever work outside S3?" territory. Your RDS token stuff is working though, right?

denar50 commented 2 years ago

Your RDS token stuff is working though, right?

Yes. It works as intended.

endgame commented 2 years ago

I've spent this week looking at the generator instead. I should be able to look at this in the next day or so.