boto / boto3

AWS SDK for Python
https://aws.amazon.com/sdk-for-python/
Apache License 2.0

S3 generate_presigned_url with a RequesterPays bucket fails with a signature mismatch under AWS4-HMAC-SHA256 #3685

Open pkage opened 1 year ago

pkage commented 1 year ago

Describe the bug

When generating a pre-signed RequesterPays S3 get_object URL with boto3, the generated URL is invalid if the signature method used is the recommended v4 signature method. Given that this is the recommended method for S3 signature authentication (and the only one supported in new regions such as eu-central-1), this makes creating these URLs impossible.

It seems like the X-Amz-Request-Payer is not being properly added to the request, causing a signature mismatch on the AWS side when verifying the URL.

Expected Behavior

Expected a valid presigned RequesterPays URL, as is produced when using the default (v2) authentication.

Current Behavior

S3 returns a SignatureDoesNotMatch error with the message "The request signature we calculated does not match the signature you provided. Check your key and signing method." The full XML response is below:

Full response (with AWS AccessKeyIDs redacted)

```xml
<Error>
  <Code>SignatureDoesNotMatch</Code>
  <Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message>
  <AWSAccessKeyId>REDACTED</AWSAccessKeyId>
  <StringToSign>AWS4-HMAC-SHA256
20230427T193611Z
20230427/us-west-2/s3/aws4_request
1793fed204005dc6495ac2637e854bf55d35fa63b33cea4ff9c0ed11444d5be0</StringToSign>
  <SignatureProvided>7c39b096817f4eeb370d00c8f6121bb85b9cdf5c9f8d026df8135e861afaca4b</SignatureProvided>
  <StringToSignBytes>41 57 53 34 2d 48 4d 41 43 2d 53 48 41 32 35 36 0a 32 30 32 33 30 34 32 37 54 31 39 33 36 31 31 5a 0a 32 30 32 33 30 34 32 37 2f 75 73 2d 77 65 73 74 2d 32 2f 73 33 2f 61 77 73 34 5f 72 65 71 75 65 73 74 0a 31 37 39 33 66 65 64 32 30 34 30 30 35 64 63 36 34 39 35 61 63 32 36 33 37 65 38 35 34 62 66 35 35 64 33 35 66 61 36 33 62 33 33 63 65 61 34 66 66 39 63 30 65 64 31 31 34 34 34 64 35 62 65 30</StringToSignBytes>
  <CanonicalRequest>GET
/collection02/level-1/standard/etm/2022/163/033/LE07_L1TP_163033_20220506_20220901_02_T1/LE07_L1TP_163033_20220506_20220901_02_T1_B2.TIF
X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Credential=REDACTED%2Fus-west-2%2Fs3%2Faws4_request&amp;X-Amz-Date=20230427T193611Z&amp;X-Amz-Expires=86400&amp;X-Amz-SignedHeaders=host%3Bx-amz-request-payer
host:usgs-landsat.s3.amazonaws.com
x-amz-request-payer:

host;x-amz-request-payer
UNSIGNED-PAYLOAD</CanonicalRequest>
  <CanonicalRequestBytes>47 45 54 0a 2f 63 6f 6c 6c 65 63 74 69 6f 6e 30 32 2f 6c 65 76 65 6c 2d 31 2f 73 74 61 6e 64 61 72 64 2f 65 74 6d 2f 32 30 32 32 2f 31 36 33 2f 30 33 33 2f 4c 45 30 37 5f 4c 31 54 50 5f 31 36 33 30 33 33 5f 32 30 32 32 30 35 30 36 5f 32 30 32 32 30 39 30 31 5f 30 32 5f 54 31 2f 4c 45 30 37 5f 4c 31 54 50 5f 31 36 33 30 33 33 5f 32 30 32 32 30 35 30 36 5f 32 30 32 32 30 39 30 31 5f 30 32 5f 54 31 5f 42 32 2e 54 49 46 0a 58 2d 41 6d 7a 2d 41 6c 67 6f 72 69 74 68 6d 3d 41 57 53 34 2d 48 4d 41 43 2d 53 48 41 32 35 36 26 58 2d 41 6d 7a 2d 43 72 65 64 65 6e 74 69 61 6c 3d 41 4b 49 41 53 44 43 44 58 35 50 32 34 4e 41 34 5a 4b 55 48 25 32 46 32 30 32 33 30 34 32 37 25 32 46 75 73 2d 77 65 73 74 2d 32 25 32 46 73 33 25 32 46 61 77 73 34 5f 72 65 71 75 65 73 74 26 58 2d 41 6d 7a 2d 44 61 74 65 3d 32 30 32 33 30 34 32 37 54 31 39 33 36 31 31 5a 26 58 2d 41 6d 7a 2d 45 78 70 69 72 65 73 3d 38 36 34 30 30 26 58 2d 41 6d 7a 2d 53 69 67 6e 65 64 48 65 61 64 65 72 73 3d 68 6f 73 74 25 33 42 78 2d 61 6d 7a 2d 72 65 71 75 65 73 74 2d 70 61 79 65 72 0a 68 6f 73 74 3a 75 73 67 73 2d 6c 61 6e 64 73 61 74 2e 73 33 2e 61 6d 61 7a 6f 6e 61 77 73 2e 63 6f 6d 0a 78 2d 61 6d 7a 2d 72 65 71 75 65 73 74 2d 70 61 79 65 72 3a 0a 0a 68 6f 73 74 3b 78 2d 61 6d 7a 2d 72 65 71 75 65 73 74 2d 70 61 79 65 72 0a 55 4e 53 49 47 4e 45 44 2d 50 41 59 4c 4f 41 44</CanonicalRequestBytes>
  <RequestId>DXJ19WH3KYBWFDGK</RequestId>
  <HostId>zhON36gttmp3FGh5kayxrMEI9a+EylC8av17bpAuTFzevr4AwQQudx6c2/7pRTwSAfI9VMUn1bI=</HostId>
</Error>
```

Reproduction Steps

(also as a gist, includes Poetry lockfile)

import boto3
from botocore.client import Config

client = boto3.client(
    's3',
    'us-west-2',

    # replace with your credentials
    aws_access_key_id='AWS_ACCESS_KEY_ID',
    aws_secret_access_key='AWS_SECRET_ACCESS_KEY',

    # !!! with this line the presigned URL is rejected; comment it out
    # to fall back to the default (v2) signing, which works
    config=Config(signature_version='s3v4')
)

# example RequesterPays bucket in us-west-2
bucket = 'usgs-landsat'
key = 'collection02/level-1/standard/etm/2022/163/033/LE07_L1TP_163033_20220506_20220901_02_T1/LE07_L1TP_163033_20220506_20220901_02_T1_B2.TIF'

# should return a valid URL
asset_link = client.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': bucket,
        'Key': key,
        'RequestPayer': 'requester'
    },
    ExpiresIn=(60 * 60 * 24) # 24h
)

# if signature_version is s3v4 or v4, this will not be a valid URL
print(f'signed url: {asset_link}')

Possible Solution

In the CanonicalRequest section of the XML trace above, X-Amz-Request-Payer=requester is not included as a query parameter, yet x-amz-request-payer is listed in X-Amz-SignedHeaders (with an empty value in the canonical headers). I suspect that mismatch is the issue, but I'm not sure how to attach the parameter, as the signing logic seems to live deep inside boto3.
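To make the mismatch easier to see, the hex dump of the canonical request in the trace above can be decoded back to text. The snippet below is only a small sketch using the Python standard library; `hex_tail` is the last part of that dump copied from the error response, and `decode_hex_dump` is a hypothetical helper name, not part of boto3 or botocore.

```python
# Tail of the canonical-request byte dump from the S3 error response above.
hex_tail = (
    "78 2d 61 6d 7a 2d 72 65 71 75 65 73 74 2d 70 61 79 65 72 3a 0a 0a "
    "68 6f 73 74 3b 78 2d 61 6d 7a 2d 72 65 71 75 65 73 74 2d 70 61 79 65 72 0a "
    "55 4e 53 49 47 4e 45 44 2d 50 41 59 4c 4f 41 44"
)


def decode_hex_dump(dump: str) -> str:
    """Turn a space-separated hex dump into ASCII text (hypothetical helper)."""
    # bytes.fromhex ignores the whitespace between byte pairs.
    return bytes.fromhex(dump).decode("ascii")


lines = decode_hex_dump(hex_tail).split("\n")
for line in lines:
    # repr() makes the empty line and the empty header value visible.
    print(repr(line))
```

The first decoded line is the `x-amz-request-payer:` canonical header with nothing after the colon, which suggests S3 saw the header name in the signed-headers list but received no value for it when recomputing the signature.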

Additional Information/Context

Additionally, attempting to inject the header through the event system didn't seem to have any effect, though it's possible I'm misunderstanding something:

# ... before generating the URL ...
def _add_header(request, **kwargs):
    request.headers.add_header('X-Amz-Request-Payer', 'requester')
    print(request.headers)  # for debug

client.meta.events.register_first('before-sign.*.*', _add_header)

SDK version used

boto3 1.26.121

Environment details (OS name and version, etc.)

Python 3.10 on arm64-apple-darwin22.2.0

indrora commented 1 year ago

This is either here in boto3 or deep in botocore. I'll peek into this, but this should definitely be a p0 bug -- There's no reason we should be building invalid pre-signed URLs.

indrora commented 1 year ago

The only reference to s3v4 in boto3 itself is in https://github.com/boto/boto3/blob/8a64e31f3defe3af3098bf641d9926c92c0b0589/tests/integration/test_s3.py#L502

This test isn't particularly conclusive either -- it tests the S3 transfer manager more than anything else.

jun0tpyrc commented 1 year ago

This bug still appears to be present. It makes some downloads impossible, because:

Signature Version 2 is being turned off (deprecated) in Amazon S3. Amazon S3 will then only accept API requests that are signed using Signature Version 4.

keithpeck commented 3 months ago

We received this response from AWS support on this issue:

"During the investigation, The internal SDK team found that boto3 allows you to generate a presigned URL with the signed header x-amz-request-payer. However, the x-amz-request-payer header is not being passed in the presigned URL request. This occurs because boto3 is treating x-amz-request-payer as a header and not as a query parameter causing x-amz-request-payer to not be set in the header resulting in the presigned URL to return the error: SignatureDoesNotMatch [1]. For the presigned URL request to succeed, you would need to supply the x-amz-request-payer header to your presigned request as S3 API cannot supplement the query parameter as a standalone link with the current variant of SigV4."

In summary, a header of x-amz-request-payer: requester needs to be included.

I consider this a partial solution. It's sufficient when objects are downloaded by backend scripts (where you have control over the headers), but it doesn't help when the object is downloaded from a frontend, for example by opening the link in a new window.
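For the backend case, the workaround described above might look roughly like the following. This is a sketch under assumptions rather than a confirmed fix: it uses only the standard library, `example_url` is a shortened, made-up URL in the shape of the one from the report (in practice it would be the `asset_link` returned by `generate_presigned_url`), and `signed_headers` and `build_request` are hypothetical helper names.

```python
from urllib.parse import parse_qs, urlsplit
from urllib.request import Request


def signed_headers(presigned_url: str) -> list[str]:
    """Return the header names listed in X-Amz-SignedHeaders, if any."""
    query = parse_qs(urlsplit(presigned_url).query)
    values = query.get("X-Amz-SignedHeaders", [""])
    return [name for name in values[0].split(";") if name]


def build_request(presigned_url: str) -> Request:
    """Build a GET request, adding the requester-pays header if it was signed."""
    request = Request(presigned_url, method="GET")
    if "x-amz-request-payer" in signed_headers(presigned_url):
        # The header value must match what was signed: "requester".
        request.add_header("x-amz-request-payer", "requester")
    return request


# Made-up URL for illustration only; the signature is a placeholder.
example_url = (
    "https://usgs-landsat.s3.amazonaws.com/some/key.TIF"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-SignedHeaders=host%3Bx-amz-request-payer"
    "&X-Amz-Signature=0000"
)
request = build_request(example_url)
print(signed_headers(example_url))
print(request.header_items())
```

With a real presigned URL, the prepared request could then be sent with `urllib.request.urlopen(request)`. A plain browser navigation cannot attach this header, which is why the frontend case remains unsolved.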