aws / aws-sdk-go-v2

AWS SDK for the Go programming language.
https://aws.github.io/aws-sdk-go-v2/docs/
Apache License 2.0
2.68k stars 651 forks source link

Cannot create presign url for PUT with metadata in SDKv2, whereas it was working for SDKv1 #2702

Closed Harital closed 4 months ago

Harital commented 5 months ago

Pre-Migration Checklist

Go Version Used

1.22

Describe the Migration Issue

In AWS SDK v1 there could be constructed presign urls for put objects with the metadata embedded in it. It was easy and transparent for the users of these url's. A simple PUT to the presign url would suffice to get an object uploaded in an S3 with the proper metadata.

In AWS SDK v2 this feature is broken. The client needs to send a header with the metadata alongside the presign url in order to work. Otherwise, a 403 error is returned. This makes no sense, as the user of the presign url's does not have (or at least, should not have) any knowledge of the S3 internals.

Code Comparison

v1: metadata was added to the query

req, _ := s3ClientV1.PutObjectRequest(&s3sdkv1aws.PutObjectInput{
    Bucket: aws.String("myBucket"),
    Key:    aws.String("myFilePath"),
})

// Add user metadata.
query := req.HTTPRequest.URL.Query()
query.Add("x-amz-meta-whatever", "metadataValue")
req.HTTPRequest.URL.RawQuery = query.Encode()

v2: new way of adding metadata

request, err := s3Clientv2.PresignPutObject(ctx, &s3.PutObjectInput{
    Bucket:      aws.String("myBucket"),
    Key:         aws.String("myFilePath"),
    Metadata:    map[string]string{"whatever": "metadataValue"},
}, func(opts *s3.PresignOptions) {
    opts.Expires = clock.Duration(8 * int64(time.Hour))
})

Observed Differences/Errors

When making a PUT to a presign url generated with v1, the object was properly uploaded with its metadata. When doing so with v2 a 403 error is returned, unless the metadata is passed as a header alongside the presign url.

Additional Context

This issue has been causing several prlblems several times in several languages, but it seems not to be any clear solution. https://github.com/aws/aws-sdk-go/issues/1467 https://github.com/aws/aws-sdk-go-v2/issues/1474 https://github.com/aws/aws-sdk-java-v2/discussions/5060 https://github.com/aws/aws-sdk-java-v2/issues/2200

RanVaknin commented 5 months ago

Hi @Harital ,

Thanks for reaching out and providing in depth explanation of the problem.

I want to address a few points you are making:

In AWS SDK v2 this feature is broken.

The functionality is not broken in v2, it simply does not exist the same way it did in v1. The presigning behavior changed in v2 (and many of the other newer SDKs) because of inconsistencies with the S3 service itself.

By default, every presigned URL parameter would be hoisted into the query parameter excluding a specific list of headers that must be signed. I don't have full visibility as to why this was put in place, but my guess is because the S3 service either respects or rejects params supplied in Query String vs Signed Headers and vice versa, when in fact either one should be valid. In many cases, providing a presigned param as a query string raises a mismatch error, or even worse fails silently.

This makes no sense, as the user of the presign url's does not have (or at least, should not have) any knowledge of the S3 internals.

S3 allows you to provide presigned parameters in two forms; As a query parameter hoisted into the presigned URL, or as a key under the X-Amz-SignedHeaders query parameter. After scouring S3's documentation I can say that this is very poorly documented. My best guess is that it was introduced as a security measure to force consumers of the presigned URL to "acknowledge" the params they are passing by explicitly adding to the request as headers. This reasoning also makes sense to me since when a presigned URL with a signed header is supplied, the value is obfuscated from the URL itself and requires the consumer to provide it themselves.

For example:

https://testbucket.s3.amazonaws.com/foo?
X-Amz-Algorithm=AWS4-HMAC-SHA256&
X-Amz-Credential=REDACTED/20240702/us-east-1/s3/aws4_request&
X-Amz-Date=20240702T195646Z&
X-Amz-Expires=3600&
- X-Amz-SignedHeaders=content-length;host&
- x-amz-meta-whatever=metadataValue&
+ X-Amz-SignedHeaders=content-length;host;x-amz-meta-whatever&
X-Amz-Signature=REDACTED

The presigning behavior has been a pain point across all SDKs since the presigning spec from s3 is not public. We are working on consolidating presigning knowledge base in a cross SDK effort and hopefully can incorporate this into every SDK to avoid exactly these sort of situations.

In the meantime I'll see if we can come up with a solution that provides the same functionality in as in v1.

Thanks, Ran~

RanVaknin commented 4 months ago

Hi @Harital ,

Until we do a complete re-design of the presigner, you can achieve the same functionality as in v1 using a middleware:

func presignPut() {
    cfg, err := config.LoadDefaultConfig(context.TODO(), config.WithRegion("us-east-1"))
    if err != nil {
        panic(err)
    }

    myMiddleware := middleware.SerializeMiddlewareFunc("MyTestMiddleware",
        func(ctx context.Context, input middleware.SerializeInput, next middleware.SerializeHandler,
        ) (
            output middleware.SerializeOutput, metadata middleware.Metadata, err error,
        ) {
            req, ok := input.Request.(*smithyhttp.Request)
            if !ok {
                return output, metadata, fmt.Errorf("unexpected transport: %T", input.Request)
            }

            req.URL.RawQuery = req.URL.RawQuery + "&x-amz-meta-whatever=metadataValue"

            if err != nil {
                panic(err)
            }

            return next.HandleSerialize(ctx, input)
        })

    input := &s3.PutObjectInput{
        Bucket: aws.String("bucket"),
        Key:    aws.String("foo12347.txt"),
        Body:   strings.NewReader("Hello World"),
    }

    client := s3.NewFromConfig(cfg, func(options *s3.Options) {
        options.APIOptions = append(options.APIOptions, func(stack *middleware.Stack) error {
            return stack.Serialize.Add(myMiddleware, middleware.After)
        })
    })

    presigner := s3.NewPresignClient(client)
    url, err := presigner.PresignPutObject(context.TODO(), input)
    if err != nil {
        fmt.Println("Failed to presign request", err)
        return
    }

    decodedURI, err := url2.QueryUnescape(url.URL)
    if err != nil {
        panic(err)
    }
    fmt.Println("The presigned URL is:", decodedURI)

    // ... rest of your code
}

This will result in the desired behavior where your metadata value is hoisted to the query parameter:

The presigned URL is: https://bucket.s3.us-east-1.amazonaws.com/foo12347.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=REDACTED/20240708/us-east-1/s3/aws4_request&X-Amz-Date=20240708T230109Z&X-Amz-Expires=900&X-Amz-SignedHeaders=content-length;content-type;host&x-amz-meta-whatever=metadataValue&x-id=PutObject&X-Amz-Signature=REDACTED

And the object is correctly uploaded with the metadata to s3: image

Thanks, Ran~

github-actions[bot] commented 4 months ago

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.