brendanhay / amazonka

A comprehensive Amazon Web Services SDK for Haskell.
https://amazonka.brendanhay.nz

Feature Request: Accelerated uploads with presigned S3 requests #499

Open MaxGabriel opened 5 years ago

MaxGabriel commented 5 years ago

AWS S3 supports accelerated uploads (Transfer Acceleration), in which you send your request to the s3-accelerate endpoint instead of the normal S3 URL. AWS's accelerated upload comparison tool suggests uploads from SF to us-east-1 would be 39% faster for us.

This blog post goes into some additional detail: https://medium.com/@aakashbanerjee/upload-files-to-amazon-s3-from-the-browser-using-pre-signed-urls-4602a9a90eb5

MaxGabriel commented 5 years ago

Actually, this might already be possible with presignWith, by passing a function that changes the endpoint of the Service. I got successful responses this way, but the file I PUT didn't actually show up in S3. Maybe that's some issue on my AWS end.

MaxGabriel commented 4 years ago

Ok, I think you can sort of do this, but additional work is needed. Right now you can use presignWith and modify the endpoint of the S3 service to be the accelerated one. However, when you generate an upload request with putObject, it will include the bucket name in the path, which accelerated uploads don't use because the bucket is already part of the hostname. For example, see below how the bucket name mercury-technologies-user-uploads-dev is repeated in both the endpoint and the path:

"https://mercury-technologies-user-uploads-dev.s3-accelerate.amazonaws.com/mercury-technologies-user-uploads-dev/test.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=CREDHERE%2F20200223%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200223T052542Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=SIGNATUREHERE"

This upload will work, but it will use the bucket name as part of the object key in S3. So the object will be in the mercury-technologies-user-uploads-dev bucket with a key of mercury-technologies-user-uploads-dev/test.pdf (not just test.pdf, as you would normally get).
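To make the intended URL shape concrete, here is a minimal sketch in plain Haskell (base only) of what the host and path would need to look like. None of this is amazonka API: accelerateHost, stripBucketPrefix and accelerateUrl are hypothetical helper names used only for illustration. Note that the path would have to be rewritten before the request is signed, since the SigV4 signature covers the path; editing an already-presigned URL this way would invalidate its signature.

```haskell
import Data.List (stripPrefix)

-- Hypothetical helpers for illustration only; not part of amazonka.

-- | Virtual-hosted-style host name for the accelerate endpoint:
-- the bucket name lives in the host, not in the path.
accelerateHost :: String -> String
accelerateHost bucket = bucket ++ ".s3-accelerate.amazonaws.com"

-- | Drop a leading "/<bucket>" segment from a path-style request path.
-- The path is returned unchanged if it does not start with exactly
-- that segment (so "/bucket2/x" is not touched for bucket "bucket").
stripBucketPrefix :: String -> String -> String
stripBucketPrefix bucket path =
  case stripPrefix ('/' : bucket) path of
    Just rest@('/' : _) -> rest
    _                   -> path

-- | The unsigned URL an accelerated upload should target.
accelerateUrl :: String -> String -> String
accelerateUrl bucket path =
  "https://" ++ accelerateHost bucket ++ stripBucketPrefix bucket path
```

Loaded in GHCi, accelerateUrl "example-bucket" "/example-bucket/test.pdf" evaluates to "https://example-bucket.s3-accelerate.amazonaws.com/test.pdf", so the object key would be test.pdf rather than example-bucket/test.pdf.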

I don't see a way on putObject to tell it not to include the bucket name in the generated request. It looks like other SDKs have a useAcceleratedEndpoint flag which I presume tells them to not use the bucket name in the path component.

Would it be desirable to PR some change to PutObject to tell it not to include the bucket name in the generated path? I'm not sure if this is the correct approach; I haven't thought about it that much.

endgame commented 3 years ago

Your plan to add a useAcceleratedEndpoint flag sounds reasonable, but I haven't thought deeply about it and don't know how you'll do that. You'll probably need to fiddle with the files in config/ in the root of the repo, which are what drive the code generator, and then do some extra wrangling so that the setting rewrites the request in the correct way.

JonathanLorimer commented 10 months ago

@endgame Is there any documentation on how the generation / templating in this library works?