aws-amplify / aws-sdk-android

AWS SDK for Android. For more information, see our web site:
https://docs.amplify.aws

Transfer Utility uploaded object validation #3047

Open emarc-m opened 2 years ago

emarc-m commented 2 years ago

State your question

Is there a way to use the Transfer Utility so that, for multipart uploads, it validates that the object stored in S3 is the same as what the client requested to upload?

I've seen that for single-part uploads, setting the Content-MD5 header through ObjectMetadata#contentMD5 makes S3 validate the uploaded object against the value the client set. Is there a similar mechanism for multipart uploads?
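For reference, the Content-MD5 value is the Base64 encoding of the raw 128-bit MD5 digest, not the hex string. A minimal sketch of computing it on the client follows; the helper name contentMd5Base64 is hypothetical and not part of the SDK, and java.util.Base64 is assumed to be available (on Android that means API level 26 or later; android.util.Base64 with NO_WRAP would be the alternative on older levels).

```kotlin
import java.io.File
import java.security.MessageDigest
import java.util.Base64

// Hypothetical helper (not an SDK function): streams the file through MD5 and
// returns the Base64 form that the Content-MD5 header expects.
fun contentMd5Base64(file: File): String {
    val digest = MessageDigest.getInstance("MD5")
    file.inputStream().use { input ->
        val buffer = ByteArray(8192)
        while (true) {
            val read = input.read(buffer)
            if (read < 0) break
            digest.update(buffer, 0, read)
        }
    }
    return Base64.getEncoder().encodeToString(digest.digest())
}
```

The result could then be assigned to meta.contentMD5 before calling upload, as in the snippet further down.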

Additionally, is there a way to use the Transfer Utility with the other checksum algorithms described in "Using supported checksum algorithms"?

For example: the client computes a SHA-1 hash and passes it to something like TransferUtility#upload(bucket, key, file, hash). Once the upload completes, the Transfer Utility verifies that the x-amz-checksum-sha1 value from S3 matches what the client set.

Thank you.

Which AWS Services are you utilizing?

AWS S3

Provide code snippets (if applicable)

For example, this code validates uploaded objects using Content-MD5, but only for single-part uploads:

val transferUtility = ...
val fileToUpload = ...
val transferListener = ...

val meta = ObjectMetadata()
meta.contentMD5 = <client_computed_md5_of_file>

// If the md5 provided by the client in meta.contentMD5 does not match with S3's, this will invoke
//   transferListener#onError, see Sample Message
transferUtility.upload("bucket", "key", fileToUpload, meta).setTransferListener(transferListener)

Sample Message:

10-21 16:41:34.599  3287  3287 D App: com.amazonaws.services.s3.model.AmazonS3Exception: The Content-MD5 you specified was invalid. (Service: Amazon S3; Status Code: 400; Error Code: InvalidDigest; Request ID:<id>), S3 Extended Request ID: <ext_request_id>
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:742)
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:420)
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:229)
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4829)
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1912)
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.uploadSinglePartAndWaitForCompletion(UploadTask.java:282)
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:114)
10-21 16:41:34.599  3287  3287 D App:   at com.amazonaws.mobileconnectors.s3.transferutility.UploadTask.call(UploadTask.java:59)
10-21 16:41:34.599  3287  3287 D App:   at java.util.concurrent.FutureTask.run(FutureTask.java:264)
10-21 16:41:34.599  3287  3287 D App:   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1137)
10-21 16:41:34.599  3287  3287 D App:   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:637)
10-21 16:41:34.599  3287  3287 D App:   at java.lang.Thread.run(Thread.java:1012)

sdhuka commented 2 years ago

@emarc-m For multipart uploads, an MD5 checksum is calculated for each part and returned as the ETag. More detail is available under "Using part-level checksums for multi-part uploads".
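To illustrate how the part-level MD5 digests relate to the final object ETag: S3 documents that it concatenates the raw MD5 digests of the parts, takes the MD5 of that concatenation, and appends a dash plus the part count. A minimal sketch under those assumptions follows. The helper name multipartETag is hypothetical, the parts must be split exactly as the upload split them (the part size the Transfer Utility chooses is not confirmed here), and the result is only expected to match for objects whose ETag is MD5-based, which excludes some server-side encryption modes.

```kotlin
import java.security.MessageDigest

// Hypothetical helper (not an SDK function): reproduces the documented shape
// hex(MD5(md5(part1) + md5(part2) + ...)) + "-" + partCount.
fun multipartETag(parts: List<ByteArray>): String {
    require(parts.isNotEmpty()) { "at least one part is required" }
    val outer = MessageDigest.getInstance("MD5")
    for (part in parts) {
        // Each inner digest is the raw 16-byte MD5 of one part.
        outer.update(MessageDigest.getInstance("MD5").digest(part))
    }
    val hex = outer.digest().joinToString("") { "%02x".format(it) }
    return "$hex-${parts.size}"
}
```

This version holds every part in memory for brevity; for large files, each part's digest would more sensibly be computed by streaming the matching byte range of the file.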

TransferUtility currently doesn't support additional checksum algorithms. We will look into adding this in the future.

emarc-m commented 2 years ago

@sdhuka Thank you for the response.

We currently use ETag verification: our backend performs a HEAD Object request and verifies that the ETag matches. This works for both single-part and multipart uploads.

Is it possible for the client to set an expected ETag on the Transfer Utility's upload request so that the upload is verified against the uploaded object's ETag in S3? If they do not match, the Transfer Utility would emit an error, similar to what happens when meta.contentMD5 is set to an incorrect value. It also seems that setting meta.contentMD5 to an incorrect value does not store any object in S3, so no storage is consumed unnecessarily, which is a nice feature (please correct me if I'm mistaken here).
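Until something like this exists in the Transfer Utility, the comparison can be done on the client after the transfer completes. A minimal sketch of the comparison step follows; the helper name eTagsMatch is hypothetical, and it assumes the actual ETag was fetched separately, for example from the ObjectMetadata returned by AmazonS3Client#getObjectMetadata(bucket, key).

```kotlin
// Hypothetical helper (not an SDK function): compares a client-computed ETag
// with the one S3 reports, ignoring surrounding double quotes and letter case.
fun eTagsMatch(expected: String, actual: String): Boolean {
    fun normalize(tag: String) = tag.trim().removeSurrounding("\"").lowercase()
    return normalize(expected) == normalize(actual)
}
```

Unlike a Content-MD5 mismatch, which fails the request itself, this check runs after the object is already stored, so the caller would have to delete the object on a mismatch.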

Please let me know if support for additional checksum algorithms is, or will be, added in an SDK update.

Thank you for the help.