aws / amazon-s3-encryption-client-java

The Amazon S3 Encryption Client is a client-side encryption library that enables you to encrypt an object locally to ensure its security before passing it to Amazon Simple Storage Service (Amazon S3).
Apache License 2.0

Multipart upload with CSE using RSA key pair #268

Open sid22 opened 1 month ago

sid22 commented 1 month ago

Problem:

My code is roughly

    val s3ClientObject =
      S3Client
        .builder()
        .credentialsProvider(
              StaticCredentialsProvider.create(
                AwsBasicCredentials.create(
                  spec.accessKey.get,
                  getSecretKey(metadataEncryptionUtils)
                )
              )
        )
        .region(Region.US_EAST_1)
        .build()

    val s3AsyncClientObject =
      S3AsyncClient
        .builder()
        .credentialsProvider(
              StaticCredentialsProvider.create(
                AwsBasicCredentials.create(
                  spec.accessKey.get,
                  getSecretKey(metadataEncryptionUtils)
                )
              )
        )
        .region(Region.US_EAST_1)
        .build()

Now I create an S3EncryptionClient that wraps the two clients above:

    val encObject = 
      S3EncryptionClient
        .builder()
        .rsaKeyPair(userKeys)
        .enableLegacyUnauthenticatedModes(true)
        .enableLegacyWrappingAlgorithms(true)
        .wrappedClient(s3ClientObject)
        .wrappedAsyncClient(s3AsyncClientObject)
        .enableDelayedAuthenticationMode(true)
        .build()

I can use this encObject for operations such as creating a bucket, and I can also upload files to an S3 bucket.

However, when I try to upload a large file (say ~200 MB) with multipart upload, it fails with the following error:

Caused by: software.amazon.awssdk.crt.http.HttpException: Amount of data streamed out does not match the previously declared length.
    at software.amazon.awssdk.http.crt.internal.response.CrtResponseAdapter.onResponseComplete(CrtResponseAdapter.java:108) ~[thirdparty-intellij-deps.jar:?]
    at software.amazon.awssdk.crt.http.HttpStreamResponseHandlerNativeAdapter.onResponseComplete(HttpStreamResponseHandlerNativeAdapter.java:58) ~[thirdparty-intellij-deps.jar:?]
Exception in thread "AwsEventLoop 9" java.lang.IllegalStateException: Encountered fatal error in publisher
    at software.amazon.awssdk.utils.async.SimplePublisher.panicAndDie(SimplePublisher.java:339)
    at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:226)
    at software.amazon.awssdk.utils.async.SimplePublisher.send(SimplePublisher.java:128)
    at software.amazon.awssdk.utils.async.InputStreamConsumingPublisher.doBlockingWrite(InputStreamConsumingPublisher.java:58)
    at software.amazon.awssdk.core.async.BlockingInputStreamAsyncRequestBody.writeInputStream(BlockingInputStreamAsyncRequestBody.java:76)
    at software.amazon.awssdk.core.internal.async.InputStreamWithExecutorAsyncRequestBody.doBlockingWrite(InputStreamWithExecutorAsyncRequestBody.java:108)
    at software.amazon.awssdk.core.internal.async.InputStreamWithExecutorAsyncRequestBody.lambda$subscribe$0(InputStreamWithExecutorAsyncRequestBody.java:81)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.IllegalStateException: Must use either different key or iv for GCM encryption
    at java.base/com.sun.crypto.provider.CipherCore.checkReinit(CipherCore.java:1088)
    at java.base/com.sun.crypto.provider.CipherCore.update(CipherCore.java:662)
    at java.base/com.sun.crypto.provider.AESCipher.engineUpdate(AESCipher.java:380)
    at java.base/javax.crypto.Cipher.update(Cipher.java:1869)
    at software.amazon.encryption.s3.internal.CipherSubscriber.onNext(CipherSubscriber.java:52)
    at software.amazon.encryption.s3.internal.CipherSubscriber.onNext(CipherSubscriber.java:16)
    at software.amazon.awssdk.utils.async.SimplePublisher.doProcessQueue(SimplePublisher.java:267)
    at software.amazon.awssdk.utils.async.SimplePublisher.processEventQueue(SimplePublisher.java:224)
    ... 10 more

If I directly use the s3ClientObject it works.

Solution:

Is there some limitation on CSE with multipart uploads?

justplaz commented 1 month ago

Hello Sid22,

The Async S3 Encryption Client uses the Java implementation of Reactive Streams to send data to S3. This is generally more efficient, but it causes problems for multipart uploads, because Reactive Streams have no equivalent of the mark/reset or seek operations of "traditional" input/output streams. As a result, when an individual part is retried, the cipher cannot rewind its state and replay the data it has already encrypted, and you see errors like the one you posted. If you still need the Async client, you can set the enableMultipartPutObject(true) option, which guards against retries; the request will still fail under adverse network conditions, but you at least get a proper modeled error. We are working on a better fix for this, but there is currently no ECD.
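The cipher-state problem described above can be reproduced in isolation with the JDK's own AES-GCM cipher. The following is a minimal sketch and not the client's code: `GcmReuseDemo` and `reuseAfterFinal` are hypothetical names, and the sketch only assumes that a retry ends up feeding data to a cipher that has already been finalized. It shows that such a cipher rejects further `update` calls with an `IllegalStateException`, the same exception class that appears at the bottom of the posted stack trace.

```scala
import java.security.SecureRandom
import javax.crypto.{Cipher, KeyGenerator}
import javax.crypto.spec.GCMParameterSpec
import scala.util.Try

// Hypothetical demo object; it uses only JDK classes, not the S3 Encryption Client.
object GcmReuseDemo {

  /** Finalizes an AES-GCM encryption, then tries to push more data through
    * the same cipher instance without re-initializing it. */
  def reuseAfterFinal(): Try[Array[Byte]] = {
    val keyGen = KeyGenerator.getInstance("AES")
    keyGen.init(256)
    val key = keyGen.generateKey()

    // 12-byte IV, the conventional size for GCM.
    val iv = new Array[Byte](12)
    new SecureRandom().nextBytes(iv)

    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv))

    cipher.update("part-1".getBytes("UTF-8"))
    cipher.doFinal() // the cipher now requires a fresh key or IV

    // Simulates a replay of already-encrypted data: there is no way to
    // "seek back" inside the cipher, so the JCE provider refuses.
    Try(cipher.update("part-1 again".getBytes("UTF-8")))
  }
}
```

Calling `GcmReuseDemo.reuseAfterFinal()` returns a `Failure` wrapping an `IllegalStateException`. A plain input stream can be reset and re-read, but the cipher state cannot be rewound without re-initializing it with a new key or IV.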

Alternatively, for the most robust multipart upload solution, you can set the same option (enableMultipartPutObject(true)) on the (non-Async) S3EncryptionClient, which uses "traditional" input/output streams for multipart upload through the putObject API. You can also use the "low-level" multipart upload API in the same client (createMultipartUpload/uploadPart/completeMultipartUpload), but you must always upload the parts in order, because shuffled parts cause decryption problems later, and you must avoid retrying individual parts for the same reason as above.

Let us know if you have any further questions, thanks!

sid22 commented 1 month ago

@justplaz thanks for the detailed explanation. I modified my code to add enableMultipartPutObject(true) to the S3EncryptionClient and retried.

However, I still see the same error. On a side note, we are using the "low-level" multipart upload API (createMultipartUpload/uploadPart/completeMultipartUpload).

We have a method that expects an object implementing the S3Client interface and uses it with the low-level multipart methods to do the upload.

If I pass an S3Client object directly, the multipart upload succeeds, so it is not an issue of improper logic in the upload method.

justplaz commented 1 month ago

I see, that makes sense then. If you're using the low-level multipart upload API, then setting enableMultipartPutObject(true) won't have any effect. It only modifies the behavior of putObject; in other words, it enables high-level multipart upload.

For low-level multipart upload, you need to ensure that each part is encrypted sequentially and in the correct order. Otherwise, the encryption will fail. Please refer to the low-level multipart upload example to see how to upload parts in sequence.
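To illustrate why order matters, here is a small JDK-only model of sequential part encryption. It is a sketch under stated assumptions and not the S3 Encryption Client's implementation: `SequentialPartsSketch`, `encryptParts`, and `decrypt` are hypothetical names, and the model assumes one AES-GCM cipher shared by all parts, with `update` for every part except the last and `doFinal` for the last one.

```scala
import javax.crypto.{Cipher, SecretKey}
import javax.crypto.spec.GCMParameterSpec
import scala.util.Try

// Hypothetical model; it uses only JDK classes, not the S3 Encryption Client.
object SequentialPartsSketch {
  private val TagBits = 128

  private def newCipher(mode: Int, key: SecretKey, iv: Array[Byte]): Cipher = {
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(mode, key, new GCMParameterSpec(TagBits, iv))
    cipher
  }

  // Cipher.update may return null when it has no output to emit yet.
  private def orEmpty(bytes: Array[Byte]): Array[Byte] =
    if (bytes == null) Array.emptyByteArray else bytes

  /** Encrypts the parts in order with a single cipher instance.
    * Only the last part calls doFinal, which appends the GCM tag. */
  def encryptParts(parts: Vector[Array[Byte]], key: SecretKey, iv: Array[Byte]): Vector[Array[Byte]] = {
    val cipher = newCipher(Cipher.ENCRYPT_MODE, key, iv)
    parts.zipWithIndex.map { case (part, index) =>
      val isLast = index == parts.size - 1
      if (isLast) cipher.doFinal(part) else orEmpty(cipher.update(part))
    }
  }

  /** Concatenates the encrypted parts in the given order and decrypts them
    * as one stream; a wrong order fails the authentication tag check. */
  def decrypt(cipherParts: Vector[Array[Byte]], key: SecretKey, iv: Array[Byte]): Try[Array[Byte]] = {
    val cipher = newCipher(Cipher.DECRYPT_MODE, key, iv)
    val whole = cipherParts.foldLeft(Array.emptyByteArray)(_ ++ _)
    Try(cipher.doFinal(whole))
  }
}
```

In this model, decrypting the parts in their original order recovers the plaintext, while decrypting them in a different order returns a `Failure`, because the stream no longer matches the authentication tag. Without a `doFinal` on the last part, the tag would never be written at all.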

Let us know if you are still having issues, thanks!

sid22 commented 1 week ago

@justplaz thanks for the example. One key thing we are not doing is:

    // Set sdkPartType to SdkPartType.LAST for last part of the multipart upload.
    //
    // Note: Set sdkPartType parameter to SdkPartType.LAST for last part is required for Multipart Upload in S3EncryptionClient to call cipher.doFinal()

We were uploading parts sequentially and in order, but we were not setting the above anywhere. I will modify the code to do this and then report back.