aws / aws-encryption-sdk-java

AWS Encryption SDK
https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/introduction.html
Apache License 2.0

Support mark with CryptoInputStream #1279

Closed neetikasinghal closed 7 months ago

neetikasinghal commented 1 year ago

Problem:

CryptoInputStream does not support mark when used for encryption. When the Amazon S3 client retries a failed request (for example, on throttling), it expects the request stream to support mark so that it can reset the stream before the retry; because CryptoInputStream does not support mark, an exception is thrown. Ref: https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-core/src/main/java/com/amazonaws/http/AmazonHttpClient.java#L1284

Background: When uploading objects to Amazon S3 using streams (either through the S3 client or the Transfer Manager), it is possible to run into network connectivity or timeout issues. By default, the AWS SDK for Java attempts to retry these failed transfers. The input stream is marked before the start of the transfer and reset before retrying. The SDK recommends that customers use resettable streams (streams that support the mark and reset operations). If the stream does not support mark and reset, the SDK throws a ResetException on transient failures when retries are enabled.
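To make the retry contract concrete, here is a minimal, self-contained sketch of the mark/reset behavior the SDK relies on. It uses a plain ByteArrayInputStream (which does support mark) purely for illustration; CryptoInputStream.markSupported() returns false, so the reset step below is exactly what fails during an S3 retry.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class MarkResetDemo {
    public static void main(String[] args) throws IOException {
        // ByteArrayInputStream supports mark/reset, so a retry can re-read
        // the same bytes from the marked position.
        InputStream in = new ByteArrayInputStream("payload".getBytes());
        System.out.println(in.markSupported()); // true

        in.mark(1024);                    // remember the current position (read limit 1024)
        byte[] firstAttempt = in.readNBytes(3);
        in.reset();                       // rewind to the mark, as the SDK does before a retry
        byte[] retryAttempt = in.readNBytes(3);
        System.out.println(Arrays.equals(firstAttempt, retryAttempt)); // true
    }
}
```

When the request stream's markSupported() is false instead, the SDK cannot rewind it, and the retry surfaces as the ResetException shown in the stack trace below.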

Sample Stack trace:

[WARN ][o.e.s.SnapshotShardsService] [a3b4d48dde76d9008d530db68ed6d680] [[index-name][28]][cs-automated-enc:manual/xxxx] failed to snapshot shard
    UncategorizedExecutionException[Failed execution]; nested: ExecutionException[java.io.IOException: Unable to upload object [xxx] using a single upload]; nested: IOException[Unable to upload object [xxx] using a single upload]; nested: ResetException[The request to the service failed with a retryable reason, but resetting the request input stream has failed. See exception.getExtraInfo or debug-level logging for the original failure that caused this retry.;  If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)]; nested: IOException[Resetting to invalid mark];
    at org.elasticsearch.common.util.concurrent.FutureUtils.rethrowExecutionException(FutureUtils.java:91)
    at org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:83)
    at org.elasticsearch.common.util.concurrent.ListenableFuture$1.doRun(ListenableFuture.java:111)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at org.elasticsearch.common.util.concurrent.EsExecutors$DirectExecutorService.execute(EsExecutors.java:253)
    at org.elasticsearch.common.util.concurrent.ListenableFuture.notifyListener(ListenableFuture.java:106)
    at org.elasticsearch.common.util.concurrent.ListenableFuture.lambda$done$0(ListenableFuture.java:98)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
    at org.elasticsearch.common.util.concurrent.ListenableFuture.done(ListenableFuture.java:98)
    at org.elasticsearch.common.util.concurrent.BaseFuture.setException(BaseFuture.java:162)
    at org.elasticsearch.common.util.concurrent.ListenableFuture.onFailure(ListenableFuture.java:135)
    at org.elasticsearch.action.StepListener.innerOnFailure(StepListener.java:67)
    at org.elasticsearch.action.NotifyOnceListener.onFailure(NotifyOnceListener.java:47)
    at org.elasticsearch.action.support.GroupedActionListener.onResponse(GroupedActionListener.java:63)
    at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:89)
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.executeOneFileSnapshot(BlobStoreRepository.java:2166)
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.lambda$executeOneFileSnapshot$70(BlobStoreRepository.java:2171)
    at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:73)
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:752)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
    Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Unable to upload object [xxx] using a single upload
    at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.getValue(BaseFuture.java:273)
    at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:246)
    at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:65)
    at org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:76)
    ... 21 more
    Caused by: java.io.IOException: Unable to upload object [xxx] using a single upload
    at org.elasticsearch.repositories.s3.S3BlobContainer.executeSingleUpload(S3BlobContainer.java:430)
    at org.elasticsearch.repositories.s3.S3BlobContainer.lambda$writeBlob$1(S3BlobContainer.java:162)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at org.elasticsearch.repositories.s3.SocketAccess.doPrivilegedIOException(SocketAccess.java:48)
    at org.elasticsearch.repositories.s3.S3BlobContainer.writeBlob(S3BlobContainer.java:160)
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.snapshotFile(BlobStoreRepository.java:2556)
    at org.elasticsearch.repositories.blobstore.BlobStoreRepository.lambda$executeOneFileSnapshot$70(BlobStoreRepository.java:2170)
    ... 6 more
    Caused by: com.amazonaws.ResetException: The request to the service failed with a retryable reason, but resetting the request input stream has failed. See exception.getExtraInfo or debug-level logging for the original failure that caused this retry.;  If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.resetRequestInputStream(AmazonHttpClient.java:1465)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1266)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1139)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5054)
    --
    at org.elasticsearch.repositories.s3.S3BlobContainer.executeSingleUpload(S3BlobContainer.java:426)
    ... 12 more
    Caused by: java.io.IOException: Resetting to invalid mark
    at java.base/java.io.BufferedInputStream.reset(BufferedInputStream.java:454)
    at com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106)
    at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:120)
    at com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168)
    at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:120)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.resetRequestInputStream(AmazonHttpClient.java:1463)
    ... 32 more

Solution:

Providing mark support in CryptoInputStream would be highly useful for our application and would give out-of-the-box support for S3 retries.
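Until such support exists, one possible workaround (a sketch, not an endorsed fix) is to wrap the non-resettable stream in a java.io.BufferedInputStream, which restores mark/reset within its buffer. The caveat, visible in the stack trace above, is that the mark only survives as long as the buffer and read limit cover everything read before the retry; otherwise reset still fails with "Resetting to invalid mark". The demo below simulates a CryptoInputStream with a FilterInputStream whose markSupported() returns false:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedWrapDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for CryptoInputStream: a stream that does not support mark.
        InputStream nonResettable = new FilterInputStream(
                new ByteArrayInputStream(new byte[64])) {
            @Override public boolean markSupported() { return false; }
        };
        System.out.println(nonResettable.markSupported()); // false

        // Wrapping in BufferedInputStream restores mark/reset, but only
        // within the buffer: reading past the mark's read limit invalidates it.
        InputStream wrapped = new BufferedInputStream(nonResettable, 16);
        System.out.println(wrapped.markSupported()); // true

        wrapped.mark(8);          // read limit of only 8 bytes
        wrapped.readNBytes(32);   // read well past that limit
        try {
            wrapped.reset();
        } catch (IOException e) {
            System.out.println(e.getMessage()); // Resetting to invalid mark
        }
    }
}
```

For the real S3 upload path, the analogous knob is the one the ResetException message itself points at: request.getRequestClientOptions().setReadLimit(int) must be set at least as large as the number of bytes the client may read before a retry. Buffering an entire encrypted object in memory this way may of course be impractical for large uploads, which is why first-class mark support in CryptoInputStream is the preferable fix.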

justplaz commented 1 year ago

Hello,

Thank you for raising this issue. You are correct that the AWS Encryption SDK does not currently support mark/reset in its CryptoInputStream implementation. Unfortunately, there are some challenges around implementing this functionality in a secure way, and it is unlikely that we will be able to support this in the current version of the AWS Encryption SDK.