aws / aws-sdk-java-v2

The official AWS SDK for Java - Version 2
Apache License 2.0
2.16k stars 833 forks source link

NPE in TransformingAsyncResponseHandler#onError #1812

Open ktulinger opened 4 years ago

ktulinger commented 4 years ago

Describe the bug

From time to time, we get an ERROR log from TransformingAsyncResponseHandler telling us that there was an NPE. It does not affect our app in any way, however it feels that it shouldn't throw that.

Expected Behavior

No log record nor NPE.

Current Behavior

We get this log statement after NPE has been raised in aws sdk core

5352c77fa1b2:2020-05-04 09:36:31,681 [speculation-executor-1] ERROR s.a.a.c.i.h.BaseAsyncClientHandler - Error thrown from TransformingAsyncResponseHandler#onError, ignoring.
5352c77fa1b2:java.lang.NullPointerException: null
5352c77fa1b2:   at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.onError(AsyncResponseHandler.java:76)
5352c77fa1b2:   at software.amazon.awssdk.core.internal.http.async.CombinedResponseAsyncHttpResponseHandler.onError(CombinedResponseAsyncHttpResponseHandler.java:70)
5352c77fa1b2:   at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.lambda$doExecute$3(BaseAsyncClientHandler.java:223)
5352c77fa1b2:   at software.amazon.awssdk.utils.FunctionalUtils.runAndLogError(FunctionalUtils.java:40)
5352c77fa1b2:   at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.doExecute(BaseAsyncClientHandler.java:220)
5352c77fa1b2:   at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.execute(BaseAsyncClientHandler.java:86)
5352c77fa1b2:   at software.amazon.awssdk.awscore.client.handler.AwsAsyncClientHandler.execute(AwsAsyncClientHandler.java:52)
5352c77fa1b2:   at software.amazon.awssdk.services.dynamodb.DefaultDynamoDbAsyncClient.query(DefaultDynamoDbAsyncClient.java:2707)
...
(our code)

Steps to Reproduce

I don't know a deterministic way.

Possible Solution

I realize that the NPE comes from AsyncResponseHandler:

    @Override
    public void onError(Throwable err) {
        streamFuture.completeExceptionally(err);
    }

where streamFuture is null, as prepare() has not been called yet.

Context

The context is we're doing an async requests to ddb, and for some requests we may cancel them via future.cancel(). Apart from that we're not doing anything special with our query requests.

Your Environment

AWS-SDK: 2.13.8 Java: 12

zoewangg commented 4 years ago

Thank you for reporting the issue! We will look into it.

adamu commented 4 years ago

I am also seeing this error when attempting to call SqsAsyncClient.receiveMessage(). In my case, the mesage is not received. The error doesn't occur in in v2.12.0.

ERROR [sof.ama.aws.cor.int.han.BaseAsyncClientHandler] (executor-thread-1) Error thrown from TransformingAsyncResponseHandler#onError, ignoring.: java.lang.NullPointerException
    at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.onError(AsyncResponseHandler.java:76)
    at software.amazon.awssdk.core.internal.http.async.CombinedResponseAsyncHttpResponseHandler.onError(CombinedResponseAsyncHttpResponseHandler.java:70)
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.lambda$doExecute$3(BaseAsyncClientHandler.java:223)
    at software.amazon.awssdk.utils.FunctionalUtils.runAndLogError(FunctionalUtils.java:40)
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.doExecute(BaseAsyncClientHandler.java:220)
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.execute(BaseAsyncClientHandler.java:86)
    at software.amazon.awssdk.awscore.client.handler.AwsAsyncClientHandler.execute(AwsAsyncClientHandler.java:52)
    at software.amazon.awssdk.services.sqs.DefaultSqsAsyncClient.receiveMessage(DefaultSqsAsyncClient.java:1149)
dazito commented 4 years ago

Any news regarding this issue? This is also happening to me on the S3 client and in my case it fails to upload the file to S3. Stack trace:

s.a.a.c.i.h.BaseAsyncClientHandler       : Error thrown from TransformingAsyncResponseHandler#onError, ignoring.

java.lang.NullPointerException: null
    at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.onError(AsyncResponseHandler.java:76) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.lambda$doExecute$3(BaseAsyncClientHandler.java:223) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.utils.FunctionalUtils.runAndLogError(FunctionalUtils.java:40) ~[utils-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.doExecute(BaseAsyncClientHandler.java:220) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.execute(BaseAsyncClientHandler.java:86) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.awscore.client.handler.AwsAsyncClientHandler.execute(AwsAsyncClientHandler.java:52) ~[aws-core-2.13.26.jar:na]
    at software.amazon.awssdk.services.s3.DefaultS3AsyncClient.putObject(DefaultS3AsyncClient.java:7762) ~[s3-2.13.26.jar:na]
    (...)

software.amazon.awssdk.core.exception.SdkClientException: null
    at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.AmazonAsyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonAsyncHttpClient.java:185) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.invoke(BaseAsyncClientHandler.java:262) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.doExecute(BaseAsyncClientHandler.java:204) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.execute(BaseAsyncClientHandler.java:86) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.awscore.client.handler.AwsAsyncClientHandler.execute(AwsAsyncClientHandler.java:52) ~[aws-core-2.13.26.jar:na]
    at software.amazon.awssdk.services.s3.DefaultS3AsyncClient.putObject(DefaultS3AsyncClient.java:7762) ~[s3-2.13.26.jar:na]
    (...)
Caused by: software.amazon.awssdk.core.exception.SdkInterruptedException: null
    at software.amazon.awssdk.core.internal.http.InterruptMonitor.checkInterrupted(InterruptMonitor.java:40) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApplyTransactionIdStage.execute(ApplyTransactionIdStage.java:43) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.stages.ApplyTransactionIdStage.execute(ApplyTransactionIdStage.java:29) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[sdk-core-2.13.26.jar:na]
    at software.amazon.awssdk.core.internal.http.AmazonAsyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonAsyncHttpClient.java:181) ~[sdk-core-2.13.26.jar:na]
    ... 12 common frames omitted

Downgrading to version 2.5.8 works fine.

pavelkuchin commented 3 years ago

Same issue here

ohshazbot commented 3 years ago

I bit this pretty hard and consistently. Root cause ended up being because I didn't setup an Xray segment for the thread the call was being made in so the call was failing with the xray interceptor in the sqs client, but I was only able to find that out by repackaging the aws sdk code with a println in AsyncResponseHandler.onError. I would say that this is pretty unusable in it's current form :'(

RealTYPICAL commented 3 years ago

This reproduced the issue consistently for me:

S3AsyncClient s3AsyncClient = S3AsyncClient.create();

Thread thread = new Thread(() -> s3AsyncClient.copyObject(CopyObjectRequest.builder().build()));

thread.interrupt();
thread.start();
thread.join();

I would expect this to throw an AbortedException/CancellationException/InterruptedException however due to the bug it's a NullPointerException.

Jamalarm commented 3 years ago

We are also seeing this error frequently using the Kinesis library during a putRecords operation. A fix would be appreciated.

elrob commented 3 years ago

also experiencing this with latest SqsAsyncClient with sdk version: 2.17.29 and Java 11

OliverGavin commented 2 years ago

For anyone else who landed here that is also using Reactor (and was confused by the logs they were seeing) or who might be incorrectly using the SDK;

I had code similar to this:

byte[] data = ...   
String key = ...
CompletableFuture<PutObjectResponse> future = client.putObject( 
       PutObjectRequest.builder().bucket(bucket).key(key).build(),  
       AsyncRequestBody.fromBytes(data) 

);
return Mono.fromFuture(future)
          .retryWhen(...)   
          .doOnSuccess(res -> {...})    
          .onErrorResume((throwable) -> {...});

The issue is that Mono.fromFuture(future) will not re-execute the already completed future (which failed). Subsequent retries will then also fail for the same reason.

PutObjectRequest request = PutObjectRequest.builder().bucket(bucket).key(key).build();  
AsyncRequestBody requestBody = AsyncRequestBody.fromBytes(data);    

return Mono.defer(() -> {   
  // Defer the creation of the future until subscription.   
  // This allows retries to create a new future on each attempt.    
  CompletableFuture<PutObjectResponse> future = client.putObject(request, requestBody); 
  return Mono.fromFuture(future);   
})
.retryWhen(...) 
.doOnSuccess(res -> {...})  
.onErrorResume((throwable) -> {...});

The above code uses Mono.defer() to delay the creation of the CompletableFuture and allows retries to create a new future on each attempt.

Now of course this does not fix the underlying issue with the SDK which is logging a misleading/unnecessary error - just thought this would be helpful for others who might also have been using the SDK wrong.

I think I have a patch for that though! I was able to stop the NullPointerException but I need to write some tests and validate my fix isn't breaking anything first.

jenshoffmann1331 commented 1 year ago

Hey AWS. This issue is open for about 3 years and we are still affected by this in one of our production systems. Can you please give us a timeline for when this will be fixed? Thank you so much!

yasminetalby commented 1 year ago

Hello @jenshoffmann1331 ,

Thank you very much for reaching out. We unfortunately do not have any current publicly available timeline. I will bring this issue back up for discussion. Are you able to work around this?

Best,

Yasmine

artembilan commented 1 year ago

Strange that this is still not fixed. I believe the problem is here in the BaseAsyncClientHandler.doExecute():

  invoke(marshalled,
                       finalizeSdkHttpRequestContext.asyncRequestBody().orElse(null),
                       inputT,
                       executionContext,
                       new AsyncAfterTransmissionInterceptorCallingResponseHandler<>(asyncResponseHandler,
                                                                                     executionContext));

That new AsyncAfterTransmissionInterceptorCallingResponseHandler() is never prepare()d.

Unfortunately we cannot override AwsAsyncClientHandler to workaround the problem:

    protected DefaultDynamoDbAsyncClient(SdkClientConfiguration clientConfiguration) {
        this.clientHandler = new AwsAsyncClientHandler(clientConfiguration);

I really would like to know the reason behind my failure instead of just plain NPE...

Thanks for understanding!

seantsb commented 10 months ago

Looks like this has been fixed now? https://github.com/aws/aws-sdk-java-v2/pull/4567/files#diff-18f6745a85126bf1b23736af8857375b8e7328956262a3d7f08e5bb5b7684303