aws / aws-sdk-java-v2

The official AWS SDK for Java - Version 2
Apache License 2.0

Netty implementation wraps InterruptedException in RuntimeException, causing issues in code that uses interrupts for cleanup #5694

Closed iconara closed 2 weeks ago

iconara commented 2 weeks ago

Describe the bug

We are debugging an issue where code that uses Guava's SimpleTimeLimiter doesn't work properly when a process times out and, as part of its cleanup, closes an AWS SDK client that uses Netty as the HTTP implementation.

The reason seems to be that the AwaitCloseChannelPoolMap#close implementation catches InterruptedException and wraps it in RuntimeException, hiding it from the calling code. This happens here: https://github.com/aws/aws-sdk-java-v2/blob/88abec27e7d5d35b21545c7e05875a7cc3d0f46e/http-clients/netty-nio-client/src/main/java/software/amazon/awssdk/http/nio/netty/internal/AwaitCloseChannelPoolMap.java#L174-L175

Expected Behavior

When a thread is interrupted, the code should either re-throw the InterruptedException or mark the thread as interrupted with Thread.currentThread().interrupt(), but not both mark the thread as interrupted and throw an exception. Throwing a generic exception means that calling code that expects either to find the thread interrupted or to handle InterruptedException for its cleanup instead sees an exception it doesn't understand.

Current Behavior

AwaitCloseChannelPoolMap#close marks the thread as interrupted and throws a generic RuntimeException.
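To illustrate what the caller ends up seeing, here is a minimal sketch of the pattern described above. It is not the SDK's code: `InterruptWrappingDemo`, `closeWrapping` and `observe` are hypothetical names, and a never-completing CompletableFuture stands in for the pending channel pool close.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

class InterruptWrappingDemo {

    // Hypothetical stand-in for the close() pattern described in this issue,
    // not the SDK's actual implementation.
    static void closeWrapping(CompletableFuture<Void> pending) {
        try {
            pending.get(); // throws InterruptedException if the thread is interrupted
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restores the interrupt flag...
            throw new RuntimeException(e);      // ...and also throws a generic exception
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    // Describes what a caller observes when it is interrupted while closing.
    static String observe() {
        CompletableFuture<Void> neverCompletes = new CompletableFuture<>();
        Thread.currentThread().interrupt(); // simulate a timeout library interrupting the worker
        try {
            closeWrapping(neverCompletes);
            return "no exception";
        } catch (RuntimeException e) {
            // Thread.interrupted() reads the flag and clears it again.
            boolean flagSet = Thread.interrupted();
            return e.getCause().getClass().getSimpleName() + ", interrupted=" + flagSet;
        }
    }

    public static void main(String[] args) {
        System.out.println(observe()); // prints: InterruptedException, interrupted=true
    }
}
```

The caller gets a RuntimeException whose cause is the InterruptedException, so a `catch (InterruptedException e)` block around the close call would never run.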

Reproduction Steps

I'm not able to reproduce it in isolation, but here is a stack trace showing how an InterruptedException from CompletableFuture#get is wrapped as a RuntimeException by AwaitCloseChannelPoolMap#close:

java.lang.RuntimeException: java.lang.InterruptedException
    at software.amazon.awssdk.http.nio.netty.internal.AwaitCloseChannelPoolMap.close(AwaitCloseChannelPoolMap.java:175)
    at software.amazon.awssdk.http.nio.netty.internal.utils.NettyUtils.runAndLogError(NettyUtils.java:386)
    at software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient.close(NettyNioAsyncHttpClient.java:198)
    at software.amazon.awssdk.utils.IoUtils.closeQuietly(IoUtils.java:70)
    at software.amazon.awssdk.utils.IoUtils.closeIfCloseable(IoUtils.java:87)
    at software.amazon.awssdk.utils.AttributeMap.closeIfPossible(AttributeMap.java:678)
    at software.amazon.awssdk.utils.AttributeMap.access$1600(AttributeMap.java:49)
    at software.amazon.awssdk.utils.AttributeMap$DerivedValue.close(AttributeMap.java:632)
    at java.util.HashMap$Values.forEach(HashMap.java:1065)
    at software.amazon.awssdk.utils.AttributeMap.close(AttributeMap.java:107)
    at software.amazon.awssdk.core.client.config.SdkClientConfiguration.close(SdkClientConfiguration.java:118)
    at software.amazon.awssdk.core.internal.http.HttpClientDependencies.close(HttpClientDependencies.java:82)
    at software.amazon.awssdk.core.internal.http.AmazonAsyncHttpClient.close(AmazonAsyncHttpClient.java:75)
    at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.close(BaseAsyncClientHandler.java:254)
    at software.amazon.awssdk.services.athena.DefaultAthenaAsyncClient.close(DefaultAthenaAsyncClient.java:4906)
    (redacted)
    at java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.InterruptedException
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:386)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
    at software.amazon.awssdk.http.nio.netty.internal.AwaitCloseChannelPoolMap.close(AwaitCloseChannelPoolMap.java:172)

Possible Solution

Either call Thread.currentThread().interrupt() and don't throw an exception, or re-throw the InterruptedException (this would, however, require adding InterruptedException to the method signature).
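The two options could look roughly like this. This is only a sketch under the same assumptions as the description: the class and method names are hypothetical, and a plain CompletableFuture stands in for whatever close actually waits on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

class InterruptHandlingOptions {

    // Option 1: restore the interrupt flag and return without throwing.
    static void closeRestoringFlag(CompletableFuture<Void> pending) {
        try {
            pending.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // caller can check the flag
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    // Option 2: let InterruptedException propagate; this changes the method signature.
    static void closePropagating(CompletableFuture<Void> pending) throws InterruptedException {
        try {
            pending.get();
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    // True if option 1 returns normally with the interrupt flag still set.
    static boolean option1LeavesFlagSet() {
        Thread.currentThread().interrupt(); // simulate an interrupt arriving before close
        closeRestoringFlag(new CompletableFuture<>());
        return Thread.interrupted(); // reads and clears the flag
    }

    // True if option 2 surfaces the InterruptedException to the caller.
    static boolean option2Throws() {
        Thread.currentThread().interrupt();
        try {
            closePropagating(new CompletableFuture<>());
            return false;
        } catch (InterruptedException e) {
            return true;
        } finally {
            Thread.interrupted(); // make sure the demo leaves the flag cleared
        }
    }

    public static void main(String[] args) {
        System.out.println("option 1 leaves flag set: " + option1LeavesFlagSet());
        System.out.println("option 2 throws InterruptedException: " + option2Throws());
    }
}
```

With option 1 the caller has to check the interrupt flag itself; with option 2 the compiler forces every caller of close to handle or declare InterruptedException.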

Additional Information/Context

No response

AWS Java SDK version used

2.27.21

JDK version used

Multiple

Operating System and version

Multiple

iconara commented 2 weeks ago

I'm no longer sure whether this is an issue with the AWS SDK or with the timeout library that we use. I'll close this until I've figured it out.
