awslabs / aws-java-nio-spi-for-s3

A Java NIO.2 service provider for Amazon S3
Apache License 2.0

java.lang.IllegalStateException: Connection pool shut down #494

Closed: sberss closed this issue 1 month ago

sberss commented 1 month ago

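I have run into a quite unusual issue: after the service has been running for 1 hour, an error is thrown from here (https://github.com/awslabs/aws-java-nio-spi-for-s3/blob/eb9039829bef5f015a5726f0f137a9e01879409a/src/main/java/software/amazon/nio/spi/s3/S3BasicFileAttributes.java#L241) with a completely useless error message: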

software.amazon.awssdk.core.exception.SdkClientException: Failed to send the request: A callback has reported failure.

This only occurred when using temporary credentials, with both AssumeRoleWithWebIdentity and AssumeRole requests. I had a bit of a dig around, eventually got a debugger into the right place, and found the following error thrown when trying to refresh the credentials: java.lang.IllegalStateException: Connection pool shut down. This troubleshooting guide (https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/troubleshooting.html#faq-connection-pool-shutdown-exception) suggests the error is thrown when there is an attempt to use a closed DefaultCredentialsProvider.

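The only place I could find that was closing the DefaultCredentialsProvider was here (https://github.com/awslabs/aws-java-nio-spi-for-s3/blob/eb9039829bef5f015a5726f0f137a9e01879409a/src/main/java/software/amazon/nio/spi/s3/S3FileSystemProvider.java#L387). I was able to fudge in a fix by pulling the client allocation out of the try-with-resources block, and this seems to work. I'm not sure this is the correct fix, as there is likely some interaction here that I don't quite understand.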

diff --git a/src/main/java/software/amazon/nio/spi/s3/S3FileSystemProvider.java b/src/main/java/software/amazon/nio/spi/s3/S3FileSystemProvider.java
index 55254d2..29ed0f4 100644
--- a/src/main/java/software/amazon/nio/spi/s3/S3FileSystemProvider.java
+++ b/src/main/java/software/amazon/nio/spi/s3/S3FileSystemProvider.java
@@ -384,7 +384,8 @@ public class S3FileSystemProvider extends FileSystemProvider {
         var timeOut = TIMEOUT_TIME_LENGTH_1;
         final var unit = MINUTES;

-        try (S3AsyncClient client = s3Directory.getFileSystem().client()) {
+        var client = s3Directory.getFileSystem().client();
+        try {
             client.putObject(
                 PutObjectRequest.builder()
                     .bucket(s3Directory.bucketName())

Sorry I don't have full stack traces; reproducing this takes an age, and I've been chasing this down for so long my brain is starting to melt! The easiest way to reproduce the issue is to set up an AWS profile that assumes a role, and then wait for an hour.
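The mechanism described in this report can be sketched without the AWS SDK. The following is a minimal simulation, not the library's code: `FakeClient` and `FakeFileSystem` are hypothetical stand-ins for the shared SDK client and the file system that hands it out, and the exception message is copied from the report purely for illustration. It only shows that try-with-resources closes whatever resource it is given, including one it merely borrowed. It does not model the one-hour delay before credentials are refreshed.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class SharedClientDemo {

    /** Hypothetical stand-in for an SDK client; NOT the real S3AsyncClient. */
    static final class FakeClient implements AutoCloseable {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        String send(String request) {
            if (closed.get()) {
                // Message borrowed from the report for illustration only.
                throw new IllegalStateException("Connection pool shut down");
            }
            return "ok: " + request;
        }

        @Override
        public void close() {
            closed.set(true);
        }
    }

    /** Hypothetical file system that always hands out the same client. */
    static final class FakeFileSystem {
        private final FakeClient client = new FakeClient();

        FakeClient client() {
            return client;
        }
    }

    /** Shape of the code before the patch: the borrowed client is closed on exit. */
    static String createDirectoryClosing(FakeFileSystem fs) {
        try (FakeClient client = fs.client()) {
            return client.send("putObject");
        }
    }

    /** Shape of the code after the patch: the borrowed client is left open. */
    static String createDirectoryBorrowing(FakeFileSystem fs) {
        FakeClient client = fs.client();
        return client.send("putObject");
    }

    public static void main(String[] args) {
        FakeFileSystem fs = new FakeFileSystem();
        System.out.println(createDirectoryBorrowing(fs)); // shared client stays usable
        System.out.println(createDirectoryClosing(fs));   // succeeds, but closes the shared client
        try {
            fs.client().send("headObject");
        } catch (IllegalStateException e) {
            System.out.println("later call failed: " + e.getMessage());
        }
    }
}
```

In this sketch the call that closes the client still succeeds. The failure only surfaces on a later, unrelated call, which is consistent with how hard the original problem was to trace.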

markjschreiber commented 1 month ago

Your solution is likely the correct one. Although SDK clients are AutoCloseable, they are expensive to create and the official documentation recommends reusing them. Moving the client outside the try-with-resources statement prevents it from being closed automatically.
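One way to picture the reuse recommendation is a single owner that creates the client once and is the only component allowed to close it. The sketch below is an assumption-laden illustration, not this project's implementation: `ExpensiveClient` and `ClientOwner` are hypothetical names, and no AWS SDK types are involved.

```java
final class ClientOwnershipDemo {

    /** Hypothetical stand-in for an SDK client that is costly to build. */
    static final class ExpensiveClient implements AutoCloseable {
        private volatile boolean open = true;

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Hypothetical owner: builds the client lazily, once, and is the only closer. */
    static final class ClientOwner implements AutoCloseable {
        private ExpensiveClient client;
        private int created;
        private boolean closed;

        /** Callers borrow the client and must not close it themselves. */
        synchronized ExpensiveClient client() {
            if (closed) {
                throw new IllegalStateException("owner is closed");
            }
            if (client == null) {
                client = new ExpensiveClient();
                created++;
            }
            return client;
        }

        synchronized int createdCount() {
            return created;
        }

        @Override
        public synchronized void close() {
            closed = true;
            if (client != null) {
                client.close();
            }
        }
    }

    public static void main(String[] args) {
        ClientOwner owner = new ClientOwner();
        ExpensiveClient a = owner.client();
        ExpensiveClient b = owner.client();
        System.out.println(a == b);               // true: same instance is reused
        System.out.println(owner.createdCount()); // 1: built only once
        owner.close();                            // the owner, not the borrower, closes
        System.out.println(a.isOpen());           // false
    }
}
```

Under this arrangement, code that borrows the client uses a plain local variable rather than try-with-resources, which matches the shape of the patch in the report.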

(Sent as an email reply to Sav's original report of Jul 11, 2024, 10:50 PM, which is quoted in full above.)