aws / aws-sdk-java-v2

The official AWS SDK for Java - Version 2
Apache License 2.0

TransferListener bytesTransferred incorrect when S3AsyncClient is built using .builder() instead of .crtBuilder() #4598

Open tombryden opened 1 year ago

tombryden commented 1 year ago

Describe the bug

Building the S3AsyncClient with S3AsyncClient.builder() causes TransferListener to report progress incorrectly. TransferListener only works properly when the client is built with .crtBuilder().

Expected Behavior

TransferListener should behave the same with S3AsyncClient.builder() as it does with S3AsyncClient.crtBuilder().

Current Behavior

When S3AsyncClient is built using the below configuration, everything functions as expected when transferring a file via UploadFileRequest:

        S3AsyncClient s3AsyncClient2 = S3AsyncClient.crtBuilder()
                .credentialsProvider(provider)
                .region(Region.EU_WEST_2)
                .build();

The request:

            UploadFileRequest uploadFileRequest =
                    UploadFileRequest.builder()
                        .putObjectRequest(b -> b.bucket(bucket).key(s3Key).contentType(file.getContentType()))
                        .addTransferListener(new CustomTransferListener())
                        .source(convFile)
                        .build();

            s3TransferManager.uploadFile(uploadFileRequest);

However, if I want to use a custom HTTP client with S3AsyncClient.builder() instead of .crtBuilder(), as shown in the documentation, it doesn't function as expected. Even with no custom .httpClient, simply using .builder() produces the same broken behaviour, as seen in the reproduction steps.

        S3AsyncClient s3AsyncClient = S3AsyncClient.builder()
                .httpClient(
                        AwsCrtAsyncHttpClient.builder()
                                .maxConcurrency(100)
                                .connectionMaxIdleTime(Duration.ofHours(1))
                                .connectionTimeout(Duration.ofHours(1))
                                .build())
                .credentialsProvider(provider)
                .region(Region.EU_WEST_2)
                .build();

Inside the TransferListener bytesTransferred method, context.progressSnapshot().ratioTransferred() jumps to 1.0 extremely quickly (within a few ms), sits at 1.0 with no further calls until the file is uploaded, and then the transfer completes. This is the same behaviour reported in these two issues (although they seem to have different causes): https://github.com/aws/aws-sdk-java-v2/issues/4114 and https://github.com/aws/aws-sdk-java-v2/issues/3670.
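To put numbers on the gap between the ratio reaching 1.0 and the upload actually completing, a custom listener could delegate to a small helper like the one below. This is only a sketch: `ProgressTimeline` and its methods are hypothetical names, not part of the SDK, and the wiring into a TransferListener is not shown. The caller is assumed to pass in a timestamp (for example from `System.currentTimeMillis()`) and the ratio obtained from context.progressSnapshot().ratioTransferred().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.OptionalLong;

// Hypothetical helper (not part of the AWS SDK): records timestamped progress
// samples and measures how long the transfer sat at a ratio of 1.0.
final class ProgressTimeline {
    private final long startMillis;
    private long firstFullMillis = -1; // elapsed ms when the ratio first reached 1.0
    private long completeMillis = -1;  // elapsed ms when completion was signalled
    private final List<String> lines = new ArrayList<>();

    ProgressTimeline(long startMillis) {
        this.startMillis = startMillis;
    }

    // Intended to be called from a listener's bytesTransferred callback.
    void onRatio(long nowMillis, double ratio) {
        long elapsed = nowMillis - startMillis;
        lines.add(String.format(Locale.ROOT, "+%d ms ratio=%.3f", elapsed, ratio));
        if (ratio >= 1.0 && firstFullMillis < 0) {
            firstFullMillis = elapsed;
        }
    }

    // Intended to be called from a listener's transferComplete callback.
    void onComplete(long nowMillis) {
        completeMillis = nowMillis - startMillis;
        lines.add(String.format(Locale.ROOT, "+%d ms complete", completeMillis));
    }

    // Time spent at 100% before completion; empty if either event was never seen.
    OptionalLong stallMillis() {
        if (firstFullMillis < 0 || completeMillis < 0) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(completeMillis - firstFullMillis);
    }

    List<String> lines() {
        return Collections.unmodifiableList(lines);
    }

    public static void main(String[] args) {
        // Made-up sample values that mimic the symptom described in this issue.
        ProgressTimeline timeline = new ProgressTimeline(0);
        timeline.onRatio(3, 1.0);
        timeline.onComplete(42_000);
        timeline.lines().forEach(System.out::println);
        System.out.println("stalled at 100% for " + timeline.stallMillis().getAsLong() + " ms");
    }
}
```

A large stallMillis() value relative to the total upload time would confirm that the ratio is running ahead of the real transfer, and the recorded lines double as timestamped logs that can be attached to the issue.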

I have tried three different configs, and all three have the same outcome, with the ratio jumping very quickly to 1.0:

  1. No custom .httpClient declared (however using S3AsyncClient.builder())
  2. The config posted above
  3. Using NettyNioAsyncHttpClient as seen below.
          S3AsyncClient s3AsyncClient = S3AsyncClient.builder()
                  .httpClient(NettyNioAsyncHttpClient.builder()
                          .maxConcurrency(500)
                          .maxPendingConnectionAcquires(10000)
                          .writeTimeout(Duration.ofHours(1))
                          .connectionMaxIdleTime(Duration.ofHours(1))
                          .connectionTimeout(Duration.ofHours(1))
                          .connectionAcquisitionTimeout(Duration.ofHours(1))
                          .connectionTimeToLive(Duration.ofHours(1))
                          .readTimeout(Duration.ofHours(1))
                          .build())
                  .credentialsProvider(provider)
                  .region(Region.EU_WEST_2)
                  .build();

Reproduction Steps

        S3AsyncClient s3AsyncClient2 = S3AsyncClient.builder()
                .credentialsProvider(provider)
                .region(Region.EU_WEST_2)
                .build();

        S3TransferManager transferManager = S3TransferManager.builder()
                .s3Client(s3AsyncClient2)
                .build();

        UploadFileRequest uploadFileRequest = UploadFileRequest.builder()
                .putObjectRequest(b -> b.bucket(bucket).key(key))
                .addTransferListener(LoggingTransferListener.create())
                .source(file)
                .build();

        transferManager.uploadFile(uploadFileRequest);

Observe the transfer listener jumping to 100% within milliseconds, even though the upload takes much longer (when a large file is used).

Possible Solution

No response

Additional Information/Context

No response

AWS Java SDK version used

2.20.161

JDK version used

17.0.5

Operating System and version

Windows 10 Pro Version 10.0.19045 Build 19045

bhoradc commented 1 year ago

Hello @tombryden,

Thanks for reporting the issue. I tried replicating it, but had no luck. Below are the scenarios and the code sample I used. Kindly review and let me know if you find anything that would help reproduce the issue.

Using Java SDK version 2.20.161 and the default LoggingTransferListener.create().

Case 1 - No custom .httpClient declared (however using S3AsyncClient.builder()), File size - 1 GB and 5 GB

    public static void main(String[] args) {

        S3AsyncClient s3AsyncClient = S3AsyncClient.builder()
                .region(Region.US_EAST_1)
                .build();

        S3TransferManager transferManager = S3TransferManager.builder()
                .s3Client(s3AsyncClient)
                .build();

        String result = uploadFile(transferManager, "redacted",
                "redacted", "/S3Dir/onegbfile.txt");

        System.out.println(result);
    }
    public static String uploadFile(S3TransferManager transferManager, String bucketName,
                                    String key, String filePath) {

        UploadFileRequest uploadFileRequest =
                UploadFileRequest.builder()
                        .putObjectRequest(b -> b.bucket(bucketName).key(key))
                        .addTransferListener(LoggingTransferListener.create())
                        .source(Paths.get(filePath))
                        .build();

        FileUpload fileUpload = transferManager.uploadFile(uploadFileRequest);
        CompletedFileUpload uploadResult = fileUpload.completionFuture().join();
        LOGGER.info(uploadResult.response().eTag());
        return uploadResult.response().eTag();
    }

Case 2 - With custom .httpClient, File size - 1 GB and 5 GB

    public static void main(String[] args) {

        S3AsyncClient s3AsyncClient = S3AsyncClient.builder()
                .httpClient(
                        AwsCrtAsyncHttpClient.builder()
                                .maxConcurrency(100)
                                .connectionMaxIdleTime(Duration.ofHours(1))
                                .connectionTimeout(Duration.ofHours(1))
                                .build())
                .region(Region.US_EAST_1)
                .build();

        S3TransferManager transferManager = S3TransferManager.builder()
                .s3Client(s3AsyncClient)
                .build();

        String result = uploadFile(transferManager, "redacted",
                "redacted", "/S3Dir/onegbfile.txt");

        System.out.println(result);
    }
    public static String uploadFile(S3TransferManager transferManager, String bucketName,
                                    String key, String filePath) {

        UploadFileRequest uploadFileRequest =
                UploadFileRequest.builder()
                        .putObjectRequest(b -> b.bucket(bucketName).key(key))
                        .addTransferListener(LoggingTransferListener.create())
                        .source(Paths.get(filePath))
                        .build();

        FileUpload fileUpload = transferManager.uploadFile(uploadFileRequest);
        CompletedFileUpload uploadResult = fileUpload.completionFuture().join();
        LOGGER.info(uploadResult.response().eTag());
        return uploadResult.response().eTag();
    }

Case 3 - With NettyNioAsyncHttpClient configuration, File size - 1 GB and 5 GB

    public static void main(String[] args) {

        S3AsyncClient s3AsyncClient = S3AsyncClient.builder()
                .httpClient(NettyNioAsyncHttpClient.builder()
                        .maxConcurrency(500)
                        .maxPendingConnectionAcquires(10000)
                        .writeTimeout(Duration.ofHours(1))
                        .connectionMaxIdleTime(Duration.ofHours(1))
                        .connectionTimeout(Duration.ofHours(1))
                        .connectionAcquisitionTimeout(Duration.ofHours(1))
                        .connectionTimeToLive(Duration.ofHours(1))
                        .readTimeout(Duration.ofHours(1)).build())
                .region(Region.US_EAST_1)
                .build();

        S3TransferManager transferManager = S3TransferManager.builder()
                .s3Client(s3AsyncClient)
                .build();

        String result = uploadFile(transferManager, "redacted",
                "redacted", "/S3Dir/onegbfile.txt");

        System.out.println(result);
    }
    public static String uploadFile(S3TransferManager transferManager, String bucketName,
                                    String key, String filePath) {

        UploadFileRequest uploadFileRequest =
                UploadFileRequest.builder()
                        .putObjectRequest(b -> b.bucket(bucketName).key(key))
                        .addTransferListener(LoggingTransferListener.create())
                        .source(Paths.get(filePath))
                        .build();

        FileUpload fileUpload = transferManager.uploadFile(uploadFileRequest);
        CompletedFileUpload uploadResult = fileUpload.completionFuture().join();
        LOGGER.info(uploadResult.response().eTag());
        return uploadResult.response().eTag();
    }

Kindly share the details of your custom TransferListener if you see an issue specific to it. It would also help if you could share the logs with timestamps.

Regards, Chaitanya

github-actions[bot] commented 12 months ago

It looks like this issue has not been active for more than five days. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please add a comment to prevent automatic closure, or if the issue is already closed please feel free to reopen it.

tombryden commented 12 months ago

Looking into this now - will update shortly.

tombryden commented 12 months ago

When running a fresh Spring Boot application using .builder(), I noticed the following message on startup (I didn't notice it before), which potentially explains the issue with .builder():

The provided DefaultS3AsyncClient is not an instance of S3CrtAsyncClient, and thus multipart upload/download feature is not enabled and resumable file upload is not supported. To benefit from maximum throughput, consider using S3AsyncClient.crtBuilder().build() instead.

When I run the scenarios you listed above on my instance using Spring Boot, the behaviour I was experiencing before still occurs, and sometimes the upload errors. Happy to upload the example project to git if that helps.

If it is the case that we must use .crtBuilder(), how can I customise timeouts and other options? I am noticing lots of timeouts when uploading large files at the same time over a slow connection: software.amazon.awssdk.services.s3.model.S3Exception: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.

It is worth noting that even when using .crtBuilder(), if you use an upload request with a byte array, the percentage jumps straight to 100:

    private String uploadMultipartFileBytes(String bucketName, String key, MultipartFile mpFile) throws IOException {

        UploadRequest uploadRequest = UploadRequest.builder()
                .putObjectRequest(b -> b.bucket(bucketName).key(key))
                .addTransferListener(LoggingTransferListener.create())
                .requestBody(AsyncRequestBody.fromBytes(mpFile.getBytes()))
                .build();

        s3TransferManager.upload(uploadRequest);
        return "uploading";
    }
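One possible reading of the byte-array case, which nothing in this thread confirms, is that progress is counted when bytes are handed over to the client rather than when they reach S3. Under that assumption, a payload that is already fully in memory could be handed over in a single step, so the ratio would go straight to 1.0. The toy model below only illustrates that arithmetic; `HandOffProgressDemo` and `ratiosAfterEachHandOff` are hypothetical names and the code does not call the SDK.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (assumption, not SDK code): progress is updated each time a chunk
// is handed off, regardless of when the bytes actually leave the machine.
final class HandOffProgressDemo {

    // Returns the ratio that would be reported after each chunk is handed off.
    static List<Double> ratiosAfterEachHandOff(long totalBytes, long chunkSize) {
        if (totalBytes <= 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalBytes and chunkSize must be positive");
        }
        List<Double> ratios = new ArrayList<>();
        long handedOff = 0;
        while (handedOff < totalBytes) {
            handedOff += Math.min(chunkSize, totalBytes - handedOff);
            ratios.add((double) handedOff / totalBytes);
        }
        return ratios;
    }

    public static void main(String[] args) {
        long total = 64L * 1024;
        // Whole payload handed off at once, like an in-memory byte array.
        System.out.println(ratiosAfterEachHandOff(total, total));     // prints [1.0]
        // Same payload handed off in four 16 KiB chunks.
        System.out.println(ratiosAfterEachHandOff(total, 16L * 1024)); // prints [0.25, 0.5, 0.75, 1.0]
    }
}
```

If this reading is right, the listener has no intermediate values to report for a single in-memory buffer, whichever builder is used.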

Many thanks.