[X] I acknowledge the upcoming end-of-support for AWS SDK for Java v1 was announced, and migration to AWS SDK for Java v2 is recommended.
Describe the bug
When a call to S3Client.headObject() fails with a 503 Slow Down error, I observe that for the resulting exception, S3Exception.isThrottlingException() returns false. For 503 failures with other APIs such as .getObject(), it returns true.
The isThrottlingException() method is used as part of the retry strategy: when set to LEGACY mode, throttling exceptions do not consume from the token bucket. The impact of this bug is that, even when setting a high number of retries to persistently retry throttled exceptions (with appropriate backoff settings, of course), I still see frequent failures after only a few retries due to token bucket exhaustion.
In my particular use case, I'm running an Apache Iceberg workload that issues a large number of headObject() requests, and the job is failing due to retry exhaustion despite a high maximum number of retries. I imagine other big data workloads that use this API extensively could see the same behavior.
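To illustrate the token bucket effect described above, here is a toy model. It is not the SDK's implementation: the class name, the bucket capacity, and the per-retry costs are assumptions chosen for illustration, and it leaves out any refilling of the bucket on successful requests. It only shows how a throttling error that is misclassified as non-throttling drains a bucket that would otherwise stay full.

```java
/**
 * Toy model of a retry token bucket. NOT the SDK's implementation;
 * the capacity and costs below are illustrative assumptions.
 */
class RetryTokenBucketSketch {
    static final int CAPACITY = 500;           // assumed bucket size
    static final int NON_THROTTLING_COST = 5;  // assumed cost of retrying a non-throttling error
    static final int THROTTLING_COST = 0;      // throttling retries do not consume tokens (LEGACY mode, per this issue)

    private int tokens = CAPACITY;

    /** Returns true if a retry is allowed, deducting its cost from the bucket. */
    boolean tryAcquire(boolean classifiedAsThrottling) {
        int cost = classifiedAsThrottling ? THROTTLING_COST : NON_THROTTLING_COST;
        if (tokens < cost) {
            return false; // bucket exhausted: the retry is refused
        }
        tokens -= cost;
        return true;
    }

    /** Counts how many of the requested retries a fresh bucket grants. */
    static int retriesAllowed(int attempts, boolean classifiedAsThrottling) {
        RetryTokenBucketSketch bucket = new RetryTokenBucketSketch();
        int allowed = 0;
        for (int i = 0; i < attempts; i++) {
            if (bucket.tryAcquire(classifiedAsThrottling)) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        // Correctly classified throttling errors: every retry is granted.
        System.out.println(retriesAllowed(1000, true));
        // Misclassified as non-throttling: the bucket runs dry after CAPACITY / cost retries.
        System.out.println(retriesAllowed(1000, false));
    }
}
```

Because the bucket in this model is shared by every request, a workload that issues many concurrent headObject() calls can exhaust it quickly, which would match seeing failures after only a few retries per request.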
Expected Behavior
When a call to S3Client.headObject() fails with a 503 Slow Down error, I expect that S3Exception.isThrottlingException() returns true.
Current Behavior
When a call to S3Client.headObject() fails with a 503 Slow Down error, I observe that S3Exception.isThrottlingException() returns false.
Reproduction Steps
I have only reproduced this in my at-scale application, which makes a large number of requests to S3.
After enabling wire logging, I observe that S3's raw response is as follows. I imagine that the issue can be reproduced by mocking this response. Note that there is no XML body provided.
Here's what logging the exception looks like.
Code
LOG.error(
"Got service exception. Is throttling? {}. Error details: {}.",
e.isThrottlingException(),
e.awsErrorDetails(),
e);
Log output
Got service exception. Is throttling? false. Error details: AwsErrorDetails(serviceName=S3).
software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 503, Request ID: DNJ9ZP64AW5HP5ZT, Extended Request ID: uBunIlZ0ytEiYNOyt7KND7OOpngDTjsSrYKveakQTyxO80MX0sHVOxLuu6jnbBSQlUq53yUkCiKPuknvXvOAW4ewXKhquD4P)
Here, notice that:
isThrottlingException() returned false.
AwsErrorDetails.errorCode wasn't printed, so it must be null.
The first field of the exception text is null, which is another field that's pulled from the error XML.
Possible Solution
For HEAD requests, S3 does not provide an error XML. I speculate that this is the problem. From a cursory reading of the code, it appears that the data source for AwsServiceException.isThrottlingException() is awsErrorDetails.errorCode(), and this field is derived from the error XML in AwsXmlErrorUnmarshaller.unmarshall(). To solve this problem, the implementation of AwsServiceException.isThrottlingException() would need to look at the HTTP status code when the error code was not provided.
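A minimal sketch of the fallback proposed above, written as standalone Java rather than as a patch to the SDK. The class name, the method signature, and the list of error codes are all hypothetical and only illustrate the idea: use the error code when one is present, and fall back to the HTTP status code when it is missing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical classifier; not part of the AWS SDK. */
class ThrottlingClassifierSketch {
    // Illustrative subset of throttling error codes (an assumption, not the SDK's list).
    private static final Set<String> THROTTLING_ERROR_CODES = new HashSet<>(
            Arrays.asList("SlowDown", "Throttling", "ThrottlingException", "RequestLimitExceeded"));

    /**
     * @param errorCode  error code parsed from the error XML, or null when no body was returned
     * @param statusCode HTTP status code of the response
     */
    static boolean isThrottling(String errorCode, int statusCode) {
        if (errorCode != null) {
            // An error code is available, so let it decide.
            return THROTTLING_ERROR_CODES.contains(errorCode);
        }
        // No error code (e.g. a HEAD response without an XML body):
        // fall back to the HTTP status code.
        return statusCode == 503 || statusCode == 429;
    }

    public static void main(String[] args) {
        System.out.println(isThrottling(null, 503));        // HEAD-style failure with no body
        System.out.println(isThrottling("SlowDown", 503));  // failure with an error XML
        System.out.println(isThrottling("NoSuchKey", 404)); // unrelated error
    }
}
```

One trade-off of this sketch is that any 503 without an error code is treated as throttling, even though a 503 can also indicate a general service-side failure. Whether that is acceptable would be a decision for the SDK maintainers.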
Additional Information/Context
No response
AWS Java SDK version used
2.22.12
JDK version used
8
Operating System and version
EMR Serverless