Open xsnrg opened 1 year ago
This bug is still biting us pretty hard, so I decided to do some more digging.
What I found suggests that AWS updated the errorCode, as the above image from CloudWatch shows, but did not update aws-sdk-go or aws-sdk-go-v2 to match. In the Go SDK, the file aws/request/retryer_test.go has not been updated since 2020. In the v2 SDK, the file that defines DefaultThrottleErrorCodes is aws/retry/standard.go, and it has not been updated since 2020 either.
I will leave this ticket open and create one against the aws-sdk-go-v2 project, referencing this one.
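To make the mismatch concrete, here is a minimal, self-contained sketch of the kind of check that would be needed. It does not use either SDK: the apiError type, the throttleCodes set, and the isThrottling function are hypothetical names made up for illustration, and the set is not the SDK's actual DefaultThrottleErrorCodes list. The message match is an assumption based only on the error text seen in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// apiError is a hypothetical stand-in for an AWS API error.
// It is not a type from aws-sdk-go or aws-sdk-go-v2.
type apiError struct {
	Code    string
	Message string
}

func (e apiError) Error() string { return e.Code + ": " + e.Message }

// throttleCodes is an illustrative set only, not the SDK's real list.
var throttleCodes = map[string]struct{}{
	"ThrottlingException": {},
}

// isThrottling reports whether err looks like throttling, either by a
// known error code or, for ClientException, by the message text that
// ECS was observed to return in this issue (an assumption).
func isThrottling(err apiError) bool {
	if _, ok := throttleCodes[err.Code]; ok {
		return true
	}
	return err.Code == "ClientException" &&
		strings.Contains(strings.ToLower(err.Message), "throttling error")
}

func main() {
	old := apiError{Code: "ThrottlingException", Message: "An unknown error occurred"}
	changed := apiError{Code: "ClientException", Message: "Received throttling error when describing target group arn..."}
	other := apiError{Code: "ClientException", Message: "Invalid parameter"}

	fmt.Println(isThrottling(old), isThrottling(changed), isThrottling(other))
}
```

A check on the code alone would miss the second case, which is what appears to be happening here. Matching on message text is fragile, so a real fix would probably belong wherever the list of throttle codes is maintained.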
Terraform Core Version
1.3.6
AWS Provider Version
4.47.0
Affected Resource(s)
In our case, AWS ECS CreateService occasionally gets throttled. This was not a problem before this year, but the errorCode returned by AWS now appears to have changed from ThrottlingException to ClientException, and the errorMessage (CloudWatch terms) has changed from "An unknown error occurred" to "Received throttling error when describing target group arn...". The ClientException is not retried: when terraform aborts, the CloudWatch logs show only 1 error.
The desired outcome would be that the new error is still identified as throttling and retried.
Image attached of the different throttling errorCode as seen from CloudWatch, if I can get the issue form to accept one.
I suspect it may have something to do with the errors that changed, coming from AWS.
Also tried with terraform 1.3.7 and AWS provider 4.49.0 without any difference.
Expected Behavior
terraform identifies throttling and retries
Actual Behavior
throttling error causes abort
Relevant Error/Panic Output Snippet
Terraform Configuration Files
Will need to redact code if it is needed. The resource block is:
resource "aws_ecs_service" "service" {}
Steps to Reproduce
terraform apply with about 25 ECS services to cause throttling. Note that this does not always happen, but when it does, the run is aborted.
Debug Output
No response
Panic Output
No response
Important Factoids
No response
References
No response
Would you like to implement a fix?
None