Closed dannyrandall closed 1 year ago
Hi @dannyrandall,
The C2J API model you linked indeed specifies all the data the SDK's code generator needs for creating conditions and values for each waiter.
Since all of the AWS SDKs code is auto-generated from these API model files, the SDK team can't make direct changes to them. This data is defined upstream by the respective AWS services themselves (in this case, CloudFormation) and not on the SDK level.
Regarding terminating the waiter on a 403 error, there are some counterarguments to consider. Although 403 errors are generally not retryable, in distributed systems there can be a valid use case to force retries on 403s. Mainly surrounding propagation delays during IAM role creation or credential acquisition. These delays can lead to transient conditions where the system might actually benefit from retrying a 403 error.
For example, in an EKS environment, a pod might be in the process of scaling down or restarting. During this transient state, the SDK may not have the correct data to make a successful request, resulting in a 403. In such cases, automatically terminating the waiter on a 403 error could be counterproductive, as the error may resolve itself once the pod is back to a working state.
FWIW, I think you are making a reasonable argument, however this should be raised with the Cloudformation service team rather than the SDK team. You can use your AWS console to file a support ticket and raise this issue internally with them.
In the meantime your best option would be to handwrite your own waiter (you can use the generated one as a template) and add the logic to abort on 403s.
Thanks again, Ran~
Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.
@RanVaknin thanks for the response! Makes sense that since this logic is shared across all the SDKs, the issue belongs with the CloudFormation team. I'll open a ticket from the AWS Console with them!😊
Describe the bug
When using the
cloudformation.WaitUntilChangeSetCreateCompleteContext
function with an IAM role with an explicitDENY
forcloudformation:DescribeChangeSet
, I expected that the function would return early (instead of waiting untilMaxAttempts
) with an error like:However, an
AccessDenied
error doesn't match any of the definedAcceptors
for that API. Because of this, the function loops up to the definedMaxAttempts
, withDelay
between each call before the error is finally returned.Usage example here: https://github.com/aws/copilot-cli/blob/dc6d9535b9fb1cd5cc87975f158b980b85fdd88f/internal/pkg/aws/cloudformation/changeset.go#L119-L124
Expected Behavior
I expected that an
AccessDenied
error would be returned to the caller immediately upon receiving that response.Current Behavior
The
request.Waiter
makes the request continuously for an hour (by default), constantly getting the same error.Reproduction Steps
Using credentials for an IAM role with the policy:
Code:
Possible Solution
There should be an acceptor to catch 403 responses defined here https://github.com/aws/aws-sdk-go/blob/eee2b1105ebe299ad8f002c227c61b4c1fb9aa57/models/apis/cloudformation/2010-05-15/waiters-2.json#L259-L277.
For example, testing with this worked for me:
Additional Information/Context
After glancing at other
WaitUntil*
functions, I see this extends to other functions as well.SDK version used
v1
Environment details (Version of Go (
go version
)? OS name and version, etc.)go version go1.21.1 darwin/amd64