aws / aws-sdk-go-v2

AWS SDK for the Go programming language.
https://aws.github.io/aws-sdk-go-v2/docs/
Apache License 2.0
2.65k stars 638 forks source link

unwanted context timeout added to imds requests #2571

Closed ndbaker1 closed 7 months ago

ndbaker1 commented 7 months ago

Acknowledgements

Describe the bug

the middleware that is added to all imds api requests inserts a timeout on the provided context that is only 5 seconds long, which invalidates retries configured for longer.

previously brought up in https://github.com/aws/aws-sdk-go-v2/issues/1247

Expected Behavior

after creating an imds.Client with a high number of retries I expect that when passing a context.TODO() into an imds api like GetUserData, all retries will be respected before returning an error.

Current Behavior

with the setup described above, the request errors with

operation error ec2imds: GetUserData, canceled, context deadline exceeded

Reproduction Steps

  1. follow repo in: https://github.com/aws/aws-sdk-go-v2/issues/1247
  2. create an imds client with a retry greater than 5 seconds, and mock out the environment AWS_EC2_METADATA_SERVICE_ENDPOINT to get the api to fail

i := imds.New(imds.Options{
  Retryer: retry.NewStandard(func(so *retry.StandardOptions) {
    so.MaxAttempts = 15
    so.MaxBackoff = 1 * time.Second
  }),
})
resp, err := i.GetUserData(ctx, &imds.GetUserDataInput{})
if err != nil {
  fmt.Fatal(err)
}

Possible Solution

do not enforce this timeout when adding middleware for these api calls

Additional Information/Context

began discussion in PR https://github.com/awslabs/amazon-eks-ami/pull/1732 for the purpose of initiating imds api calls before networking or imds may be available yet.

AWS Go SDK V2 Module Versions Used

1.24.1

Compiler and Version used

go version go1.22.1 linux/amd64

Operating System and version

AL2023

lucix-aws commented 7 months ago

I can't fathom why we would have done this to begin with. It looks like there was a previous patch applied here to attempt to respect explicit context timeouts, but there's no way to just NOT have a timeout without removing it from the middleware stack.

It's too risky behaviorally to just rip it out, but I'm going to add a config switch to the client itself that lets you shut it off entirely.

github-actions[bot] commented 7 months ago

This issue is now closed. Comments on closed issues are hard for our team to see. If you need more assistance, please open a new issue that references this one.