awslabs / amazon-ecr-credential-helper

Automatically gets credentials for Amazon ECR on docker push/docker pull
Apache License 2.0
ecr: Failed to get authorization token: not found, ResolveEndpointV2 #680

Open ashi009 opened 10 months ago

ashi009 commented 10 months ago

We encountered a weird error that seems to come from the generated code.

time="2023-11-21T00:22:55Z" level=debug msg="Retrieving credentials" region=us-west-2 registry=NNNN serverURL=NNNN.dkr.ecr.us-west-2.amazonaws.com service=ecr
time="2023-11-21T00:22:55Z" level=debug msg="Checking file cache" registry=NNNN
time="2023-11-21T00:22:55Z" level=debug msg="Cached token is no longer valid" expiresAt="2023-11-17 07:24:04.14 +0000 UTC" requestedAt="2023-11-16 19:24:04.151845422 +0000 UTC"
time="2023-11-21T00:22:55Z" level=debug msg="Calling ECR.GetAuthorizationToken" registry=NNNN
time="2023-11-21T00:22:55Z" level=info msg="Got error fetching authorization token. Falling back to cached token." error="ecr: Failed to get authorization token: not found, ResolveEndpointV2"

We did some initial analysis on this.

  1. Our version is 0.70, Git commit: cd92a7a

  2. The "Failed to get authorization token:" prefix leads us to https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/api/client.go#L229-L229

  3. Given the error message, the err clearly comes from c.ecrClient.GetAuthorizationToken: https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/vendor/github.com/aws/aws-sdk-go-v2/service/ecr/api_op_GetAuthorizationToken.go#L23-L36

  4. Following the call chain, it goes through c.invokeOperation, https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/vendor/github.com/aws/aws-sdk-go-v2/service/ecr/api_client.go#L71-L106, which calls c.addOperationGetAuthorizationTokenMiddlewares: https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/vendor/github.com/aws/aws-sdk-go-v2/service/ecr/api_op_GetAuthorizationToken.go#L65-L139

  5. That function has many return points that return the error unwrapped, so we need to figure out which branch produces "not found, ResolveEndpointV2". It turns out that "not found, %v" comes from https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/vendor/github.com/aws/smithy-go/middleware/ordered_group.go#L178-L230 and 2 other functions in the same file. Searching for ResolveEndpointV2 gives us 3 results:

    1. https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/vendor/github.com/aws/aws-sdk-go-v2/service/ecr/api_client.go#L132-L146
    2. https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/vendor/github.com/aws/aws-sdk-go-v2/aws/signer/v4/middleware.go#L104-L106
    3. https://github.com/awslabs/amazon-ecr-credential-helper/blob/cd92a7ab13759e6f4bf2170f0818ae45e93e8fd2/ecr-login/vendor/github.com/aws/aws-sdk-go-v2/service/ecr/api_client.go#L451-L455

This suggests that the previously added ResolveEndpointV2 middleware is missing after a few invocations. Could you please take a look?
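To illustrate the failure mode described above, here is a minimal, self-contained sketch of how inserting a step relative to an ID that was never registered yields an error of the form "not found, ResolveEndpointV2". This is a toy model and not the smithy-go implementation: the `orderedIDs` type, the `buildStack` helper, and the `ComputePayloadHash` and `Signing` labels are hypothetical names chosen for illustration; only the "not found, %v" message format and the `ResolveEndpointV2` ID come from the analysis above.

```go
package main

import "fmt"

// orderedIDs is a toy stand-in for an ordered middleware group.
// It only tracks IDs, not the middleware themselves (assumption for brevity).
type orderedIDs struct {
	ids []string
}

// Add appends an ID to the end of the group.
func (o *orderedIDs) Add(id string) {
	o.ids = append(o.ids, id)
}

// InsertAfter places id directly after relativeTo. If relativeTo was never
// registered, it fails with the same message shape seen in the log.
func (o *orderedIDs) InsertAfter(id, relativeTo string) error {
	for i, existing := range o.ids {
		if existing == relativeTo {
			// Build the tail in a fresh slice so the original is not clobbered.
			rest := append([]string{id}, o.ids[i+1:]...)
			o.ids = append(o.ids[:i+1], rest...)
			return nil
		}
	}
	return fmt.Errorf("not found, %v", relativeTo)
}

// buildStack is a hypothetical helper: it optionally registers the resolver
// step, then inserts another step relative to it.
func buildStack(registerResolver bool) ([]string, error) {
	stack := &orderedIDs{}
	if registerResolver {
		stack.Add("ResolveEndpointV2")
	}
	stack.Add("Signing") // hypothetical label
	if err := stack.InsertAfter("ComputePayloadHash", "ResolveEndpointV2"); err != nil {
		return nil, err
	}
	return stack.ids, nil
}

func main() {
	ids, err := buildStack(true)
	fmt.Println(ids, err)

	ids, err = buildStack(false)
	fmt.Println(ids, err)
}
```

In this model the second call fails only because the anchor ID is absent when the relative insert runs, which matches the symptom in the log: some code expects ResolveEndpointV2 to be in the stack, but it is not there at that point.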

seternate commented 5 months ago

I ran into this problem recently using Tekton. It turned out that the credential-helper uses an older version of the aws-sdk, and v1.23.0 introduced a breaking change. If you now have any dependency that uses a version >v1.23.0, you will most likely run into this. See the aws-sdk issue https://github.com/aws/aws-sdk-go-v2/issues/2370 and the Tekton issue https://github.com/tektoncd/pipeline/issues/7698.

I hope this helps you identify your problem a bit better.