argoproj-labs / argocd-image-updater

Automatic container image update for Argo CD
https://argocd-image-updater.readthedocs.io/en/stable/
Apache License 2.0

Not able to connect to ECR registry (no basic auth credentials), but credentials param is set in registries.conf #422

Open Dwisf opened 2 years ago

Dwisf commented 2 years ago

Describe the bug: I have Image Updater running as a Deployment in the argocd Kubernetes namespace. I am having trouble getting it to connect to AWS ECR in run mode. Test mode looks OK.

I went into the image updater pod and used its CLI (argocd-image-updater) to produce the following.

I get errors like these (also when running the updater in run mode):

argocd-image-updater run
INFO[0000] argocd-image-updater v0.12.0+aee153d starting [loglevel:DEBUG, interval:2m0s, healthport:8080]
WARN[0000] commit message template at /app/config/commit.template does not exist, using default
DEBU[0000] Successfully parsed commit message template
DEBU[0000] rate limiting is disabled                     prefix= registry="https://xxx.dkr.ecr.us-west-2.amazonaws.com"
DEBU[0000] Setting default registry endpoint to
DEBU[0000] Previous default registry was docker.io
INFO[0000] Loaded 1 registry configurations from /app/config/registries.conf
DEBU[0000] Creating in-cluster Kubernetes client
INFO[0000] ArgoCD configuration: [apiKind=kubernetes, server=argocd-server.argocd, auth_token=false, insecure=false, grpc_web=false, plaintext=false]
INFO[0000] Starting health probe server TCP port=8080
INFO[0000] Starting metrics server on TCP port=8081
INFO[0000] Warming up image cache
DEBU[0000] Processing application platformdev-platform
...
...
...
DEBU[0001] Using version constraint 'latest' when looking for a new tag  alias=rover-upload-media-to-amp-kstream application=platformdev-platform image_name=rover-upload-media-to-amp-kstream image_tag=latest registry=xxx.dkr.ecr.us-west-2.amazonaws.com
ERRO[0001] Could not get tags from registry: Get "https://xxx.dkr.ecr.us-west-2.amazonaws.com/v2/rover-upload-media-to-amp-kstream/tags/list": no basic auth credentials  alias=rover-upload-media-to-amp-kstream application=platformdev-platform image_name=rover-upload-media-to-amp-kstream image_tag=latest registry=xxx.dkr.ecr.us-west-2.amazonaws.com

It looks like in this case, the credentials setting in registries.conf is not being used.

However, if I run this in test mode, it does seem to use the credentials setting in registries.conf, which runs the script that retrieves the AWS token. For example:

> argocd-image-updater test rover-rest-service --registries-conf-path ~/config/registries.conf
DEBU[0000] Creating in-cluster Kubernetes client
INFO[0000] retrieving information about image            image_alias= image_name=rover-rest-service registry_url=
DEBU[0000] rate limiting is disabled                     prefix= registry="https://xxx.dkr.ecr.us-west-2.amazonaws.com"
DEBU[0000] Setting default registry endpoint to
DEBU[0000] Previous default registry was docker.io
INFO[0000] Loaded 1 registry configurations from /app/config/registries.conf
INFO[0000] /scripts/ecr-login.sh                         dir= execID=acf38
INFO[0002] Fetching available tags and metadata from registry  application=test image_alias= image_name=rover-rest-service registry_url=
INFO[0002] Found 32 tags in registry                     application=test image_alias= image_name=rover-rest-service registry_url=
DEBU[0002] could not parse input tag latest as semver: Invalid Semantic Version
DEBU[0002] found 31 from 31 tags eligible for consideration  image=rover-rest-service
INFO[0002] latest image according to constraint is rover-rest-service:2.85.0  application=test image_alias= image_name=rover-rest-service registry_url=

I have the following setup:

/app/config/registries.conf

- name: ECR
  api_url: https://xxx.dkr.ecr.us-west-2.amazonaws.com
  prefix: ""
  ping: yes
  default: true
  insecure: no
  credentials: ext:/scripts/ecr-login.sh
  credsexpire: 11h

/scripts/ecr-login.sh:

#!/bin/sh
aws ecr --region "us-west-2" get-authorization-token --output text --query 'authorizationData[].authorizationToken' | base64 -d

I am out of options for resolving this issue. Is there anything I am still missing?

My setup is similar to the one described in a later comment on this issue: https://github.com/argoproj-labs/argocd-image-updater/issues/112

Expected behavior The image updater should be able to authenticate with AWS ECR when in run mode.

Version argocd-image-updater: v0.12.0+aee153d BuildDate: 2022-03-14T12:45:27Z GitCommit: aee153dabeb8b592e4d091c933ae4f77181db653 GoVersion: go1.17.8 GoCompiler: gc Platform: linux/amd64

hown3d commented 2 years ago

Is your service account that runs the image updater pod configured with an IAM role in AWS? Image Updater needs to have a valid web session token (via IRSA) which is then able to execute the AWS ECR API Call GetAuthorizationToken.

Since the output of the script will probably be the error, which isn't in basic auth format, you're running into the log line above.

Also from reading your provided logs, the first one doesn't log that it executes the script.

Edit: Looked into the src and if I'm correct, if the return of the script is not in basic auth format, the following error should be returned: https://github.com/argoproj-labs/argocd-image-updater/blob/f12a5ab6d3c69299ccd02473bdebdebc24131cb4/pkg/image/credentials.go#L155
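
A minimal sketch of how the script output could be sanity-checked by hand, assuming the expected format is a single `username:password` string with no spaces (a decoded ECR token has this shape). The helper name `looks_like_basic_auth` is hypothetical and not part of argocd-image-updater; an AWS CLI error message would fail the check because it contains spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of argocd-image-updater): succeeds only if
# the argument looks like "username:password" with no spaces in it.
looks_like_basic_auth() {
  case "$1" in
    *" "*) return 1 ;;   # error messages from the AWS CLI contain spaces
    ?*:?*) return 0 ;;   # non-empty user, a colon, non-empty password
    *)     return 1 ;;
  esac
}

# Usage inside the pod (script path taken from the original post):
#   out=$(/scripts/ecr-login.sh 2>&1)
#   if looks_like_basic_auth "$out"; then echo "format ok"; else echo "unexpected output"; fi
```

The usage lines are left as comments because they need AWS credentials to run; they deliberately avoid printing the token itself.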

Dwisf commented 2 years ago

@hown3d - thanks for your response to the issue.

Is your service account that runs the image updater pod configured with an IAM role in AWS?

Yes I think so.

Also from reading your provided logs, the first one doesn't log that it executes the script.

Yeah, that's what I'm wondering. I went into the pod and ran both modes. Test mode seems to authenticate fine with ECR:

argocd-image-updater test rover-rest-service --registries-conf-path ~/config/registries.conf

The run mode:

argocd-image-updater run --registries-conf-path ~/config/registries.conf

does not seem to work, which is the issue here. It may be that it's not using the specified script. If so, why would the script not be used in run mode?

The fact that it can authenticate with ECR in test mode from within the pod leads me to think that the IAM role is fine. I wonder if run authenticates to the registry differently than test does? I am trying to see if I missed something.

Dwisf commented 2 years ago

I experimented a bit more. Instead of setting the pull secret at the registry level, I set it at the image level by adding this annotation to the Application manifest that runs the apps:

argocd-image-updater.argoproj.io/rover-rest-service.pull-secret: ext:/scripts/ecr-login.sh

And this time I can see that the run mode is able to authenticate:

...
time="2022-04-22T18:17:45Z" level=debug msg="Considering this image for update" alias=rover-rest-service application=platformdev-platform image_name=rover-rest-service image_tag=latest registry=xxx.dkr.ecr.us-west-2.amazonaws.com
time="2022-04-22T18:17:45Z" level=debug msg="Using version constraint 'latest' when looking for a new tag" alias=rover-rest-service application=platformdev-platform image_name=rover-rest-service image_tag=latest registry=xxx.dkr.ecr.us-west-2.amazonaws.com
time="2022-04-22T18:17:45Z" level=info msg=/scripts/ecr-login.sh dir= execID=b6d7d
time="2022-04-22T18:17:47Z" level=debug msg="found 1 from 1 tags eligible for consideration" image="xxx.dkr.ecr.us-west-2.amazonaws.com/rover-rest-service:latest"
time="2022-04-22T18:17:47Z" level=info msg="Setting new image to xxx.dkr.ecr.us-west-2.amazonaws.com/rover-rest-service@sha256:eaba46c42ea029b2b291aefe9f8beb9e1d4b293de7928298044069ba952bb247" alias=rover-rest-service application=platformdev-platform image_name=rover-rest-service image_tag=dummy registry=xxx.dkr.ecr.us-west-2.amazonaws.com
time="2022-04-22T18:17:47Z" level=debug msg="target parameters: image-spec= image-name=image.name, image-tag=image.tag" application=platformdev-platform image=xxx.dkr.ecr.us-west-2.amazonaws.com/rover-rest-service
time="2022-04-22T18:17:47Z" level=info msg="Successfully updated image 'xxx.dkr.ecr.us-west-2.amazonaws.com/rover-rest-service@dummy' to 'xxx.dkr.ecr.us-west-2.amazonaws.com/rover-rest-service@sha256:eaba46c42ea029b2b291aefe9f8beb9e1d4b293de7928298044069ba952bb247', but pending spec update (dry run=false)" alias=rover-rest-service application=platformdev-platform image_name=rover-rest-service image_tag=dummy registry=xxx.dkr.ecr.us-west-2.amazonaws.com
...

So the issue seems to be that the image updater may not be setting up the secret at the registry level. I am curious whether this is the case, given the code here: https://github.com/argoproj-labs/argocd-image-updater/blob/113d8e74e5346132803605512bfac97bb0d84ae6/pkg/registry/registry.go#L197

rarecrumb commented 2 years ago

I had to use the prefix key to get it working...

- name: ECR
  api_url: https://xxx.dkr.ecr.us-west-2.amazonaws.com
  prefix: xxx.dkr.ecr.us-west-2.amazonaws.com # i had to add this
  ping: yes
  default: true
  insecure: no
  credentials: ext:/scripts/ecr-login.sh
  credsexpire: 11h
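
The fix above suggests that `prefix` should equal the registry host part of the image reference. A minimal sketch, under that assumption, for printing the host part of an image reference so it can be compared with the `prefix` value. The helper name `registry_host_of` is hypothetical, and the rule it applies (the first path component counts as a registry only if it contains a dot or a colon, or is `localhost`) is the usual Docker naming convention, not something confirmed from the image updater source.

```shell
#!/bin/sh
# Hypothetical helper: print the registry host of an image reference,
# or an empty line if the reference has no explicit registry.
registry_host_of() {
  ref=$1
  case "$ref" in
    */*) host=${ref%%/*} ;;   # everything before the first slash
    *)   host="" ;;
  esac
  case "$host" in
    *.*|*:*|localhost) printf '%s\n' "$host" ;;
    *)                 printf '\n' ;;
  esac
}

# Example:
#   registry_host_of xxx.dkr.ecr.us-west-2.amazonaws.com/rover-rest-service:latest
```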

stevancris commented 2 years ago

I get an error with the response "An error occurred (ValidationError) when calling the AssumeRoleWithWebIdentity operation: Request ARN is invalid". From my investigation, the error may occur when running the script ecr.sh. Can anyone help me?

MicahDevOps commented 1 year ago

I followed the guide and configured argocd-image-updater, but I am still NOT able to connect to ECR. My steps are as follows:

  1. Downloaded the official Helm chart from the argocd-image-updater Helm repo.
  2. Configured the values.yaml registries:

authScripts:
  enabled: true
  scripts:
    ecr-login.sh: |
      #!/bin/sh
      aws ecr --region cn-north-1 get-authorization-token --output text --query 'authorizationData[].authorizationToken' | base64 -d

serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws-cn:iam::aws-account:role/role-name

  3. The AWS role (arn:aws-cn:iam::aws-account:role/role-name) has full ECR permissions attached and a trust relationship set up with the service account argocd-image-updater.

  4. Ran kubectl exec -it argocd-image-updater-5b859f76c4-lxrhg -n argocd -- sh and verified that ecr-login.sh returns an ECR token.

  5. But when I run argocd-image-updater test image_name --registries-conf-path ~/config/registries.conf, it always outputs this error: FATA[0002] could not get tags: errors: denied: requested access to the resource is denied unauthorized: authentication required application=test image_alias= image_name=xxx registry_url=

Did I miss anything? The script can get an ECR token, so why does authorization still fail?

Can anyone help me?

huy-phan-of commented 1 year ago

Hi guys, any updates on this? I am facing the same issue. In my case, the script that gets the ECR token does not work. Also, I cannot run any aws command in the argocd-image-updater pod:

/scripts $ aws sts get-caller-identity

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
/scripts $ 

My setup is the same as in issue #112 and @MicahDevOps's. Thanks.

amohsenter09-github commented 1 year ago

any updates here

ricky1-gupta commented 1 year ago

Hi guys, any updates on this? I faced the same issue like this. In my case, the script getting ECR token does not work. Also, I could not perform any aws command in the argocd-image-updater pod

/scripts $ aws sts get-caller-identity

An error occurred (AccessDenied) when calling the AssumeRoleWithWebIdentity operation: Not authorized to perform sts:AssumeRoleWithWebIdentity
/scripts $ 

My setup is the same with issue #112 and @MicahDevOps Thanks

You need to attach a suitable IAM role to your service account. In my case, I created a role and attached the AmazonEC2ContainerRegistryFullAccess policy to it.
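
A quick way to see whether a role is actually wired to the pod is to check the two environment variables that EKS sets when a service account carries the `eks.amazonaws.com/role-arn` annotation: `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`. A minimal sketch, assuming an EKS cluster with IRSA; the helper name `irsa_env_status` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether the IRSA environment variables
# are present in the current shell (for example, inside the pod).
irsa_env_status() {
  missing=""
  [ -n "${AWS_ROLE_ARN:-}" ] || missing="$missing AWS_ROLE_ARN"
  [ -n "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ] || missing="$missing AWS_WEB_IDENTITY_TOKEN_FILE"
  if [ -z "$missing" ]; then
    echo "ok"
  else
    echo "missing:$missing"
  fi
}

# Usage inside the pod:
#   irsa_env_status
```

If both variables are present and `aws sts get-caller-identity` still fails with AccessDenied, the role's trust relationship is the next thing to look at rather than its ECR permissions.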

seanturner026 commented 1 year ago

I had the same exact issue (no basic auth credentials), and for some reason adding default: true to my registries.conf solved the issue. No idea why 😢

Here are my values.

replicaCount: 1 # It is not advised to run more than one replica.

extraEnv:
  - name: AWS_REGION
    value: us-west-2
  - name: AWS_ROLE_SESSION_NAME
    value: cluster-name-argocd-image-updater

config:
  registries:
    - name: ECR
      api_url: https://0123456789.dkr.ecr.us-west-2.amazonaws.com
      prefix: 0123456789.dkr.ecr.us-west-2.amazonaws.com
      ping: yes
      insecure: no
      default: true
      credentials: ext:/scripts/ecr-login.sh
      credsexpire: 10h

authScripts:
  enabled: true
  scripts:
    ecr-login.sh: |
      #!/bin/sh
      aws ecr --region $AWS_REGION get-authorization-token --output text --query 'authorizationData[].authorizationToken' | base64 -d

serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::0123456789:role/cluster-name-argocd-image-updater

rbac:
  enabled: true

seanmorton commented 1 year ago

I initially hit this error as well but following the guidance of this comment got things working for me.

joshc-nimble commented 2 months ago


I'm officially confused.

Steps:

  1. Created an IRSA role to give the pod access to ECR within the cluster. Trust relationship:
    {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::123:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/A1F0F7D782717717B1CB8E9849E61B85"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.eu-west-1.amazonaws.com/id/123:aud": "sts.amazonaws.com",
                    "oidc.eks.eu-west-1.amazonaws.com/id/123:sub": "system:serviceaccount:argocd:argocd-image-updater"
                }
            }
        }
    ]
    }

and to be on the safe side, I gave it AmazonEC2ContainerRegistryPowerUser

  2. I then set up the values on the Helm chart per the suggestions in the blog @seanmorton posted and other suggestions.

    extraEnv:
      - name: AWS_REGION
        value: eu-west-1

    config:
      registries:

    authScripts:
      enabled: true
      scripts:
        ecr-login.sh: |
          #!/bin/sh
          aws ecr --region $AWS_REGION get-authorization-token --output text --query 'authorizationData[].authorizationToken' | base64 -d

    serviceAccount:
      create: true
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::123:role/ecr_reader_role

    rbac:
      enabled: true


also tried this

    extraEnv:

    config:
      registries:

    authScripts:
      enabled: true
      scripts:
        ecr-login.sh: |
          #!/bin/sh

          # Retrieve the authorization token from AWS ECR
          auth_token=$(aws ecr get-authorization-token --region eu-west-1 --output text --query 'authorizationData[].authorizationToken')

          # Decode the authorization token
          decoded_token=$(echo $auth_token | base64 -d)

          # Extract username and password
          username=$(echo $decoded_token | cut -d: -f1)
          password=$(echo $decoded_token | cut -d: -f2)

          # Output username and password
          echo "$username:$password"

    serviceAccount:
      create: true
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::123:role/ecr_reader_role

    rbac:
      enabled: true


3. I enabled the helm chart and got the following in the logs from the `image-updater` pod:

v2/example/tags/list\": no basic auth credentials"


4. I then checked whether the script is valid on the `image-updater` pod; it outputs the auth.

/scripts $ sh ecr-login.sh
File "/scripts/ecr-login.sh", line 2


5. I then exec'd into the `image-updater` pod and ran the command manually.
I got a base64 auth token back, which means my role is working correctly with webIdentity.

6. I then ran `aws ecr describe-images --repository-name micro --region $AWS_REGION` and got a successful response:

{ "imageDetails": [ { "registryId": "123", "repositoryName": "example", "imageDigest": "sha256:123", "imageTags": [ "0.16.0" ], "imageSizeInBytes" ... etc

7. So I'm able to authenticate and actually get the images with tags from the repo, which is why I'm confused that it doesn't work when I run `argocd-image-updater test micro --registries-conf-path ~/config/registries.conf`.
Response returned:

INFO[0000] Fetching available tags and metadata from registry application=test image_alias= image_name=micro registry_url=
FATA[0001] could not get tags: denied: Your Authorization Token is invalid. application=test image_alias= image_name=example registry_url=



Any suggestions, would be greatly appreciated :)
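
One detail in the trust relationship from step 1 can be checked mechanically: for an IAM OIDC provider, the condition keys (`...:aud`, `...:sub`) are prefixed with the provider URL, which is the part of the `Federated` ARN after `oidc-provider/`. In the policy as posted, the `Federated` ARN and the condition keys end in different IDs, which may simply be a result of redaction. A minimal sketch for deriving the expected prefix from the ARN; the helper name `oidc_condition_prefix` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given an IAM OIDC provider ARN, print the prefix
# that the ":aud" and ":sub" condition keys are expected to use.
oidc_condition_prefix() {
  case "$1" in
    *:oidc-provider/*) printf '%s\n' "${1#*:oidc-provider/}" ;;
    *)                 return 1 ;;   # not an OIDC provider ARN
  esac
}

# Example (provider ID shortened to a placeholder):
#   oidc_condition_prefix "arn:aws:iam::123:oidc-provider/oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLE"
```

The printed value followed by `:sub` or `:aud` can then be compared with the keys under `StringEquals` in the policy.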