argoproj-labs / argocd-image-updater

Automatic container image update for Argo CD
https://argocd-image-updater.readthedocs.io/en/stable/
Apache License 2.0

Lost authentication to GCR docker registry in GKE autopilot #883

Open kalote opened 1 day ago

kalote commented 1 day ago

Describe the bug

I've set up Argo CD Image Updater in my GKE Autopilot cluster using Helm charts. It works well at first with the config below:

argocd-image-updater:
  serviceAccount:
    name: argo-img-updater-sa
    annotations:
      iam.gke.io/gcp-service-account: "argo-img-updater@myproject.iam.gserviceaccount.com"
  config:
    logLevel: "debug"
    gitCommitUser: "my-bot-action[bot]"
    gitCommitMail: "172827309+my-bot-action[bot]@users.noreply.github.com"
    registries:
      - name: google
        api_url: https://europe-docker.pkg.dev
        prefix: europe-docker.pkg.dev
        ping: no
        credentials: ext:/scripts/login.sh

  authScripts:
    enabled: true
    scripts:
      login.sh: |
        #!/bin/sh
        ACCESS_TOKEN=$(wget --header 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token -q -O - | grep -Eo '"access_token":.*?[^\\]",' | cut -d '"' -f 4)
        echo "oauth2accesstoken:$ACCESS_TOKEN"

After a while (~3h), the logs show a failed authentication message. The only fix I found was to restart the Argo CD Image Updater deployment.

To Reproduce

Expected behavior

Authentication should not be lost after some time.

Additional context

Version

The service account (SA) is configured with the required access using OIDC and Workload Identity.

Logs

time="2024-10-08T07:38:09Z" level=error msg="Could not get tags from registry: Get \"https://europe-docker.pkg.dev/v2/my-project/docker-my-app/my-app/tags/list\": unauthorized: authentication failed" alias=my-app application=my-app-staging-europe-west4 image_name=my-project/docker-my-app/my-app image_tag=240828-2523829-staging registry=europe-docker.pkg.dev

kalote commented 12 hours ago

After some research, I found that the metadata token expires after 3600s (1h). Adding credsexpire: 1h to my argocd-image-updater Helm values seems to fix it.

I will monitor whether this is the correct solution and close this ticket if it is.
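For anyone landing here, this is a minimal sketch of where the setting would go. It assumes credsexpire belongs on the registry entry next to credentials, and it reuses the values from the config in the first comment. The comments in the block are my reading of the behavior, not something confirmed in this thread.

```yaml
argocd-image-updater:
  config:
    registries:
      - name: google
        api_url: https://europe-docker.pkg.dev
        prefix: europe-docker.pkg.dev
        ping: no
        credentials: ext:/scripts/login.sh
        # Treat the output of login.sh as stale after this duration, so the
        # script is run again and a fresh token is fetched.
        # 1h is the value reported above. The token handed out by the
        # metadata endpoint may already be partway through its lifetime,
        # so a shorter duration may be safer (assumption, not verified here).
        credsexpire: 1h
```

Without an expiry, the credentials returned by the script appear to be reused for the life of the process. That would explain why restarting the deployment was the only thing that helped.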