argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

ArgoCD webhook refresh causes apps to go into Unknown state #18404

Open LarryGF opened 2 months ago

LarryGF commented 2 months ago

Checklist:

Describe the bug

When an app-of-apps is refreshed by a push to the app-of-apps repo, it refreshes successfully and updates the child apps. But sometimes, when the change updates a child app's targetRevision, the child app's Sync Status goes to Unknown with an error saying that it can't find the ref's SHA:

Failed to load target state: failed to compare revisions for source 1 of 1: rpc error: code = Internal desc = unable to resolve git revision

Manually refreshing the child app fixes it, but that doesn't happen automatically. It's worth noting that in our scenario these updates to the child apps' refs happen automatically when a new version is released, so the tag definitely exists.

To Reproduce

Expected behavior

After the app-of-apps refreshes, the child app should automatically refresh every time.

Screenshots

image

Version

argocd: v2.11.0+d3f33c0
  BuildDate: 2024-05-07T16:01:41Z
  GitCommit: d3f33c00197e7f1d16f2a73ce1aeced464b07175
  GitTreeState: clean
  GoVersion: go1.21.9
  Compiler: gc
  Platform: linux/amd64

Logs

Not a log, but this is the manifest for an application where this error happened:

project: system
source:
  repoURL: {repo}
  path: ./
  targetRevision: v1.1.5
  helm:
    valueFiles:
      - values.yaml
    values: |
      global:
          applicationFolder: gemstash
          applicationName: gemstash-management
          applicationNamespace: system
destination:
  namespace: system
  name: in-cluster
syncPolicy:
  automated:
    prune: true
    selfHeal: true
  syncOptions:
    - Validate=false
    - ServerSideApply=false
    - Replace=false
    - CreateNamespace=true

agaudreault commented 1 month ago

Hey @LarryGF, can you share the application-controller logs for your application and add them to the issue? Also, if you can turn on debug logs for the repo-server component and share the logs related to this method, it would greatly help us understand where the error comes from: https://github.com/argoproj/argo-cd/blob/0f72c19e31481c38774a26e8147c1e3bfd5e0ecc/util/git/client.go#L619
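
For anyone following along, one way to turn on repo-server debug logs is through the argocd-cmd-params-cm ConfigMap. This is only a sketch under assumptions: it assumes a standard install in the argocd namespace, and the repo-server pods most likely need a restart before the new level takes effect. Merge the key into your existing ConfigMap rather than replacing it.

```yaml
# Sketch only: assumes Argo CD is installed in the "argocd" namespace
# with the stock argocd-cmd-params-cm ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd            # assumption: adjust to your install
data:
  # Raises the repo-server log level; restart the repo-server pods afterwards.
  reposerver.log.level: "debug"
```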

LarryGF commented 1 month ago

Hi, I think I was able to fix the problem (maybe not the cause, but at least the symptoms) by increasing the ARGOCD_GIT_ATTEMPTS_COUNT env var in the repo-server. I will enable debug mode and see what the logs show.
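
A minimal sketch of that workaround, written as a patch for the repo-server Deployment. The namespace, the container name, and the value 3 are assumptions for illustration (the exact value used is not stated above), so adjust them to your install.

```yaml
# Sketch only: strategic merge patch for the repo-server Deployment.
# Namespace, container name, and the retry count are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-repo-server
  namespace: argocd                      # assumption: adjust to your install
spec:
  template:
    spec:
      containers:
        - name: argocd-repo-server       # assumption: matches the stock manifests
          env:
            # Number of attempts for failed git requests before giving up.
            - name: ARGOCD_GIT_ATTEMPTS_COUNT
              value: "3"                 # hypothetical value, tune as needed
```

Retrying only hides the symptom: if the tag really is not yet visible to the repo-server when the child app is first compared, a higher attempt count gives it more chances to resolve, but it does not explain why the first lookup fails.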