argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

ArgoCD application throwing git fetch error after converting it to an ApplicationSet #9645

Open dan-m8t opened 2 years ago

dan-m8t commented 2 years ago

Describe the bug

A few days ago I moved all my Applications into a single git repo managed by an ApplicationSet, for easier management. Most of the applications converted just fine. Some threw a git fetch error ("not our ref"); I solved this with a non-cascading deletion of the app, after which it was recreated flawlessly and has had no errors since. One application still has the error and I'm not sure how to solve it. When I click on the app in the GUI I receive this error:

Unable to load data: Failed to checkout revision c50ac270f37b7083c751064a51bda18080761bd9: git fetch origin c50ac270f37b7083c751064a51bda18080761bd9 --tags --force failed exit status 128: fatal: remote error: upload-pack: not our ref c50ac270f37b7083c751064a51bda18080761bd9

This seems to have something to do with caching from the repo-server I guess?

To Reproduce

Not sure if this is fully reproducible, but:

  1. Create a few applications
  2. Convert them to an ApplicationSet with a single git repo

ApplicationSet:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-apps
  namespace: argocd
spec:
  generators:
  - git:
      directories:
      - path: apps/*
      repoURL: https://mygit.example/apps.git
      revision: HEAD
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      destination:
        server: https://kubernetes.default.svc
      project: default
      source:
        path: '{{path}}'
        repoURL: https://mygit.example/apps.git
        targetRevision: HEAD
      syncPolicy:
        automated:
          selfHeal: true

Fun fact: when I sync the application manually, it shows the current commit as its sync status; after refreshing the page, the GUI somehow reverts to another commit, while the application itself is left untouched. For testing purposes I edited a ConfigMap inside the app and synced it. That works just fine: ArgoCD changed the ConfigMap, but the error stays.

I upgraded to ArgoCD 2.4.0 today, but the error has been there even before the upgrade. Deleting redis and repo-server didn't help either.

Expected behavior

No more git fetch errors

Version

argocd version
argocd: v2.4.0+unknown
  BuildDate: 2022-06-11T04:42:53Z
  GitCommit: 
  GitTreeState: 
  GitTag: 2.4.0
  GoVersion: go1.18.3
  Compiler: gc
  Platform: linux/amd64
WARN[0000] Failed to invoke grpc call. Use flag --grpc-web in grpc calls. To avoid this warning message, use flag --grpc-web. 
argocd-server: v2.4.0+91aefab
  BuildDate: 2022-06-10T17:23:37Z
  GitCommit: 91aefabc5b213a258ddcfe04b8e69bb4a2dd2566
  GitTreeState: clean
  GoVersion: go1.18.3
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v4.4.1 2021-11-11T23:36:27Z
  Helm Version: v3.8.1+g5cb9af4
  Kubectl Version: v0.23.1
  Jsonnet Version: v0.18.0

crenshaw-dev commented 2 years ago

I think you're saying that the error is applicable to the Application, not the ApplicationSet. Is that correct?

The ApplicationSet controller currently maintains its own repository cache. In the future we should probably switch it to make gRPC calls to the repo-server instead. So if the error is in the ApplicationSet rather than the Application, then that's a hint that there's a bug in ApplicationSet's cache logic.

But it sounds like there's a bug in the repo-server that, for some reason, is only triggered when the App is deployed from an ApplicationSet rather than via gitops and/or kubectl.

dan-m8t commented 2 years ago

Yes, that is correct: this is the only Application from the ApplicationSet with that error. In the meantime I deleted the app in foreground mode, which deleted all its resources, but no luck. I also double-checked whether there's still a ".git" folder in that particular application directory. (I moved all my git-tracked applications into a single apps folder and created a new repo, but deleted all former git references from the Applications.)

Is there a way to delete the cache of the argocd-repo-server? (I already deleted both redis and repo pods, as they have no PVs in my setup I thought this should do the trick but nope) Should I provide any logs for debugging, and if so, from which pod?

I can live with this bug for now, as the ApplicationSet does update the Application when I change something.

rafaeluchoa commented 2 years ago

+1

crenshaw-dev commented 2 years ago

I already deleted both redis and repo pods, as they have no PVs in my setup I thought this should do the trick but nope

That should have done the trick. The fact that the error didn't go away is actually a little encouraging. It hopefully means it's reproducible.

@dan-m8t or @rafaeluchoa do you have an example in a public repo where you can reproduce the issue? I'd love to dig in, but time's a little tight.

tulsluper commented 1 year ago

I hit this issue after adding a new repo, syncing an app from it, and then switching back to the previous repo: the commit IDs in the error are not from the current repo.

sjoukedv commented 1 year ago

I have a multi-source argocd application that contains two helm charts (from a git source) that throws the same error.

SnoozeFreddo commented 1 year ago

same

jeremych1000 commented 1 year ago

We're hitting this as well, I've tried everything. My repo has a submodule in it which is failing.

Nothing works, and am completely stuck atm. I can't see the application in the Argo UI.

Thankfully it's only on our dev Argo not prd but it's not great that there's no immediate fix.

crenshaw-dev commented 1 year ago

Can anyone piece together a minimal, reproducible, public example? At the moment, it sounds like maybe there are a few paths which lead to the same error message.

SnoozeFreddo commented 1 year ago

Install a fresh argocd and add a github repo/deployment. (https://github.com/argoproj/argo-cd//manifests/cluster-install?ref=v2.7.6) (I did it via the argo-vault-plugin kustomization example)

Then edit the configmap:

https://github.com/argoproj/argo-cd/issues/2802#issuecomment-1605633640

and kubectl apply it.

hardenchant commented 1 year ago

I got this when trying to change the repository of an Application. I fixed it by removing the 'history' and 'sync' keys from the failing Application (along with any other fields containing the failing revision:).

pfldy2850 commented 11 months ago

+1

ajits7 commented 10 months ago

@hardenchant could you please elaborate on how to fix this? We don't have 'history' and 'sync' keys in the application manifest.

k-stz commented 9 months ago

In my case this was a harmless, transitory error when switching Helm repo URLs:

We saw this issue when switching the ArgoCD Application's Helm repo URL to another one. The error then referred to the last successfully synced commit recorded in the Application's "history" data (we didn't use an ApplicationSet), but that commit was in the old repo.

So ArgoCD was looking in the new repo for a commit from its sync history that only existed in the old repo. This was fixed simply by clicking the "Sync" button, which created a new history entry pointing to the commit hash ("revision") of the new Helm repo, and the error disappeared.
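
The mismatch described above can be checked for mechanically. The following is a minimal Python sketch, not part of Argo CD. It assumes the Application has been exported as JSON (for example with `kubectl get application <name> -n argocd -o json`), that it is a single-source Application, and that each `status.history` entry records the `source.repoURL` it was synced from alongside its `revision`. The function name `stale_history_entries` and all sample URLs and revisions are hypothetical.

```python
import json


def stale_history_entries(app: dict) -> list[dict]:
    """Return history entries whose repoURL differs from the current spec source.

    Assumes a single-source Application manifest loaded as a dict.
    """
    current_repo = app.get("spec", {}).get("source", {}).get("repoURL")
    history = app.get("status", {}).get("history") or []
    return [
        entry
        for entry in history
        if entry.get("source", {}).get("repoURL") != current_repo
    ]


# Hypothetical example data; the URLs and revisions are made up.
example_app = json.loads("""
{
  "spec": {"source": {"repoURL": "https://mygit.example/new-charts.git"}},
  "status": {
    "history": [
      {"id": 0, "revision": "1111111",
       "source": {"repoURL": "https://mygit.example/old-charts.git"}},
      {"id": 1, "revision": "2222222",
       "source": {"repoURL": "https://mygit.example/new-charts.git"}}
    ]
  }
}
""")

for entry in stale_history_entries(example_app):
    print(f"history id {entry['id']} points at {entry['revision']} "
          f"in {entry['source']['repoURL']}")
```

Any entry this reports refers to a commit that the currently configured repository may not contain, which would match the "not our ref" symptom.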

ForbiddenEra commented 6 months ago

This might be related to my issue https://github.com/argoproj/argo-cd/issues/17207

We're hitting this as well, I've tried everything. My repo has a submodule in it which is failing.

See the above issue; you and me might be the only people [smart/daring enough] using submodules :)

For me, it complains about an error from git about overwriting the 'changes' made by an override, in my case by argo-image-updater. I've also tried everything in your list without much luck; sometimes it fixes itself after deleting the app completely and waiting, or after doing everything you posted and waiting. Not great.

My only workaround may be to point directly to the submodule, or possibly to set up argo-image-updater to push overrides to the git repo instead. I plan to do the latter anyway, but haven't tested whether it solves the submodule issue or whether the overrides are the only thing causing it to pop up. I think the git method is maybe slightly better in any case, as it's less confusing to devs: with overrides they see "Last Sync: 2 months ago" (when their manifest was last updated) even though they just pushed commits that built a new image and didn't require a manifest change, whereas with the git method Last Sync matches the most recent commit.

Also the randomness of the submodule issue is annoying; not sure if that's because I run multiple replicas, but sometimes it just works, other times it doesn't and I had a dev complaining just today that their app is now broken, sending a screenshot of the git error.

ALSO - I've had this happen with SEPARATE APPS in SEPARATE PROJECTS on SEPARATE NAMESPACES that point to the SAME REPO but DIFFERENT branches; so the cache folder is defined by the REPO itself, not APP or PROJECT, NAMESPACE, BRANCH?! Which means this could even be an issue for anyone using multiple branches on the same repo, submodules or not!

I've also seen this exact error from OP as well in some cases, syncing/refresh/hard refresh hasn't always fixed it especially when you've deleted/re-added an app and it fails to add and you have nothing to sync/refresh/hard refresh!

Is there a way to force flush/delete the cached files? Can we somehow ignore git errors, or just overwrite the cache on certain git errors? Or just (optionally?) overwrite the cached repo every time? If we know there are new commits, there's not much reason not to wipe/reclone. I assume internally it's doing something like git pull, which I suppose is more efficient when it works, but when it doesn't there should be an alternative like automatically re-cloning or something.

ebuildy commented 3 months ago

In our case, it happens with version 2.11.1, as soon as you change app repoURL.

Argocd tries to check out the commit hash from the old repository! I didn't dig into it, but maybe the cache key should just be prefixed with the repository URL ^^
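
To illustrate the suggestion above, here is a toy Python sketch of the difference between a lookup keyed by revision alone and one keyed by repository URL plus revision. It is purely illustrative and makes no claim about how Argo CD's repo-server actually builds its cache keys; the function names, URLs and revision are hypothetical.

```python
def key_by_revision(repo_url: str, revision: str) -> str:
    # Ignores the repository: the same revision always maps to the same key.
    return revision


def key_by_repo_and_revision(repo_url: str, revision: str) -> str:
    # Scopes the key to the repository, so changing repoURL changes the key.
    return f"{repo_url}|{revision}"


old_repo = "https://mygit.example/old.git"  # hypothetical
new_repo = "https://mygit.example/new.git"  # hypothetical
revision = "abc1234"                        # hypothetical commit hash

cache = {key_by_revision(old_repo, revision): "data from old repo"}
# After the repoURL changes, the revision-only key still hits the old entry.
print(key_by_revision(new_repo, revision) in cache)

scoped_cache = {key_by_repo_and_revision(old_repo, revision): "data from old repo"}
# With the repo URL in the key, the lookup for the new repo misses.
print(key_by_repo_and_revision(new_repo, revision) in scoped_cache)
```

With the scoped key, a revision recorded for the old repository is never looked up against the new one.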

jorgelon commented 2 days ago

In our case, it happens with version 2.11.1, as soon as you change app repoURL.

Argocd tries to check out the commit hash from the old repository! I didn't dig into it, but maybe the cache key should just be prefixed with the repository URL ^^

Same here. I solved it by deleting the application's history (status.history) and then syncing again.
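
One possible way to script that workaround is sketched below in Python. It only builds and prints a `kubectl patch` command with a JSON Patch that removes `/status/history`; it does not run anything. The helper name `build_clear_history_command`, the app name `my-app` and the `argocd` namespace are assumptions for illustration. Whether the cluster accepts a patch to `status` depends on how the Application CRD is installed, and a JSON Patch `remove` fails if the path does not exist.

```python
import json
import shlex


def build_clear_history_command(app_name: str, namespace: str = "argocd") -> list[str]:
    """Build a kubectl command that removes status.history via a JSON Patch."""
    patch = [{"op": "remove", "path": "/status/history"}]
    return [
        "kubectl", "patch", "applications.argoproj.io", app_name,
        "--namespace", namespace,
        "--type", "json",
        "--patch", json.dumps(patch),
    ]


cmd = build_clear_history_command("my-app")  # "my-app" is a placeholder name
# Print a copy-pasteable command instead of running it.
print(shlex.join(cmd))
```

After applying the printed command, the application still needs a new sync (the Sync button, or `argocd app sync my-app`) so that a fresh history entry is recorded.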