Open H777K opened 1 year ago
Update: it seems that the scmProvider generator can't handle network issues or timeouts when talking to the GitHub Enterprise instance. Is there any way or config option to counteract this behavior?
We rely on the go-github client to tell us whether there was an error getting repos. If that client doesn't return an error when we list repos, we'll honor the response.
It's possible that we should inspect further into the response, e.g. validate a 200 return code.
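As a rough illustration of the stricter check suggested above, here is a minimal sketch that honors a repository listing only when the status code is 200 and the body decodes as JSON. It uses only the Go standard library rather than go-github, and `parseRepoList` and `repo` are hypothetical names made up for this example, not part of Argo CD or go-github.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// repo mirrors only the field this sketch needs from a repository listing.
type repo struct {
	Name string `json:"name"`
}

// parseRepoList is a hypothetical helper: it accepts a listing only when the
// status code is 200 and the body decodes as a JSON array of repositories.
func parseRepoList(status int, body []byte) ([]string, error) {
	if status != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d while listing repos", status)
	}
	var repos []repo
	if err := json.Unmarshal(body, &repos); err != nil {
		return nil, fmt.Errorf("decoding repo list: %w", err)
	}
	names := make([]string, 0, len(repos))
	for _, r := range repos {
		names = append(names, r.Name)
	}
	return names, nil
}

func main() {
	// A well-formed listing is accepted.
	names, err := parseRepoList(200, []byte(`[{"name":"app-a"}]`))
	fmt.Println(names, err)

	// A 503 during maintenance is reported as an error, not as "no repos".
	_, err = parseRepoList(503, []byte("Service Unavailable"))
	fmt.Println(err)
}
```

The point of the sketch is that a failed listing returns an error instead of an empty list, so the caller can tell "no repositories" apart from "could not ask".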
I recently faced a similar problem while our GitHub Enterprise was under maintenance.
We saw various errors with status codes such as 500, 502, and 503, but they did not trigger the "generated 0 applications" message.
However, we saw this specific error:
error listing repos: error listing repositories for xxx: invalid character '\u003c' looking for beginning of value
before the applicationset-controller logged "generated 0 applications" and deleted the previously generated applications.
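The error text above is what Go's `encoding/json` reports when it is handed something that starts with `<`, which suggests (as an assumption, not a confirmed diagnosis) that the server returned an HTML page, such as a maintenance notice, where JSON was expected. The sketch below reproduces the message and shows that `\u003c` is simply the JSON-escaped form of `<`. The helper name `decodeErr` and the sample HTML body are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeErr returns the error text produced when body is decoded as a JSON
// array, or an empty string when decoding succeeds.
func decodeErr(body string) string {
	var v []map[string]interface{}
	if err := json.Unmarshal([]byte(body), &v); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// An HTML page where JSON was expected (hypothetical maintenance page).
	msg := decodeErr("<html><body>Maintenance</body></html>")
	fmt.Println(msg)

	// json.Marshal escapes '<' as \u003c by default, which matches how the
	// character appears in the logged error.
	escaped, _ := json.Marshal(msg)
	fmt.Println(string(escaped))
}
```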
I'm trying to reproduce it but have not succeeded yet.
Possibly additional clues from my homelab:
time="2024-05-12T15:28:21Z" level=error msg="error generating application from params" applicationset=argocd/cluster-resources error="error generating params from git: error retrieving Git files: rpc error: code = Internal desc = unable to resolve git revision : Get \"https://gitea.redacted.io/redacted/argocd-autopilot.git/info/refs?service=git-upload-pack\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" generator="{nil nil &GitGenerator{RepoURL:https://gitea.redacted.io/redacted/argocd-autopilot.git,Directories:[]GitDirectoryGeneratorItem{},Files:[]GitFileGeneratorItem{GitFileGeneratorItem{Path:bootstrap/cluster-resources/*.json,},},Revision:,RequeueAfterSeconds:*20,Template:ApplicationSetTemplate{ApplicationSetTemplateMeta:ApplicationSetTemplateMeta{Name:,Namespace:,Labels:map[string]string{},Annotations:map[string]string{},Finalizers:[],},Spec:ApplicationSpec{Source:&ApplicationSource{RepoURL:,Path:,TargetRevision:,Helm:nil,Kustomize:nil,Directory:nil,Plugin:nil,Chart:,Ref:,},Destination:ApplicationDestination{Server:,Namespace:,Name:,},Project:,SyncPolicy:nil,IgnoreDifferences:[]ResourceIgnoreDifferences{},Info:[]Info{},RevisionHistoryLimit:nil,Sources:[]ApplicationSource{},},},PathParamPrefix:,Values:map[string]string{},} nil nil nil nil nil nil nil}"
The Gitea server is running on rather slow machines and will occasionally time out without producing a response. In those situations, it is not desirable to have Argo CD delete all applications that are defined there.
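The desired behavior can be sketched as a reconcile step that treats a generation error differently from a genuinely empty result. This is only an illustration under stated assumptions: `nextDesired` and `errTimeout` are hypothetical names, and the code is not taken from the ApplicationSet controller.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout stands in for a network timeout from the Git or SCM server.
var errTimeout = errors.New("context deadline exceeded")

// nextDesired is a hypothetical reconcile step. When generation fails it
// returns the current applications unchanged together with the error, so
// nothing is deleted. Only a successful generation may shrink the set.
func nextDesired(current []string, generate func() ([]string, error)) ([]string, error) {
	generated, err := generate()
	if err != nil {
		return current, fmt.Errorf("generation failed, keeping %d applications: %w", len(current), err)
	}
	return generated, nil
}

func main() {
	current := []string{"cluster-resources"}

	// Timeout: the existing applications are kept and the error is surfaced.
	kept, err := nextDesired(current, func() ([]string, error) { return nil, errTimeout })
	fmt.Println(kept, err)

	// Successful but empty result: the set really is empty now.
	emptied, err := nextDesired(current, func() ([]string, error) { return []string{}, nil })
	fmt.Println(emptied, err)
}
```

With this split, a slow or unreachable server leads to a retry on the next reconcile rather than a deletion.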
Describe the bug
I use an ApplicationSet with the scmProvider generator. This ApplicationSet templates an Application resource, which in turn templates multiple AppProject and Application resources located at a specific path in my GitHub repository. For some reason, the ApplicationSet controller sometimes logs "generated 0 applications", which causes the deletion of all generated AppProjects and Applications. In the next reconcile loop, the log shows "generated 1 applications" even though no changes were made.
This behaviour is very inconsistent: it can work for weeks without any problem, and at other times it happens every few days.
I have a second cluster with almost the same configuration as the first one. The only difference is the path of the AppProject and Application resources in the GitHub repository. I experienced the same issue on this cluster, but never at the same time as on the first one.
To Reproduce
I think it's difficult to reproduce because the behaviour is inconsistent. But the steps would be:
Expected behavior
I would expect that the log of the ApplicationSet controller always shows "generated 1 applications".
Version
Logs
Applied ApplicationSet
Generated Application from ApplicationSet