Open acelinkio opened 5 months ago
We're experiencing this issue as well. Restarting the statefulset resolves the immediate problem, but we'd love to understand why the controller gets stuck and how to ensure it either recovers gracefully or restarts automatically.
ArgoCD versions 2.10 and below have reached EOL. Can you upgrade and tell us if the issue is still present, please?
Describe the bug
argocd-application-controller stopped processing workloads entirely without crashing. The controller appeared to be hung on one of the ArgoCD Applications when it stopped functioning. That Application's custom resource had an Operations field that had been added but was never processed. During this time the application-controller produced no new log messages and did not update any applications. 30 minutes elapsed with no new information. Restarting the statefulset appeared to resolve the issue; however, I am concerned about it recurring.
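One way to spot this condition from outside the controller is to look for Applications whose requested operation has not been picked up, or has been running far longer than expected. The following is a minimal sketch, not part of Argo CD. It assumes the Application list has been exported as JSON (for example with `kubectl get applications -n argocd -o json`) and that the `operation`, `status.operationState.phase`, and `status.operationState.startedAt` fields are populated as they normally are on the Application resource. The helper name `find_stuck_operations` and the 30-minute threshold are illustrative.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse a Kubernetes-style UTC timestamp such as 2024-01-01T12:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def find_stuck_operations(apps, now, max_age=timedelta(minutes=30)):
    """Return (name, reason) pairs for Applications with a suspicious operation.

    `apps` is the parsed JSON of an Application list (a dict with "items").
    """
    stuck = []
    for app in apps.get("items", []):
        # Only Applications with a requested operation are of interest.
        if "operation" not in app:
            continue
        name = app["metadata"]["name"]
        state = (app.get("status") or {}).get("operationState") or {}
        if state.get("phase") != "Running":
            # Operation was requested but no run is in progress. This also
            # matches operations requested moments ago, so treat it as a hint.
            stuck.append((name, "requested, not started"))
        elif now - parse_ts(state["startedAt"]) > max_age:
            stuck.append((name, "running too long"))
    return stuck
```

In practice `apps` could come from `json.load(sys.stdin)` with the `kubectl` output piped in, and `now` from `datetime.now(timezone.utc)`. A check like this only reports the symptom; it does not explain why the controller stopped.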
To Reproduce
Was unable to reproduce on demand.
The application that appeared to cause the hang manages >100 Kubernetes objects:
25 x Namespace
25 x Secret
25 x ApplicationSet (each of these ApplicationSets spawns 2-5 child applications)
5 x Application (standalone)
Expected behavior
argocd-application-controller does not freeze. In the event of a freeze, I expect:
Screenshots n/a
Version
Deployed using argo-cd Helm chart version 6.7.8, with the subchart used to create a 3-node Redis cluster.
Logs
Did not see any relevant logs. No new logs were produced while the application was hung. All processing appeared to stop inside the container.
Additional Comments
One concern that comes to mind is that the Kubernetes object the application-controller is trying to manage may be too large and is being rejected by the Kubernetes API. However, I did not see anything in the Kubernetes logs or in the application-controller logs to indicate that this is the case.
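To check the size concern directly, the serialized size of the suspect object can be compared against a limit. The sketch below is only a rough estimate under stated assumptions: the 1.5 MiB threshold is an assumed value (to my knowledge it matches etcd's default request size limit, but the effective limit in a given cluster may differ), and the size of the compact JSON encoding only approximates what the API server stores. The names `serialized_size` and `is_near_limit` are illustrative.

```python
import json

# Assumed limit: 1.5 MiB. Adjust to whatever applies to the cluster in question.
ASSUMED_LIMIT_BYTES = 1_572_864


def serialized_size(obj):
    """Size in bytes of the compact JSON encoding of a parsed Kubernetes object."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def is_near_limit(obj, limit=ASSUMED_LIMIT_BYTES, fraction=0.8):
    """True if the object's size is at least `fraction` of the assumed limit."""
    return serialized_size(obj) >= limit * fraction
```

Feeding it the parsed output of `kubectl get application <name> -n argocd -o json` for the top-level Application would show whether its status is anywhere near the assumed limit.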
Should ApplicationSet child applications be tracked in a parent Application? TopLevelApplication currently displays each ApplicationSet (per namespace) and each Application generated from an ApplicationSet.
Is there any handling for when requests are too large? I appear to be reaching the limits of what the default controller can handle. I was unable to find any best practices surrounding how many objects should be managed.