Describe the bug

Observed behavior: application refreshes are taking a painfully and abnormally long time to complete.

After much research and debugging, our team has found the following abnormalities:
- Workqueue depth is far higher than it should be, indicating near-constant refreshes. We have applied every configuration we could find online to bring the workqueue depth down to its expected range of ~0-4 (see screenshot for current values; a sketch of the settings we have tried follows this list).
- Logs indicate that resources not tied to an Argo CD application, but created/managed in the same namespace as other Argo CD applications, cause Argo CD to initiate reconciliation of every application in that namespace whenever such a resource is updated. This appears to be a near-constant trigger across each of our 25 clusters, contributing to the observed workqueue depth.
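For reference, this is the kind of tuning we have tried so far. It is only a sketch: the values are examples, and the keys reflect our reading of the reconcile-optimization docs for recent Argo CD versions rather than an exact dump of our config.

```yaml
# argocd-cm (sketch of settings we have experimented with; values are examples)
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Stretch the periodic refresh interval so timed refreshes are not the bottleneck.
  timeout.reconciliation: "300s"
  # Ask the controller to ignore status-only updates when deciding whether a
  # watched resource change should queue a refresh (recent Argo CD versions).
  resource.ignoreResourceUpdatesEnabled: "true"
  resource.customizations.ignoreResourceUpdates.all: |
    jsonPointers:
      - /status
```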
To Reproduce
1. Deploy, via Argo CD, the goldilocks operator into the namespace container-platform. The operator pod creates and manages a Vertical Pod Autoscaler (VPA) for every deployment it is configured for, and it places each VPA in the same namespace as the deployment it references. These VPAs are not managed by Argo CD via a tracking-id (see the sketch after these steps).
2. Configure the operator to reference all deployments in the namespaces kube-system, container-platform, and container-monitoring.
3. Deploy a few applications to each namespace.
4. When the generated VPAs update themselves with new resource recommendations, a refresh is triggered for every single application in the namespace.
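To make the reproduction concrete, this is roughly what one of the generated VPAs looks like. It is an illustrative sketch rather than an exact dump: the name and target are made up, and labels added by goldilocks are omitted. The point is that the object carries neither the app.kubernetes.io/instance label nor the argocd.argoproj.io/tracking-id annotation, so no Argo CD application tracks it.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: goldilocks-example-deployment   # illustrative name
  namespace: container-platform
  # no app.kubernetes.io/instance label and no argocd.argoproj.io/tracking-id
  # annotation here, i.e. the object is not tracked by any Argo CD application
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-deployment            # illustrative target
  updatePolicy:
    updateMode: "Off"                   # recommendation-only mode
status:
  recommendation:
    containerRecommendations: []        # rewritten continuously by the VPA recommender;
                                        # these updates coincide with the refresh storms
```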
Expected behavior
I expect that when a VPA object's resource recommendations are updated, Argo CD will be entirely unaffected, because the VPA object is not being tracked. A refresh should only be triggered when one of an application's own resources is modified, not when an unmanaged resource in the same namespace changes.
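For contrast, resources that Argo CD does manage carry its tracking metadata, which is what I would expect refreshes to key off. A sketch, showing the default label-based tracking and the annotation-based trackingMethod (application and resource names are hypothetical):

```yaml
metadata:
  labels:
    app.kubernetes.io/instance: my-app   # default label-based tracking
  annotations:
    # annotation-based tracking: <app>:<group>/<kind>:<namespace>/<name>
    argocd.argoproj.io/tracking-id: my-app:apps/Deployment:container-platform/my-deployment
```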
Screenshots
Version
Logs
Some sample logs: