argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

Non-Managed Resource Causing Managed Applications in the Same Namespace to Refresh #19359

Open lets-call-n-walk opened 3 months ago

lets-call-n-walk commented 3 months ago

Describe the bug

Observed behavior: application refreshes are taking an abnormally long time to complete.

After much research and debugging, our team has found the following abnormalities:

To Reproduce

  1. Deploy, via Argo CD, the goldilocks operator into the namespace container-platform. The operator pod creates and manages a Vertical Pod Autoscaler (VPA) for every deployment it is configured for, and adds each VPA to the same namespace as the deployment it references. These VPAs carry no tracking-id, so they are not managed by Argo CD.
  2. Configure the operator to reference all deployments in the namespaces kube-system, container-platform, and container-monitoring.
  3. Deploy a few applications to each namespace.

When the generated VPAs update themselves with new resource recommendations, each update triggers a refresh attempt for every single application in the namespace.

Expected behavior

I expect that when the VPA object's resource recommendations are updated, Argo CD will be entirely unaffected, because the VPA object is not being tracked. A refresh should only be triggered when a resource belonging to an application is modified, not when an unmanaged resource in the same namespace changes.
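To make the expectation concrete, here is a minimal sketch of the rule described above. It is a hypothetical model, not Argo CD's actual code: the function names are invented for illustration, and it assumes annotation-based tracking where the `argocd.argoproj.io/tracking-id` value starts with the application name followed by a colon.

```python
from typing import Optional

# Annotation Argo CD uses for annotation-based resource tracking.
TRACKING_ANNOTATION = "argocd.argoproj.io/tracking-id"


def owning_app(annotations: dict[str, str]) -> Optional[str]:
    """Return the app name from the tracking-id, or None if untracked.

    Assumes the value looks like "<app-name>:<group>/<kind>:<namespace>/<name>".
    """
    tracking_id = annotations.get(TRACKING_ANNOTATION, "")
    app_name, sep, _ = tracking_id.partition(":")
    return app_name if sep and app_name else None


def apps_to_refresh(annotations: dict[str, str],
                    apps_in_namespace: set[str]) -> list[str]:
    """Hypothetical expected rule: refresh only the app that tracks the resource."""
    app = owning_app(annotations)
    return [app] if app in apps_in_namespace else []
```

Under this model, an update to a goldilocks-generated VPA with no tracking-id would yield an empty list, while an update to a tracked resource would yield only its owning application.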

Screenshots

Screenshot 2024-08-02 at 11 56 24 AM

Version

argocd: v2.11.7+e4a0246
  BuildDate: 2024-07-24T15:56:20Z
  GitCommit: e4a0246c4d920bc1e5ee5f9048a99eca7e1d53cb
  GitTreeState: clean
  GoVersion: go1.22.5
  Compiler: gc
  Platform: darwin/amd64

Logs

Some sample logs:

time="2024-08-02T16:14:12Z" level=debug msg="Requesting app refresh caused by object update" api-version=autoscaling.k8s.io/v1 application=container-platform/aws-node-{{clustername}} cluster-name={{clustername}} fields.level=1 kind=VerticalPodAutoscaler name=goldilocks-vertical-pod-autoscaler-admission-controller namespace=kube-system server="{{clusterserver}}"
time="2024-08-02T16:14:12Z" level=debug msg="Requesting app refresh caused by object update" api-version=autoscaling.k8s.io/v1 application=container-platform/aws-ebs-csi-driver-{{clustername}} cluster-name={{clustername}} fields.level=1 kind=VerticalPodAutoscaler name=goldilocks-ebs-csi-controller namespace=kube-system server="{{clusterserver}}"
time="2024-08-02T16:14:12Z" level=debug msg="Requesting app refresh caused by object update" api-version=autoscaling.k8s.io/v1 application=container-platform/windows-gmsa-{{clustername}} cluster-name={{clustername}} fields.level=1 kind=VerticalPodAutoscaler name=goldilocks-karpenter namespace=kube-system server="{{clusterserver}}"
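For anyone trying to quantify this on their own cluster, debug lines like the ones above can be tallied with a short script. This is a rough sketch, assuming the controller logs are in the same key=value format shown here; the helper names are made up for illustration, and `shlex` may fail on messages containing unbalanced quotes.

```python
import shlex
import sys
from collections import Counter

REFRESH_MSG = "Requesting app refresh caused by object update"


def parse_logfmt(line: str) -> dict[str, str]:
    """Split a key=value log line into a dict, honoring double-quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def count_refresh_triggers(lines) -> Counter:
    """Count refresh requests per (kind, namespace) of the triggering object."""
    counts = Counter()
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg") == REFRESH_MSG:
            counts[(fields.get("kind"), fields.get("namespace"))] += 1
    return counts


if __name__ == "__main__":
    # Example: pipe application controller logs into this script.
    for (kind, namespace), n in count_refresh_triggers(sys.stdin).most_common():
        print(f"{n}\t{kind}\t{namespace}")
```

Grouping by the triggering object's kind and namespace makes it easy to see whether VerticalPodAutoscaler updates dominate the refresh requests.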

andrii-korotkov-verkada commented 1 week ago

Try out 2.13; refreshes got much faster there. Also note that, by default, all apps refresh every 3 minutes anyway.