argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0
17.33k stars 5.26k forks

OutOfSync for HPA due to unsorted YAML keys #7846

Open knusperkrone opened 2 years ago

knusperkrone commented 2 years ago

Describe the bug

When adding an HPA with both memory and CPU metrics, the Argo CD application appears OutOfSync.

This is because the metrics list in the live manifest and in the desired manifest is not sorted consistently.

To Reproduce

Applying an HPA like:

metrics:
  {{ if $root.Values.resources.hpa.targetCPUUtilizationPercentage }}
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: {{ $root.Values.resources.hpa.targetCPUUtilizationPercentage }}
  {{ end }}
  {{ if $root.Values.resources.hpa.targetMemoryUtilizationPercentage }}
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: {{ $root.Values.resources.hpa.targetMemoryUtilizationPercentage }}
  {{ end }}

Expected behavior

Argo CD shouldn't mark such applications as OutOfSync.

Screenshots

(screenshot from 2021-12-03 omitted)

Version

Argo CD v2.1.7+a408e29

mikebryant commented 2 years ago

Related upstream issue: https://github.com/kubernetes/kubernetes/issues/74099

fernandesnikhil commented 2 years ago

Ran into the same issue. The workaround mentioned in https://github.com/argoproj/argo-cd/issues/1079#issuecomment-463845851 under item 1 does not work for us: while we have some control over the ordering of cpu and memory in the metrics list, we allow any number of custom metrics to be appended to the metrics list in the HPA spec via Helm values.
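
One option that does not depend on list ordering at all is to tell Argo CD to ignore the ordering-prone path cluster-wide, via the argocd-cm ConfigMap. A sketch only (the jq path is an assumption about where the diff appears; note this also hides genuine metric changes from the diff view, although they are still applied on sync):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Ignore diffs under spec.metrics for every HPA managed by this Argo CD.
  resource.customizations.ignoreDifferences.autoscaling_HorizontalPodAutoscaler: |
    jqPathExpressions:
    - .spec.metrics
```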

dmitry-mightydevops commented 2 years ago

I got the same issue with the karpenter image.

ricardojdsilva87 commented 1 year ago

Hello everyone, I have the exact same problem with different HPAs, for example on the nginx ingress controller: https://github.com/kubernetes/ingress-nginx/issues/10038

It seems that somehow Argo CD is ordering the metrics alphabetically in the generated manifest, and that causes the differences, because the official Helm chart is written in another order: https://github.com/kubernetes/ingress-nginx/blob/main/charts/ingress-nginx/templates/controller-hpa.yaml And on the cluster the HPA manifest is applied as it appears in the Helm chart.

This means that Argo CD is ordering the metrics alphabetically for the comparison when it should follow the Helm chart manifest. I think this should be fixed in the Argo CD code, because we might have hundreds of external Helm charts, each of which can order metrics however it wants.

Thank you!

mikebryant commented 1 year ago

This isn't Argo CD doing it. Argo CD compares the order in the chart to the order the cluster reports, and sees a diff. When it changes the cluster to match the chart, something in the cluster (the HPA controller, the apiserver, etc.) reorders them.

The component changing the ordering is the one that needs to be fixed - see https://github.com/kubernetes/kubernetes/issues/74099

mikebryant commented 1 year ago

To demonstrate, without using ArgoCD at all:

[sbx|default] ➜  ~ cat /tmp/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 50
[sbx|default] ➜  ~ k apply -f /tmp/hpa.yaml
horizontalpodautoscaler.autoscaling/php-apache created
[sbx|default] ➜  ~ k get hpa php-apache -o yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"autoscaling/v2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"php-apache","namespace":"default"},"spec":{"maxReplicas":10,"metrics":[{"resource":{"name":"cpu","target":{"averageUtilization":50,"type":"Utilization"}},"type":"Resource"},{"resource":{"name":"memory","target":{"averageUtilization":50,"type":"Utilization"}},"type":"Resource"}],"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"php-apache"}}}
  creationTimestamp: "2023-06-22T14:36:54Z"
  name: php-apache
  namespace: default
  resourceVersion: "802102959"
  uid: 6412a1ef-b454-468f-ac3e-8bc9f460c67b
spec:
  maxReplicas: 10
  metrics:
  - resource:
      name: memory
      target:
        averageUtilization: 50
        type: Utilization
    type: Resource
  - resource:
      name: cpu
      target:
        averageUtilization: 50
        type: Utilization
    type: Resource
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
status:
  currentMetrics: null
  desiredReplicas: 0

When we read the HPA back after apply, the metrics come back in a different order than the one I asked for.
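
The ordering-only nature of the diff can be checked mechanically: key each entry in spec.metrics by metric type and resource name and compare as an unordered collection. A minimal sketch, not Argo CD's actual diff logic, using the two orderings from the session above:

```python
# Sketch: compare two HPA metrics lists while ignoring their order.
# NOT Argo CD's diff implementation; it only illustrates that the applied
# and reported lists above differ in ordering, not in content.

def metrics_equal_ignoring_order(a, b):
    """Key each entry by (type, resource name) and compare sorted copies."""
    key = lambda m: (m["type"], m.get("resource", {}).get("name", ""))
    return sorted(a, key=key) == sorted(b, key=key)

applied = [  # order as written in /tmp/hpa.yaml: cpu first
    {"type": "Resource", "resource": {"name": "cpu",
        "target": {"type": "Utilization", "averageUtilization": 50}}},
    {"type": "Resource", "resource": {"name": "memory",
        "target": {"type": "Utilization", "averageUtilization": 50}}},
]
reported = list(reversed(applied))  # order the apiserver returned: memory first

print(metrics_equal_ignoring_order(applied, reported))  # True: same content
print(applied == reported)                              # False: naive compare
```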

ricardojdsilva87 commented 1 year ago

Thanks for the explanation @mikebryant. It seems that for now we will have to ignore these fields in Argo CD until Kubernetes fixes this on its side.
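
At the per-Application level this can be done with ignoreDifferences and a jq path expression. A sketch (the application name is hypothetical, and ignoring all of .spec.metrics also suppresses diffs for genuine metric changes):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app        # hypothetical application name
  namespace: argocd
spec:
  # ... project, source, destination omitted ...
  ignoreDifferences:
  - group: autoscaling
    kind: HorizontalPodAutoscaler
    # Ignore ordering (and any other) diffs under spec.metrics.
    jqPathExpressions:
    - .spec.metrics
```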

mikebryant commented 1 year ago

@ricardojdsilva87 What version of Kubernetes are you on? Per https://github.com/kubernetes/kubernetes/issues/74099#issuecomment-1603097979, I think this may be fixed from 1.27.0 onwards