argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

All k8s namespace labels removed by ArgoCD after one numeric label was added #9408

Open orondon opened 2 years ago

orondon commented 2 years ago

Slack Ref. https://cloud-native.slack.com/archives/C01TSERG0KZ/p1651577780669539


Describe the bug

We are using ArgoCD to provision resources in Red Hat OpenShift Container Platform (OCP) 4.6 and 4.8 clusters via Helm charts (Git Helm chart -> ArgoCD -> OCP). One of the resources we provision is the namespace, which carried several string labels. We added one numeric label (an unquoted numeric value), because the Kubernetes documentation for labels says that values may be alphanumeric (https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/). This caused ArgoCD to treat the entire labels field as invalid and remove all pre-existing string labels.

To Reproduce

  1. Create a namespace in k8s via ArgoCD with additional custom string labels. We use this template in the Helm chart:

    {{- if .Values.createNamespace }}
    apiVersion: v1
    kind: Namespace
    metadata:
      {{- with .Values.labels }}
      labels:
        {{- toYaml . | nindent 4 }}
      {{- end }}
      {{- with .Values.annotations }}
      annotations:
        {{- toYaml . | nindent 4 }}
      {{- end }}
      name: {{ .Release.Namespace }}
    {{- end }}

    With these values:

    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: nbd-nrvs-prd
      namespace: argocd
      labels:
        environment: mydev.mycompany.ca
        tenant: nbd
    spec:
      destination:
        namespace: nbd-nrvs-prd
        server: 'https://kubernetes.default.svc'
      project: myteam
      source:
        chart: namespace-base-config
        repoURL: https://artifactory.mycompany.ca/artifactory/helm-charts/
        targetRevision: 1.1.2
        helm:
          values: |
            labels:
              tenant: network-big-data
              project: Network-Resource-Visualization-System
              app: nrvs
              env: prd
      syncPolicy:
        automated:
          prune: true
          selfHeal: true

    So we obtain a live manifest:

    apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        app: nrvs
        app.kubernetes.io/instance: nbd-nrvs-prd
        env: prd
        project: Network-Resource-Visualization-System
        tenant: network-big-data
      name: nbd-nrvs-prd
  2. Add a new unquoted numeric label by updating the values:

    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: nbd-nrvs-prd
      namespace: argocd
      labels:
        environment: mydev.mycompany.ca
        tenant: nbd
    spec:
      destination:
        namespace: nbd-nrvs-prd
        server: 'https://kubernetes.default.svc'
      project: myteam
      source:
        chart: namespace-base-config
        repoURL: https://artifactory.mycompany.ca/artifactory/helm-charts/
        targetRevision: 1.1.2
        helm:
          values: |
            labels:
              tenant: network-big-data
              project: Network-Resource-Visualization-System
              app: nrvs
              env: prd
              owner_pei: 123456
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
  3. Validate the labels in OCP and ArgoCD. This is the live manifest:

    apiVersion: v1
    kind: Namespace
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: >
          {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"nbd-nrvs-prd"},"name":"nbd-nrvs-prd"}}
        openshift.io/sa.scc.mcs: 's0:c29,c14'
        openshift.io/sa.scc.supplemental-groups: 1000840000/10000
        openshift.io/sa.scc.uid-range: 1000840000/10000
      creationTimestamp: '2022-04-27T15:24:57Z'
      labels:
        app.kubernetes.io/instance: nbd-nrvs-prd
        kubernetes.io/metadata.name: nbd-nrvs-prd

    and the desired manifest:

    apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        app.kubernetes.io/instance: nbd-nrvs-prd
      name: nbd-nrvs-prd

ArgoCD shows no diff and no event. The previous labels (project, tenant, app, env) were removed in both ArgoCD and OCP.

If we quote the numeric value so that it is passed as a string, all labels are applied correctly:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: nbd-nrvs-prd
  namespace: argocd
  labels:
    environment: mydev.mycompany.ca
    tenant: nbd
spec:
  destination:
    namespace: nbd-nrvs-prd
    server: 'https://kubernetes.default.svc'
  project: myteam
  source:
    chart: namespace-base-config
    repoURL: https://artifactory.mycompany.ca/artifactory/helm-charts/
    targetRevision: 1.1.2
    helm:
      values: |
        labels:
          tenant: network-big-data
          project: Network-Resource-Visualization-System
          app: nrvs
          env: prd
          owner_pei: "123456"
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
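
One possible explanation for why quoting matters, offered as a sketch and not as confirmed ArgoCD behavior: Kubernetes types label values as strings (`map[string]string` in `ObjectMeta`), while an unquoted `123456` or `true` is parsed from YAML as a number or boolean. The Go snippet below uses only the standard library to show that decoding such a value into a string-typed map fails. The `objectMeta` struct and `decodeLabels` helper are hypothetical names made up for this illustration; they are not ArgoCD code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectMeta is a hypothetical stand-in that mirrors only the labels field
// of Kubernetes ObjectMeta, where label values are typed as strings.
type objectMeta struct {
	Labels map[string]string `json:"labels"`
}

// decodeLabels decodes a JSON metadata fragment into the typed struct and
// returns an error if any label value is not a JSON string.
func decodeLabels(raw string) (map[string]string, error) {
	var meta objectMeta
	if err := json.Unmarshal([]byte(raw), &meta); err != nil {
		return nil, err
	}
	return meta.Labels, nil
}

func main() {
	inputs := []string{
		`{"labels":{"env":"prd","owner_pei":"123456"}}`, // quoted: a string
		`{"labels":{"env":"prd","owner_pei":123456}}`,   // unquoted: a number
		`{"labels":{"env":"prd","patch":true}}`,         // unquoted: a boolean
	}
	for _, in := range inputs {
		labels, err := decodeLabels(in)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", len(labels), "labels")
	}
}
```

If a typed conversion like this fails somewhere in the sync path and the error is swallowed, the result would look like what is reported here: the whole labels map is lost, not just the offending entry.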

Expected behavior

Expected behavior is:

  1. ArgoCD should show an error indicating an invalid label field.
  2. ArgoCD should keep the last valid manifest and not delete the labels that previously existed in the namespace.
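
The first expectation could be approximated outside ArgoCD with a small pre-sync check, sketched below in Go. It assumes the labels have already been parsed into an untyped map (as a YAML or JSON parser would produce); `nonStringLabels` is a hypothetical helper written for this example, not an existing ArgoCD or Kubernetes function.

```go
package main

import (
	"fmt"
	"sort"
)

// nonStringLabels returns, in sorted order, the keys whose values are not
// strings. An empty result means every label value is a string.
func nonStringLabels(labels map[string]interface{}) []string {
	var bad []string
	for key, value := range labels {
		if _, ok := value.(string); !ok {
			bad = append(bad, key)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	// Labels as an untyped parser would see the values from step 2.
	labels := map[string]interface{}{
		"tenant":    "network-big-data",
		"env":       "prd",
		"owner_pei": 123456, // unquoted in the values, so parsed as a number
	}
	if bad := nonStringLabels(labels); len(bad) > 0 {
		fmt.Printf("invalid label values (must be strings): %v\n", bad)
	}
}
```

Running a check like this in CI before the chart is published would surface the problem as an explicit error instead of a silent label removal.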


Version

Reproducible in both versions below:

argocd: v2.0.0+f5119c0
  BuildDate: 2021-04-07T06:00:33Z
  GitCommit: f5119c06686399134b3f296d44445bcdbc778d42
  GitTreeState: clean
  GoVersion: go1.16
  Compiler: gc
  Platform: linux/amd64

and

argocd: v2.3.3+unknown
  BuildDate: 2022-04-14T19:42:50Z
  GitCommit: 
  GitTreeState: clean
  GoVersion: go1.17.5
  Compiler: gc
  Platform: linux/amd64

and on both OCP 4.6 and 4.8.

Logs

ArgoCD did not show any event messages.

robermar23 commented 2 years ago

I just ran into this myself.

Except that in my case I was adding an unquoted boolean, not a number.

apiVersion: v1
kind: Namespace
metadata:
  name: 14west-iris-plus-dev
  labels:
    cost-tenancy: Dedicated
    dept: Development
    org: WMC
    network-share: iris-plus
    newrelic-metadata-injection: enabled
    14west.io/patch-ingress-route: true

Once I quoted the boolean, it recognized all of the labels. Otherwise, it saw none and removed the labels on a sync.

jwitko commented 1 year ago

This issue is still happening in 2.6.3. For me it's a boolean, and quoting does not help.

lchechik-cloudinary commented 1 year ago

We are experiencing the same issue on version 2.6.2. We use numeric values, and quoting does fix the issue.