argoproj / argo-cd

Declarative Continuous Deployment for Kubernetes
https://argo-cd.readthedocs.io
Apache License 2.0

server side apply broke pv #16058

Open rwong2888 opened 1 year ago

rwong2888 commented 1 year ago

version: Argo CD v2.8.4+c279299

resourceVersion and uid should be ignored in the claimRef.

Incidentally, I also had to add the lines below to the manifest, whereas before they did not matter:

apiVersion: v1
kind: PersistentVolumeClaim
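Presumably these lines refer to the claimRef block in the PV spec; a minimal sketch of where they land, using the names from the live manifest later in this thread:

spec:
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: staging-rsc-nfs-pvc
    namespace: rsc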
rwong2888 commented 1 year ago

Workaround in argocd-cm.yaml:

  resource.customizations.ignoreDifferences.PersistentVolume: |
    jqPathExpressions:
    - .spec.claimRef.resourceVersion
    - .spec.claimRef.uid
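For reference, the same fields can also be ignored for a single Application via spec.ignoreDifferences rather than globally; a sketch of just the relevant fragment (the rest of the Application spec is omitted):

apiVersion: argoproj.io/v1alpha1
kind: Application
spec:
  ignoreDifferences:
    - kind: PersistentVolume
      jqPathExpressions:
        - .spec.claimRef.resourceVersion
        - .spec.claimRef.uid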
leoluz commented 1 year ago

@rwong2888 Can you please confirm if this was working in 2.7.x?

rwong2888 commented 1 year ago

@leoluz, we did not have server-side apply in 2.7.x, but I did revert to 2.7.9 and the issue is present there. We also upgraded to 2.9.0-rc2 and it is present there as well.

leoluz commented 1 year ago

@rwong2888 Thank you for confirming. Can you please provide the full live yaml of your PV with the managedFields included? Also, which Kubernetes version are you running?

rwong2888 commented 1 year ago

@leoluz, see below for the live manifest. The issue was with these fields in the claimRef:

  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    resourceVersion: '1189329445'
    uid: 99d7b94e-6f11-4aff-9655-806066999cca

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    argocd.argoproj.io/tracking-id: 'rsc.staging.us-east4:/PersistentVolume:rsc/staging-rsc-nfs-pv'
  creationTimestamp: '2023-10-23T03:02:18Z'
  finalizers:
    - kubernetes.io/pv-protection
  labels:
    app.kubernetes.io/instance: staging
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: rsc
    argocd/application: rsc.staging.us-east4
    cluster: miscellaneous
    environment: staging
    helm.sh/chart: rsc-0.0.1
    metaLabel: rsc-staging
    project: redacted-development
    provider: gcp
    region: us-east4
    team: redacted
    version: staging-v1
  name: staging-rsc-nfs-pv
  resourceVersion: '1189329450'
  uid: 64b351cf-ea5f-4b90-a262-0ede681eb825
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 99Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: staging-rsc-nfs-pvc
    namespace: rsc
    resourceVersion: '1189329445'
    uid: 99d7b94e-6f11-4aff-9655-806066999cca
  nfs:
    path: /exports/files
    server: dev-rsc-nfs-server.rsc.svc.cluster.local
  persistentVolumeReclaimPolicy: Delete
  storageClassName: nfs-storage
  volumeMode: Filesystem
status:
  phase: Bound
leoluz commented 1 year ago

@rwong2888 I'd like to inspect the managedFields in your live resource. To retrieve them, please run the following command:

kubectl get pv staging-rsc-nfs-pv -oyaml --show-managed-fields

If you are retrieving from Argo CD, there is a checkbox in the UI to include the managedFields.

Please provide the Kubernetes version as well.

rwong2888 commented 1 year ago

We've been doing k8s upgrades from 1.24. We are now on 1.27.4-gke.900.

apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    argocd.argoproj.io/tracking-id: rsc.staging.us-east4:/PersistentVolume:rsc/staging-rsc-nfs-pv
  creationTimestamp: "2023-10-23T03:02:18Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    app.kubernetes.io/instance: staging
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: rsc
    argocd/application: rsc.staging.us-east4
    cluster: miscellaneous
    environment: staging
    helm.sh/chart: rsc-0.0.1
    metaLabel: rsc-staging
    project: redacted-development
    provider: gcp
    region: us-east4
    team: redacted
    version: staging-v1
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:argocd.argoproj.io/tracking-id: {}
        f:labels:
          f:app.kubernetes.io/instance: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:argocd/application: {}
          f:cluster: {}
          f:environment: {}
          f:helm.sh/chart: {}
          f:metaLabel: {}
          f:project: {}
          f:provider: {}
          f:region: {}
          f:team: {}
          f:version: {}
      f:spec:
        f:accessModes: {}
        f:capacity:
          f:storage: {}
        f:claimRef:
          f:apiVersion: {}
          f:kind: {}
          f:name: {}
          f:namespace: {}
        f:nfs:
          f:path: {}
          f:readOnly: {}
          f:server: {}
        f:persistentVolumeReclaimPolicy: {}
        f:storageClassName: {}
    manager: argocd-controller
    operation: Apply
    time: "2023-10-23T03:02:18Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:claimRef:
          f:resourceVersion: {}
          f:uid: {}
    manager: kube-controller-manager
    operation: Update
    time: "2023-10-23T03:02:18Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kube-controller-manager
    operation: Update
    subresource: status
    time: "2023-10-23T03:02:18Z"
  name: staging-rsc-nfs-pv
  resourceVersion: "1189329450"
  uid: 64b351cf-ea5f-4b90-a262-0ede681eb825
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 99Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: staging-rsc-nfs-pvc
    namespace: rsc
    resourceVersion: "1189329445"
    uid: 99d7b94e-6f11-4aff-9655-806066999cca
  nfs:
    path: /exports/files
    server: dev-rsc-nfs-server.rsc.svc.cluster.local
  persistentVolumeReclaimPolicy: Delete
  storageClassName: nfs-storage
  volumeMode: Filesystem
status:
  phase: Bound
andrii-korotkov-verkada commented 1 week ago

Argo CD versions 2.10 and below have reached EOL. Can you upgrade and let us know if the issue is still present, please?

rwong2888 commented 1 day ago

@andrii-korotkov-verkada, it looks like a different issue now, with HTTPRoutes.

[screenshot of the HTTPRoute diff]

andrii-korotkov-verkada commented 1 day ago

These changes are for the same manifests, right? Which Argo CD version is it? We may have a bug in the diffing logic.

rwong2888 commented 1 day ago

It's a different manifest now, @andrii-korotkov-verkada. I'm on v2.13.0+347f221.

Below are the desired HTTPRoute manifests.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/ttl: '60'
  name: staging-rsc-gateway-mesh-httproute
  namespace: rsc
spec:
  hostnames:
    - staging.rsc.redacted.com
  parentRefs:
    - name: gateway-external-01
      namespace: istio-ingress
    - group: ''
      kind: Service
      name: staging
  rules:
    - backendRefs:
        - name: staging
          port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/ttl: '60'
  name: staging-rsc-service-entry-httproute
  namespace: rsc
spec:
  hostnames:
    - staging.rsc.redacted.com
  parentRefs:
    - group: networking.istio.io
      kind: ServiceEntry
      name: staging-rsc-serviceentry
  rules:
    - backendRefs:
        - group: networking.istio.io
          kind: Hostname
          name: staging.rsc.redacted.com
          port: 443
andrii-korotkov-verkada commented 1 day ago

Hm, could it be because the rule's matches field has some default value? I don't see it in the manifests, though. How does it end up in the live manifest?
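If defaulting is indeed the cause: the Gateway API spec defaults each rule's matches to a PathPrefix match on / (and each backendRef's weight to 1), so the live rules would plausibly look like the following even though the desired manifest omits those fields (a guess at the live output, not a capture of it):

rules:
  - matches:
      - path:
          type: PathPrefix
          value: /
    backendRefs:
      - name: staging
        port: 80
        weight: 1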

rwong2888 commented 1 day ago

Yeah, I think it has some defaults, and the live version looks like the red portion in the diff mentioned in this comment. I just disabled SSA so it won't sync-loop.

https://github.com/argoproj/argo-cd/issues/16058#issuecomment-2494258251
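
For reference, SSA can also be disabled for a single resource rather than the whole Application via the per-resource sync-options annotation (a sketch; whether that scope fits here is an assumption):

metadata:
  annotations:
    argocd.argoproj.io/sync-options: ServerSideApply=false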