fluxcd / flux2

Open and extensible continuous delivery solution for Kubernetes. Powered by GitOps Toolkit.
https://fluxcd.io
Apache License 2.0

[flux v0.26.0-2] Kustomization tries to modify immutable fields #2386

Open Legion2 opened 2 years ago

Legion2 commented 2 years ago

Describe the bug

I updated to Flux 0.26.1 and then observed a reconciliation error in a deployment. I deleted the deployment, and now there is a problem with the PVC of that deployment.

PersistentVolumeClaim/monitoring/influxdb-volume dry-run failed, reason: Invalid, error: PersistentVolumeClaim "influxdb-volume" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
  core.PersistentVolumeClaimSpec{
        AccessModes:      {"ReadWriteOnce"},
        Selector:         nil,
        Resources:        {Requests: {s"storage": {i: {...}, s: "10Gi", Format: "BinarySI"}}},
-       VolumeName:       "",
+       VolumeName:       "pvc-c7f9929e-2741-43ce-b690-ed00816092ad",
        StorageClassName: &"aws-gp2-dynamic",
        VolumeMode:       &"Filesystem",
        DataSource:       nil,
  }

I tried to downgrade the kustomization controller, but that did not resolve the issue.

Steps to reproduce

  1. Install flux 0.25.3
  2. create a PVC
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
    name: influxdb-volume
    labels:
    kustomize.toolkit.fluxcd.io/prune: disabled
    spec:
    storageClassName: aws-gp2-dynamic
    accessModes:
    - ReadWriteOnce
    volumeMode: Filesystem
    resources:
    requests:
      storage: 10Gi
  3. Update flux to 0.26.1

Expected behavior

Should work after the update

Screenshots and recordings

No response

OS / Distro

Ubuntu 21

Flux version

flux: v0.26.1

Flux check

► checking prerequisites
✔ Kubernetes 1.20.11-eks-f17b81 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.16.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.20.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.16.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.19.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.21.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.21.1
✔ all checks passed

Git provider

Gitlab

Container Registry provider

Gitlab

Additional context

No response

kingdonb commented 2 years ago

I'm going to try to reproduce this, but first could you please add some more information about how you created the PVC?

Is it just, create a PVC (without volumeName) through Flux 0.25.3, then upgrade to a different version?

(I did not run into this issue in my testing, possibly because I am always creating PVCs with volumeName set)

Legion2 commented 2 years ago

I updated the issue. I use a PVC with dynamic provisioning via a storage class on AWS. As a workaround I removed the PVC from the Kustomization.

kingdonb commented 2 years ago

Can you please post the output of:

kubectl get pvc influxdb-volume -oyaml --show-managed-fields

kingdonb commented 2 years ago

Also, there are some serious formatting errors in the YAML that you posted, and it's missing a namespace. This would not work in Flux. Can you post the YAML exactly as it was committed to your Flux repository?

kingdonb commented 2 years ago

I tried to replicate your scenario here:

https://github.com/kingdonb/fleet-testing/blob/main/apps/keycloak/example-pvc.yaml

Installed Flux 0.25.3, added a PVC and a pod bound to it so the PV would be created... omitted volumeName from my spec, as it appears you did (and as most people would).

Upgraded to Flux 0.26.1

I am not seeing any errors across the Kubernetes versions in my test matrix. Since the issue is resolved on your side and I cannot reproduce it, we'll have to close this unless you can provide more information (or somebody else runs into it).

Thanks again for your report.

Legion2 commented 2 years ago
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
    volume.kubernetes.io/selected-node: ip-192-168-41-199.eu-central-1.compute.internal
    volume.kubernetes.io/storage-resizer: kubernetes.io/aws-ebs
  creationTimestamp: "2020-10-02T16:43:43Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    kustomize.toolkit.fluxcd.io/name: infrastructure
    kustomize.toolkit.fluxcd.io/namespace: flux-system
    kustomize.toolkit.fluxcd.io/prune: disabled
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:pv.kubernetes.io/bind-completed: {}
          f:pv.kubernetes.io/bound-by-controller: {}
          f:volume.beta.kubernetes.io/storage-provisioner: {}
          f:volume.kubernetes.io/selected-node: {}
          f:volume.kubernetes.io/storage-resizer: {}
        f:finalizers:
          .: {}
          v:"kubernetes.io/pvc-protection": {}
        f:labels:
          .: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
          f:kustomize.toolkit.fluxcd.io/prune: {}
      f:spec:
        f:accessModes: {}
        f:resources:
          f:requests:
            .: {}
            f:storage: {}
        f:storageClassName: {}
        f:volumeMode: {}
        f:volumeName: {}
      f:status:
        f:accessModes: {}
        f:capacity:
          .: {}
          f:storage: {}
        f:phase: {}
    manager: kustomize-controller
    operation: Apply
    time: "2021-10-27T11:39:29Z"
  name: influxdb-volume
  namespace: monitoring
  resourceVersion: "456920544"
  uid: c7f9929e-2741-43ce-b690-ed00816092ad
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: aws-gp2-dynamic
  volumeMode: Filesystem
  volumeName: pvc-c7f9929e-2741-43ce-b690-ed00816092ad
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  phase: Bound

I have a kustomization.yaml file which defines the namespace for all resources
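Roughly like this (a minimal sketch; the resource file name is illustrative, the namespace matches the PVC above):

# kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: monitoring
resources:
  - influxdb-pvc.yaml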

kingdonb commented 2 years ago
        f:volumeName: {}

This suggests that the Flux YAML in your git repo contains the volumeName field in its spec. Is that the case?

Legion2 commented 2 years ago

no

Legion2 commented 2 years ago

maybe the kustomization adds this field somehow?

kingdonb commented 2 years ago

It is possible the Flux manager took the field over when it shouldn't have, I'm not sure.

In my cluster, I have this:

- apiVersion: v1
  fieldsType: FieldsV1
  fieldsV1:
    f:metadata:
      f:annotations:
        f:pv.kubernetes.io/bind-completed: {}
        f:pv.kubernetes.io/bound-by-controller: {}
        f:volume.beta.kubernetes.io/storage-provisioner: {}
        f:volume.kubernetes.io/storage-provisioner: {}
    f:spec:
      f:volumeName: {}
  manager: kube-controller-manager
  operation: Update
  time: "2022-02-04T15:35:52Z"

If Flux finds fields managed by kube-controller-manager, it does not take ownership of them, as I understand it.

My cluster is kind + local-path-provisioner, I can try a different storage class provider as this might not be representative.

(The best bet is probably for me to try replicating this on EKS next, since that's your environment...)

Legion2 commented 2 years ago

I have Kubernetes 1.20 and the volume is from 2020; back then it did not use server-side apply.

kingdonb commented 2 years ago

I have a test suite that runs on Flux 0.17.2 and upgrades it to the current version; I can use that to try to replicate the issue. If your volumes are that old, the behavior might be different. We resolved a number of issues like that to bring out 0.26.0, issues you only see if you started with Flux before server-side apply and upgraded through 0.18.x-0.25.x.

Like this issue:

I might actually need to start with a Kubernetes version from before server-side apply to reliably reproduce all of these kinds of issues. If kube-controller-manager uses server-side apply, or if its ownership gets captured in managedFields however that happens nowadays, it won't matter what version of Flux initially created the PVC in the cluster; I won't see the issue reproduced...

kingdonb commented 2 years ago

I confirmed that Kubernetes 1.20 with Flux 0.17.2 together with local-path-provisioner produces a PVC that looks like this:

kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rancher.io/local-path
  creationTimestamp: "2022-02-04T17:52:28Z"
  finalizers:
  - kubernetes.io/pv-protection
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kube-controller-manager
    operation: Update
    time: "2022-02-04T17:52:28Z"

In other words, K8s 1.20.x is not old enough to satisfy the required test conditions. I'm looking into trying to create a K8s cluster @1.15.x, install Flux 0.17.x on it, ... but that will not work, since Flux hasn't supported K8s versions < 1.16.0 since Flux2 v0.0.7

This is an extreme case :)

I don't think we should balk at it though... I think we'll need to upgrade a cluster from K8s 1.15.x, to know for sure what happens when the cluster has resources that were initially created before server side apply had even reached beta.

SSA was marked beta in 1.16.x. It might be easier and just as effective to test against a 1.16.x cluster and ensure the beta SSA feature is turned off before the volume is created. (That way, I don't also need to start with Flux v1...)
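With kind, that could look something like this (a sketch; an older kind release is needed for a 1.16.x node image, and the image tag here is illustrative):

# kind-ssa-off.yaml: start a 1.16.x cluster with the beta ServerSideApply gate disabled
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  ServerSideApply: false
nodes:
  - role: control-plane
    image: kindest/node:v1.16.15

kind create cluster --config kind-ssa-off.yaml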

I've spent too much time on this today, but I think you have likely got something here. We may need to give some advisory if your cluster was in production before a certain threshold date. Hopefully there's a way we can reconcile this more broadly.

Legion2 commented 2 years ago
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rancher.io/local-path
  creationTimestamp: "2022-02-04T17:52:28Z"
  finalizers:
  - kubernetes.io/pv-protection
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kube-controller-manager
    operation: Update
    time: "2022-02-04T17:52:28Z"

@kingdonb this is the pv not the pvc

kingdonb commented 2 years ago

Whoops, you're right... I already tore down the test scaffold, I'll have to repeat the test again later. Sorry for the noise.

somtochiama commented 2 years ago

@Legion2 is it possible for you to paste the managedFields of the PVC from before you updated Flux?

Legion2 commented 2 years ago

No, I don't have this data.

stefanprodan commented 2 years ago

Removing f:volumeName from the managedFields with a kubectl patch should unblock you. But we need to figure out how kustomize-controller ended up owning that field. Was this PVC created with Flux v1 or with kubectl?

Legion2 commented 2 years ago

We used Flux from the beginning and used it for everything in the cluster, so I think it was created with Flux v1. However, we have since migrated things multiple times and had to fix things manually, so I'm not 100% sure kubectl wasn't involved here.

Legion2 commented 2 years ago

@stefanprodan I'm not familiar with the kubectl patch command, could you give an example on how to remove a managed field?

stefanprodan commented 2 years ago

Create a YAML with only the metadata and managed fields, remove the volumeName entry, then apply it with kubectl patch; docs here: https://kubernetes.io/docs/tasks/manage-kubernetes-objects/update-api-object-kubectl-patch/
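Alternatively, a JSON patch can remove just the one field. For this PVC it would look roughly like this (a sketch, assuming the kustomize-controller entry is the first one in managedFields, i.e. index 0):

kubectl -n monitoring patch pvc influxdb-volume --type=json \
  -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'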

monotek commented 2 years ago

We had the same problem with a lot of Kustomizations and immutable clusterIP, volumeName and storageClassName fields, updating from kustomize-controller 0.14.1 to 0.20.2.

flux-system   backstage                   False   Service/backstage/patroni-metrics dry-run failed, reason: Invalid, error: Service "patroni-metrics" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]...   

Removing the managed fields via patch helps though.

kubectl -n namespace patch svc service-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'
kubectl -n namespace patch pvc pvc-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'
kubectl -n namespace patch storageclass storageclass-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:storageClassName"}]'
ludovicm67 commented 2 years ago

I was facing the same issue.

Using the following commands was helpful for me:

kubectl -n NAMESPACE patch pvc PVC_NAME --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:storageClassName"}]'
kubectl -n NAMESPACE patch pvc PVC_NAME --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'
SStorm commented 2 years ago

I've got the same problem migrating from Flux v1 to Flux v2 (0.26.1) with ingress-nginx.

Flux complains with:

❯ flux reconcile kustomization infra --with-source
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision aks1-westeurope/4bcf2a7eb50d2f869d217f850fa59955754d6375
► annotating Kustomization infra in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
✗ Kustomization reconciliation failed: Service/ingress-nginx/ingress-nginx dry-run failed, reason: Invalid, error: Service "ingress-nginx" is invalid: spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset

Here's what's in k8s:

apiVersion: v1
kind: Service
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: [snip].
  creationTimestamp: "2020-10-22T16:20:39Z"
  finalizers:
  - service.kubernetes.io/load-balancer-cleanup
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    kustomize.toolkit.fluxcd.io/name: infra
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: ingress-nginx
  namespace: ingress-nginx
  resourceVersion: "192626589"
  uid: a79bc74c-aa57-4223-9f80-0b25249f13b9
spec:
  clusterIP: 10.113.45.120
  clusterIPs:
  - 10.113.45.120
  externalTrafficPolicy: Local
  healthCheckNodePort: 32210
  ports:
  - name: http
    nodePort: 32695
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 30120
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: [snip]

And here's what we always had in git:

kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ingress.aks1.westeurope.azure.cratedb.net.
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx

Banging my head on the table with this. Any help much appreciated.

monotek commented 2 years ago

If you want to see the actual problematic fields you have to add "--show-managed-fields" to your kubectl command.

See my post above for a workaround: https://github.com/fluxcd/flux2/issues/2386#issuecomment-1040102499

kubectl -n ingress-nginx patch svc ingress-nginx --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'
SStorm commented 2 years ago

Thank you! I did not know that managed fields are no longer shown by default, so I assumed it was a different issue. Problem solved.

kingdonb commented 2 years ago

If you have not tried upgrading to Flux ~v0.26.2~ v0.26.3 or greater, it has breaking changes around this area:

The issue it fixes is here:

This could easily be the same issue, and I think it most likely is. I see reports mentioning version v0.26.1 but I do not see anyone who has mentioned this issue on a version >= ~v0.26.2~ v0.26.3, which included the fix above (and another important one, @stefanprodan mentions below.)

Flux now takes over all the managed fields, so if you have edits that you expect to remain in the cluster but that are not in git, they will need their own field manager to avoid being overwritten by Flux. So I want to be careful about advising this upgrade, though it is important and it makes Flux work more as advertised (I do not want to caution anyone away from it).

https://fluxcd.io/docs/faq/#why-are-kubectl-edits-rolled-back-by-flux

If we can confirm this issue is still present in later versions of Flux, I will be glad to investigate. The kubectl patch described above should no longer be necessary after the upgrade. If anyone is still struggling with this, please let us know. 🙏

stefanprodan commented 2 years ago

I think those that upgraded to v0.26.0 or v0.26.1 or v0.26.2 will have this issue. In v0.26.3 we found a better way to take over the fields managed by kubectl. We'll need to point people that upgraded to v0.26.0-2 to this issue, as patching the managedFields to remove the immutable ones is the only way to fix this.

alex-berger commented 2 years ago

Same problem here with Flux v0.24.1 and Kubernetes v1.21.5. I have a PersistentVolumeClaim which was created from a HelmRelease by Flux v0.24.1 on that very cluster a few hours earlier and which looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-loki-compactor
  namespace: loki-system
  uid: 87edda4f-3878-4c14-becc-4a5d5eb398ae
  resourceVersion: "122520272"
  creationTimestamp: "2022-02-25T11:38:06Z"
  labels:
    app.kubernetes.io/component: compactor
    app.kubernetes.io/instance: loki-system-loki-distributed
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: loki-distributed
    app.kubernetes.io/version: 2.4.2
    helm.sh/chart: loki-distributed-0.44.0
    helm.toolkit.fluxcd.io/name: loki-distributed
    helm.toolkit.fluxcd.io/namespace: kube-system
  annotations:
    meta.helm.sh/release-name: loki-system-loki-distributed
    meta.helm.sh/release-namespace: loki-system
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
    volume.kubernetes.io/selected-node: ip-10-176-39-133.eu-central-1.compute.internal
  finalizers:
    - kubernetes.io/pvc-protection
  managedFields:
    - manager: helm-controller
      operation: Update
      apiVersion: v1
      time: "2022-02-25T11:38:06Z"
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/component: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:app.kubernetes.io/version: {}
            f:helm.sh/chart: {}
            f:helm.toolkit.fluxcd.io/name: {}
            f:helm.toolkit.fluxcd.io/namespace: {}
        f:spec:
          f:accessModes: {}
          f:resources:
            f:requests:
              .: {}
              f:storage: {}
          f:storageClassName: {}
          f:volumeMode: {}
    - manager: kube-scheduler
      operation: Update
      apiVersion: v1
      time: "2022-02-25T11:38:07Z"
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            f:volume.kubernetes.io/selected-node: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: "2022-02-25T11:38:10Z"
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            f:pv.kubernetes.io/bind-completed: {}
            f:pv.kubernetes.io/bound-by-controller: {}
            f:volume.beta.kubernetes.io/storage-provisioner: {}
        f:spec:
          f:volumeName: {}
        f:status:
          f:accessModes: {}
          f:capacity:
            .: {}
            f:storage: {}
          f:phase: {}
  selfLink: /api/v1/namespaces/loki-system/persistentvolumeclaims/data-loki-compactor
status:
  phase: Bound
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: pvc-87edda4f-3878-4c14-becc-4a5d5eb398ae
  storageClassName: gp
  volumeMode: Filesystem

And now I get the following error upon reconcile of the corresponding HelmRelease:

❯ flux reconcile hr -n kube-system loki-distributed                                                                                                                     
► annotating HelmRelease loki-distributed in kube-system namespace
✔ HelmRelease annotated
◎ waiting for HelmRelease reconciliation
✗ HelmRelease reconciliation failed: Helm upgrade failed: failed to replace object: PersistentVolumeClaim "data-loki-compactor" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
  core.PersistentVolumeClaimSpec{
        AccessModes:      {"ReadWriteOnce"},
        Selector:         nil,
        Resources:        {Requests: {s"storage": {i: {...}, s: "10Gi", Format: "BinarySI"}}},
-       VolumeName:       "",
+       VolumeName:       "pvc-87edda4f-3878-4c14-becc-4a5d5eb398ae",
        StorageClassName: &"gp",
        VolumeMode:       &"Filesystem",
        DataSource:       nil,
  }

Here is the link to the offending template; it does not set volumeName (which is correct).

mhsh64 commented 2 years ago

@stefanprodan

Hi

We have the same issue and we are using the latest version.

We are using v0.27.2.

K8s version 1.21.1

Can you advise please?

`flux-system xxx False Service/xxx/xxx-manager-metrics-service dry-run failed, reason: Invalid, error: Service "xxx-manager-metrics-service" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]... 126d

flux-system yyy False Service/yyy/yyy dry-run failed, reason: Invalid, error: Service "yyy" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]... 126d

flux-system zzz False Service/zzz/zzz-controller-metrics dry-run failed, reason: Invalid, error: Service "zzz-controller-metrics" is invalid: spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset...

flux-system www False Service/www/www-webhook-service dry-run failed, reason: Invalid, error: Service "www-webhook-service" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]...`

kingdonb commented 2 years ago

This issue appears to be present when clusters have been upgraded from a Kubernetes version before v1.18 or thereabouts, when Server-Side Apply and managedFields became GA. I saw an almost identical report yesterday, but I can't find it just now...

If your services were created before K8s v1.18, the cluster added clusterIP to the spec when the service was created, and the fields look as though they were populated entirely by kubectl. (Flux sees this as drift to be corrected.)

If your services were created after K8s v1.18, the cluster adds clusterIP through server-side apply. This means the field is marked as owned by kube-controller-manager or something like that, which Flux recognizes as drift that is permitted to remain in the Service, so Flux will not try to overwrite it.

If the field is not present in Git and it's not marked as owned by the well-known manager, then Flux has no way to detect it as "not a drift that should be reverted." Maybe we should add another special case here for Service.spec.clusterIP? This is likely to be the most common case of this issue, but there can most certainly be others. It is unusual but definitely not quite "uncommon" for a controller to update the spec. (Other native Kubernetes resources do it too, like PersistentVolumes.)

If it is safe for you to delete and recreate the offending services, then you should be able to delete and recreate them through Flux the old-fashioned way, or by adding spec.force: true to the Kustomization. I think that will allow the immutable field to be overwritten, as Flux can delete and recreate the service. Be careful doing this, as it will interrupt the service temporarily.
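For reference, that would look roughly like this (a sketch; name, path and interval are illustrative):

apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: infra
  namespace: flux-system
spec:
  interval: 10m
  path: ./infrastructure
  prune: true
  force: true  # allows Flux to delete and recreate objects when immutable fields change
  sourceRef:
    kind: GitRepository
    name: flux-system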

~I'm not aware of any better way to solve this for now~ (Edit: the better way is to patch managedFields, https://github.com/fluxcd/flux2/issues/2386#issuecomment-1055123204); otherwise you have to delete and recreate those services. @mhsh64 has your cluster been upgraded at some point from a K8s version earlier than v1.18.x? If so, I think you definitely have the same issue that I'm describing here.

mhsh64 commented 2 years ago

@kingdonb thanks for details

I don't recall when we had 1.18, but it is a long time back. And you are right, these services/resources exist on the clusters from that version and we just upgraded them.

Our clusters have been running version 1.21 for a while, and I am wondering if there is no other fix in Flux itself to cover this scenario?

mhsh64 commented 2 years ago

But I am also wondering why there was no issue with our previous Flux version?

Shouldn't we upgrade the flux version because of this issue?

stefanprodan commented 2 years ago

I'm not aware of any better way to solve this for now, you have to delete and recreate those services.

To avoid deleting the Service, a patch can be used to remove the ClusterIP from the managed fields:

kubectl patch svc my-svc --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'

Note that the index must match the kustomize-controller manager entry; the example above is for index 0.
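To see which index belongs to kustomize-controller, something like this works (a sketch; the managers are printed in the same order as the managedFields array, so the first line is index 0):

kubectl get svc my-svc -o jsonpath='{range .metadata.managedFields[*]}{.manager}{"\n"}{end}'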

tshak commented 2 years ago

I believe that I've reproduced this issue when migrating from Flux v1 to Flux v2. A Knative Serving resource manages all annotations. Its validating webhook prevents modification of these annotations. The YAML in git does not contain any annotations. Everything applies fine with Flux v1. When applying with Flux v2 I get the following error:

BadRequest, error: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator

Note that metadata.annotations is a managed field owned by Knative Serving. Is it fair to assume that v0.27.3 will fix this issue without a workaround? Please let me know if you need full debug info.

tshak commented 2 years ago

I'm still reproducing my issue with v0.27.3. I misread the previous comment: the fixes were in 0.26.3, not 0.27.3, so this is not surprising. It's not clear to me why the patch would mutate existing annotations, as 1) our resource in git contains no annotations and 2) the annotations are managed by another controller as specified in managedFields. This is a Flux v1 -> Flux v2 migration blocker for us.

tshak commented 2 years ago

I've tested using flux build kustomization... | kubectl apply -f - and I can apply the manifest without any issues. This leads me to believe that the issue is with how the kustomize controller is applying the manifests.

kingdonb commented 2 years ago

If knative updates annotations in resources that Flux applies, then what do you see in managedFields before Flux overwrites it?

kubectl get [resource that errored] -oyaml --show-managed-fields

@tshak the problem is likely that Flux doesn't allow arbitrary writes from arbitrary writers to persist; that is considered drift and reverted. This is new behavior: https://fluxcd.io/docs/faq/#why-are-kubectl-edits-rolled-back-by-flux

Changes to the way Flux applies resources were required after server-side apply was implemented, because of some issues with Kubernetes. Many users reported that Flux was allowing drift to persist in their clusters even though they had attempted to overwrite it in git, or had already deleted the relevant section of config, yet it still persisted in the cluster.

Flux has adjusted the approach to honor the expectations of GitOps users everywhere that expect configuration to reflect what is in git, and not some drift introduced from an unknown source. But some resources are not consistently putting their (intentional, expected, according-to-specifications) drift in places where it can work with GitOps that behaves this way.

Instead of using kubectl apply -f, use kubectl apply --server-side -f and see what happens. That's more similar to what Flux is doing now than kubectl apply -f, which is a client-side apply. Also, please provide the content of the managed fields; it will help us better understand what Knative is doing and whether there is any workaround we can build into Flux to resolve this now.
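For example, roughly (a sketch based on the commands already used in this thread; the kustomization name and path are illustrative):

flux build kustomization apps-kubernetes-state --path ./apps | kubectl apply --server-side -f -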

tshak commented 2 years ago

Thank you for the detailed explanation. Here is an example Knative Service that is failing to apply:

Git:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: emojivoto
  namespace: demo
spec:
  template:
    spec:
      containers:
      - name: emojivoto
        image: buoyantio/emojivoto-web:v6
        env:
        - name: EMOJISVC_HOST
          value: emojivoto-emoji-svc.demo.svc.cluster.local:80
        - name: VOTINGSVC_HOST
          value: emojivoto-voting-svc.demo.svc.cluster.local:80
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        - name: WEB_PORT
          value: "80"
        ports:
        - containerPort: 80
          protocol: TCP

Server (status is redacted for brevity):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  annotations:
    serving.knative.dev/creator: system:serviceaccount:flux:flux
    serving.knative.dev/lastModifier: system:serviceaccount:flux:flux
  creationTimestamp: "2021-12-09T12:21:32Z"
  generation: 5
  labels:
    kustomize.toolkit.fluxcd.io/name: apps-kubernetes-state
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  managedFields:
  - apiVersion: serving.knative.dev/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations: {}
        f:labels:
          .: {}
          f:kustomize.toolkit.fluxcd.io/name: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
      f:spec:
        .: {}
        f:template:
          .: {}
          f:spec:
            .: {}
            f:containers: {}
    manager: kustomize-controller
    operation: Apply
    time: "2021-12-09T15:49:33Z"
  - apiVersion: serving.knative.dev/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:address:
          .: {}
          f:url: {}
        f:conditions: {}
        f:latestCreatedRevisionName: {}
        f:latestReadyRevisionName: {}
        f:observedGeneration: {}
        f:traffic: {}
        f:url: {}
    manager: controller
    operation: Update
    time: "2021-12-09T15:40:00Z"
  name: emojivoto
  namespace: demo
  resourceVersion: "533389058"
  uid: d367349c-aa69-4e82-96f8-f88804108c27
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containerConcurrency: 0
      containers:
      - env:
        - name: EMOJISVC_HOST
          value: emojivoto-emoji-svc.demo.svc.cluster.local:80
        - name: VOTINGSVC_HOST
          value: emojivoto-voting-svc.demo.svc.cluster.local:80
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        - name: WEB_PORT
          value: "80"
        image: buoyantio/emojivoto-web:v6
        name: emojivoto
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          successThreshold: 1
          tcpSocket:
            port: 0
        resources:
          limits:
            cpu: "1"
            memory: 2G
          requests:
            cpu: 50m
            memory: 50M
      enableServiceLinks: false
      timeoutSeconds: 30
  traffic:
  - latestRevision: true
    percent: 100
tshak commented 2 years ago

I was able to successfully perform a server-side apply, but it required --force-conflicts due to the managed field .spec.template.spec.containers. This is expected, as Knative manages this field (although I'm not sure why). When I set force: true for the Kustomization resource I still see the annotation error:

Service/demo/emojivoto dry-run failed, reason: BadRequest, error: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator

Interestingly, a flux diff... using the exact same arguments fails with this error as well.

stefanprodan commented 2 years ago

@tshak try to remove the annotations with kubectl -n demo patch service emojivoto --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:metadata/f:annotations"}]'

tshak commented 2 years ago

After patching the resource I can still repro the issue. I don't believe that the root cause is due to managed fields or server-side apply behaviour. Here is a recap of what I've tried (all apply commands are using --server-side):

1. kubectl apply... the standalone YAML from the git repo that contains only the offending resource (emojivoto Knative Service) ✅
2. flux build... | kubectl apply... the Kustomization which contains the offending resource ✅
3. flux diff... the Kustomization which contains the offending resource ❌
4. Kustomization applied by the kustomize-controller ❌

In all ❌ cases, the error is the same annotation value is immutable error from the knative admission webhook.

stefanprodan commented 2 years ago

Flux performs a server-side apply, and since it manages f:annotations it tries to remove them because they are not in Git. After you've run the kubectl patch, can you confirm that the annotations are no longer managed by kustomize-controller?
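One way to check is to dump the managed fields of the Knative Service again (a sketch):

kubectl -n demo get services.serving.knative.dev emojivoto -oyaml --show-managed-fields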

tshak commented 2 years ago

I have confirmed that f:annotations: {} is no longer present in the offending resource. However, the error persists.

stefanprodan commented 2 years ago

So if you do kubectl apply --server-side, does it fail the same way?

tshak commented 2 years ago

In this case it fails but as expected:

error: Apply failed with 1 conflict: conflict with "kustomize-controller": .spec.template.spec.containers

It passes by adding --force-conflicts.

stefanprodan commented 2 years ago

I see "serving.knative.dev/creator: system:serviceaccount:flux:flux" in the annotations, but there is no such service account in the cluster. Have you changed the SA?

tshak commented 2 years ago

This resource was created with FluxV1. I'm testing a FluxV1->FluxV2 migration.

stefanprodan commented 2 years ago

@tshak there is nothing we can do in Flux about this; if Knative decided to make those annotations immutable, then you can't reconcile this resource with anything but the original flux SA. I have no clue why Knative would do such a crazy thing...

tshak commented 2 years ago

I think that it's just these specific knative annotations that are immutable. And again, even a flux build... | kubectl apply --server-side... works.