Legion2 opened 2 years ago
I'm going to try to reproduce this, but first could you please add some more information about how you created the PVC?
Is it just, create a PVC (without volumeName) through Flux 0.25.3, then upgrade to a different version?
(I did not run into this issue in my testing, possibly because I always create PVCs with volumeName set)
I updated the issue. I use a PVC with dynamic provisioning via an AWS storage class. As a workaround I removed the PVC from the Kustomization.
Can you please post the output of:
kubectl get pvc influxdb-volume -oyaml --show-managed-fields
Also, there are some serious formatting errors in the YAML that you posted, and it's missing a namespace; it would not work in Flux as-is. Can you post the YAML exactly as it was committed to the repo that Flux applies?
I tried to replicate your scenario here:
https://github.com/kingdonb/fleet-testing/blob/main/apps/keycloak/example-pvc.yaml
Installed Flux 0.25.3, added a PVC and a pod bound to it so the PV would be created, and omitted volumeName from my spec, as it appears you have (and as most people would do)
Upgraded to Flux 0.26.1
I am not seeing any errors across the Kubernetes versions in my test matrix. If the issue is resolved for you and I cannot reproduce it, we'll have to close this, unless you can provide more information (or somebody else reports the same issue).
Thanks again for your report.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
    volume.kubernetes.io/selected-node: ip-192-168-41-199.eu-central-1.compute.internal
    volume.kubernetes.io/storage-resizer: kubernetes.io/aws-ebs
  creationTimestamp: "2020-10-02T16:43:43Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    kustomize.toolkit.fluxcd.io/name: infrastructure
    kustomize.toolkit.fluxcd.io/namespace: flux-system
    kustomize.toolkit.fluxcd.io/prune: disabled
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:pv.kubernetes.io/bind-completed: {}
          f:pv.kubernetes.io/bound-by-controller: {}
          f:volume.beta.kubernetes.io/storage-provisioner: {}
          f:volume.kubernetes.io/selected-node: {}
          f:volume.kubernetes.io/storage-resizer: {}
        f:finalizers:
          .: {}
          v:"kubernetes.io/pvc-protection": {}
        f:labels:
          .: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
          f:kustomize.toolkit.fluxcd.io/prune: {}
      f:spec:
        f:accessModes: {}
        f:resources:
          f:requests:
            .: {}
            f:storage: {}
        f:storageClassName: {}
        f:volumeMode: {}
        f:volumeName: {}
      f:status:
        f:accessModes: {}
        f:capacity:
          .: {}
          f:storage: {}
        f:phase: {}
    manager: kustomize-controller
    operation: Apply
    time: "2021-10-27T11:39:29Z"
  name: influxdb-volume
  namespace: monitoring
  resourceVersion: "456920544"
  uid: c7f9929e-2741-43ce-b690-ed00816092ad
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: aws-gp2-dynamic
  volumeMode: Filesystem
  volumeName: pvc-c7f9929e-2741-43ce-b690-ed00816092ad
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  phase: Bound
I have a kustomization.yaml file which defines the namespace for all resources
f:volumeName: {}
This suggests that the Flux YAML in your git repo contains the volumeName field in its spec. Is that the case?
No. Maybe the kustomization adds this field somehow?
It is possible the Flux manager took the field over when it shouldn't have, I'm not sure.
In my cluster, I have this:
- apiVersion: v1
  fieldsType: FieldsV1
  fieldsV1:
    f:metadata:
      f:annotations:
        f:pv.kubernetes.io/bind-completed: {}
        f:pv.kubernetes.io/bound-by-controller: {}
        f:volume.beta.kubernetes.io/storage-provisioner: {}
        f:volume.kubernetes.io/storage-provisioner: {}
    f:spec:
      f:volumeName: {}
  manager: kube-controller-manager
  operation: Update
  time: "2022-02-04T15:35:52Z"
If Flux finds fields managed by kube-controller-manager, it does not manage or take them over, as I understand it.
My cluster is kind + local-path-provisioner, I can try a different storage class provider as this might not be representative.
(The best bet is probably for me to try replicating this on EKS next, since that's your environment...)
I have Kubernetes 1.20, and the volume is from 2020; back then it did not use server-side apply.
I have a test suite that runs on Flux 0.17.2 and upgrades it to the current version; I can use that to try to replicate the issue. If your volumes are that old, the behavior might differ – we resolved a number of issues like that in the run-up to 0.26.0, which you only see if you started with Flux before server-side apply and upgraded through 0.18.x–0.25.x.
Like this issue:
I might actually need to start with a Kubernetes version that predates server-side apply to reliably reproduce all of these kinds of issues. If kube-controller-manager uses server-side apply, or if its writes get captured in managedFields however that happens nowadays, it won't matter which version of Flux initially created the PVC in the cluster – I won't see the issue reproduced...
I confirmed that Kubernetes 1.20 with Flux 0.17.2 together with local-path-provisioner produces a PVC that looks like this:
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rancher.io/local-path
  creationTimestamp: "2022-02-04T17:52:28Z"
  finalizers:
  - kubernetes.io/pv-protection
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kube-controller-manager
    operation: Update
    time: "2022-02-04T17:52:28Z"
In other words, K8s 1.20.x is not old enough to satisfy the required test conditions. I looked into creating a K8s cluster at 1.15.x and installing Flux 0.17.x on it, but that will not work, since Flux has not supported K8s versions < 1.16.0 since Flux2 v0.0.7.
This is an extreme case :)
I don't think we should balk at it though... I think we'll need to upgrade a cluster from K8s 1.15.x to know for sure what happens when the cluster has resources that were created before server-side apply even reached beta.
SSA was marked beta in 1.16.x. It might be easier, and just as effective, to test against a 1.16.x cluster and ensure the beta SSA feature gate is turned off before the volume is created. (That way I don't also need to start with Flux v1...)
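For reference, a kind cluster config along these lines might do it – an untested sketch that assumes kind's kubeadmConfigPatches mechanism and the ServerSideApply feature gate name:

```yaml
# Hypothetical kind config; verify the node image tag for your target K8s version.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
kubeadmConfigPatches:
- |
  apiVersion: kubeadm.k8s.io/v1beta2
  kind: ClusterConfiguration
  apiServer:
    extraArgs:
      # Disable the beta SSA feature before any volumes are created
      feature-gates: "ServerSideApply=false"
```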
I've spent too much time on this today, but I think you have likely got something here. We may need to publish an advisory if your cluster was in production before a certain threshold date. Hopefully there's a way we can reconcile this more broadly.
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rancher.io/local-path
  creationTimestamp: "2022-02-04T17:52:28Z"
  finalizers:
  - kubernetes.io/pv-protection
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kube-controller-manager
    operation: Update
    time: "2022-02-04T17:52:28Z"
@kingdonb this is the pv not the pvc
Whoops, you're right... I already tore down the test scaffold, I'll have to repeat the test again later. Sorry for the noise.
@Legion2 is it possible for you to paste the managedFields of the PVC before you updated Flux?
No, I don't have that data.
Removing f:volumeName from the managedFields with a kubectl patch should unblock you. But we need to figure out how kustomize-controller ended up owning that field. Was this PVC created with Flux v1 or with kubectl?
We used Flux from the beginning and used it for everything in the cluster, so I think it was created with Flux v1. However, since then we migrated things multiple times and had to fix things manually, so I'm not 100% sure kubectl wasn't involved.
@stefanprodan I'm not familiar with the kubectl patch command, could you give an example on how to remove a managed field?
Create a YAML with only the metadata and managed fields, remove the volumeName, then apply it with kubectl patch; docs here: https://kubernetes.io/docs/tasks/manage-kubernetes-objects/update-api-object-kubectl-patch/
We had the same problem with a lot of Kustomizations and immutable clusterIP, volumeName and storageClassName fields, when updating kustomize-controller from 0.14.1 to 0.20.2.
flux-system backstage False Service/backstage/patroni-metrics dry-run failed, reason: Invalid, error: Service "patroni-metrics" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]...
Removing the managed fields via patch helps though.
kubectl -n namespace patch svc service-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'
kubectl -n namespace patch pvc pvc-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'
kubectl -n namespace patch storageclass storageclass-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:storageClassName"}]'
I was facing the same issue.
Using the following commands was helpful for me:
kubectl -n NAMESPACE patch pvc PVC_NAME --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:storageClassName"}]'
kubectl -n NAMESPACE patch pvc PVC_NAME --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'
I've got the same problem migrating from Flux v1 to Flux v2 (0.26.1), with ingress-nginx.
Flux complains with:
❯ flux reconcile kustomization infra --with-source
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision aks1-westeurope/4bcf2a7eb50d2f869d217f850fa59955754d6375
► annotating Kustomization infra in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
✗ Kustomization reconciliation failed: Service/ingress-nginx/ingress-nginx dry-run failed, reason: Invalid, error: Service "ingress-nginx" is invalid: spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset
Here's what's in k8s:
apiVersion: v1
kind: Service
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: [snip].
  creationTimestamp: "2020-10-22T16:20:39Z"
  finalizers:
  - service.kubernetes.io/load-balancer-cleanup
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    kustomize.toolkit.fluxcd.io/name: infra
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: ingress-nginx
  namespace: ingress-nginx
  resourceVersion: "192626589"
  uid: a79bc74c-aa57-4223-9f80-0b25249f13b9
spec:
  clusterIP: 10.113.45.120
  clusterIPs:
  - 10.113.45.120
  externalTrafficPolicy: Local
  healthCheckNodePort: 32210
  ports:
  - name: http
    nodePort: 32695
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 30120
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: [snip]
And here's what we always had in git:
kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ingress.aks1.westeurope.azure.cratedb.net.
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
Banging my head on the table with this. Any help much appreciated.
If you want to see the actual problematic fields you have to add "--show-managed-fields" to your kubectl command.
See my post above for a workaround: https://github.com/fluxcd/flux2/issues/2386#issuecomment-1040102499
kubectl -n ingress-nginx patch svc ingress-nginx --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'
Thank you! I did not know that managed fields are no longer shown by default; I assumed it was a different issue. Problem solved.
If you have not tried upgrading to Flux ~v0.26.2~ v0.26.3 or greater, it has breaking changes around this area:
The issue it fixes is here:
This could easily be the same issue, and I think it most likely is. I see reports mentioning v0.26.1, but I do not see anyone who has mentioned this issue on a version >= ~v0.26.2~ v0.26.3, which included the fix above (and another important one, which @stefanprodan mentions below).
Flux now takes over all the managed fields, so if you have edits that you expect to remain in the cluster but that are not in git, they will need to set a manager to avoid being overwritten by Flux. So I want to be careful about advising this upgrade, though it is important and it makes Flux work more as advertised (so I do not want to caution anyone away from it either).
https://fluxcd.io/docs/faq/#why-are-kubectl-edits-rolled-back-by-flux
If we can confirm this issue is still present in later versions of Flux, I will be glad to investigate. The kubectl patch described above should no longer be necessary after the upgrade. If anyone is still struggling with this, please let us know. 🙏
I think those that upgraded to v0.26.0, v0.26.1 or v0.26.2 will have this issue. In v0.26.3 we found a better way to take over the fields managed by kubectl. We'll need to point people that upgraded to v0.26.0-2 to this issue, as patching the managedFields to remove the immutable ones is the only way to fix this.
Same problem here with Flux v0.24.1 and Kubernetes v1.21.5. I have a PersistentVolumeClaim, which was created from a HelmRelease by FluxCD v0.24.1 on that very cluster some hours before, and it looks like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-loki-compactor
  namespace: loki-system
  uid: 87edda4f-3878-4c14-becc-4a5d5eb398ae
  resourceVersion: "122520272"
  creationTimestamp: "2022-02-25T11:38:06Z"
  labels:
    app.kubernetes.io/component: compactor
    app.kubernetes.io/instance: loki-system-loki-distributed
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: loki-distributed
    app.kubernetes.io/version: 2.4.2
    helm.sh/chart: loki-distributed-0.44.0
    helm.toolkit.fluxcd.io/name: loki-distributed
    helm.toolkit.fluxcd.io/namespace: kube-system
  annotations:
    meta.helm.sh/release-name: loki-system-loki-distributed
    meta.helm.sh/release-namespace: loki-system
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: ebs.csi.aws.com
    volume.kubernetes.io/selected-node: ip-10-176-39-133.eu-central-1.compute.internal
  finalizers:
  - kubernetes.io/pvc-protection
  managedFields:
  - manager: helm-controller
    operation: Update
    apiVersion: v1
    time: "2022-02-25T11:38:06Z"
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/component: {}
          f:app.kubernetes.io/instance: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:app.kubernetes.io/version: {}
          f:helm.sh/chart: {}
          f:helm.toolkit.fluxcd.io/name: {}
          f:helm.toolkit.fluxcd.io/namespace: {}
      f:spec:
        f:accessModes: {}
        f:resources:
          f:requests:
            .: {}
            f:storage: {}
        f:storageClassName: {}
        f:volumeMode: {}
  - manager: kube-scheduler
    operation: Update
    apiVersion: v1
    time: "2022-02-25T11:38:07Z"
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:volume.kubernetes.io/selected-node: {}
  - manager: kube-controller-manager
    operation: Update
    apiVersion: v1
    time: "2022-02-25T11:38:10Z"
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:pv.kubernetes.io/bind-completed: {}
          f:pv.kubernetes.io/bound-by-controller: {}
          f:volume.beta.kubernetes.io/storage-provisioner: {}
      f:spec:
        f:volumeName: {}
      f:status:
        f:accessModes: {}
        f:capacity:
          .: {}
          f:storage: {}
        f:phase: {}
  selfLink: /api/v1/namespaces/loki-system/persistentvolumeclaims/data-loki-compactor
status:
  phase: Bound
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: pvc-87edda4f-3878-4c14-becc-4a5d5eb398ae
  storageClassName: gp
  volumeMode: Filesystem
And now I get the following error upon reconcile of the corresponding HelmRelease:
❯ flux reconcile hr -n kube-system loki-distributed
► annotating HelmRelease loki-distributed in kube-system namespace
✔ HelmRelease annotated
◎ waiting for HelmRelease reconciliation
✗ HelmRelease reconciliation failed: Helm upgrade failed: failed to replace object: PersistentVolumeClaim "data-loki-compactor" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
core.PersistentVolumeClaimSpec{
  AccessModes:      {"ReadWriteOnce"},
  Selector:         nil,
  Resources:        {Requests: {s"storage": {i: {...}, s: "10Gi", Format: "BinarySI"}}},
- VolumeName:       "",
+ VolumeName:       "pvc-87edda4f-3878-4c14-becc-4a5d5eb398ae",
  StorageClassName: &"gp",
  VolumeMode:       &"Filesystem",
  DataSource:       nil,
}
Here is the link to the offending template; it does not set volumeName (which is correct).
@stefanprodan hi, we have the same issue and we are using the latest version, v0.27.2, with K8s version 1.21.1. Can you advise please?
`flux-system xxx False Service/xxx/xxx-manager-metrics-service dry-run failed, reason: Invalid, error: Service "xxx-manager-metrics-service" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]... 126d
flux-system yyy False Service/yyy/yyy dry-run failed, reason: Invalid, error: Service "yyy" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]... 126d
flux-system zzz False Service/zzz/zzz-controller-metrics dry-run failed, reason: Invalid, error: Service "zzz-controller-metrics" is invalid: spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset...
flux-system www False Service/www/www-webhook-service dry-run failed, reason: Invalid, error: Service "www-webhook-service" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]...`
This issue appears to occur on clusters that have been upgraded from a version of Kubernetes before v1.18, or thereabouts, when server-side apply and managedFields became generally available. I saw an almost identical report yesterday; I can't find it just now...
If your services were created before K8s v1.18, then the cluster adds clusterIP to the spec when the service is created, and the fields look as though they were populated entirely by kubectl. (Flux sees this as drift to be corrected.)
If your services were created after K8s v1.18, then when the cluster adds clusterIP it does so through server-side apply. This means the field is marked as owned by kube-controller-manager or something like that, which Flux recognizes as a legitimate source of drift that is permitted to remain in the Service; Flux will not try to overwrite it.
If the field is not present in Git and it's not marked as owned by the well-known manager, then Flux has no way to detect it as "not a drift that should be reverted." Maybe we should add another special case here for Service.spec.clusterIP? This is likely to be the most common instance of this issue, but there can certainly be others: it is unusual, but not exactly uncommon, for a controller to update the spec. (Other native Kubernetes resources do it too, like PersistentVolumes.)
If it is safe for you to delete and recreate the offending services, you should be able to do that through Flux the old-fashioned way, or by adding spec.force: true to the Kustomization. I think that will allow the immutable field to be overwritten, since Flux can delete and recreate the service. Be careful doing this, as it will interrupt the service temporarily.
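For reference, a hypothetical Flux Kustomization with force enabled might look like this (the name, path and source are placeholders; the apiVersion matches the Flux versions discussed in this thread):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: infra          # placeholder
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure  # placeholder
  prune: true
  force: true  # recreate resources when immutable fields would change
```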
~I'm not aware of any better way to solve this for now~ (Edit: the better way is to patch managedFields, https://github.com/fluxcd/flux2/issues/2386#issuecomment-1055123204); otherwise you have to delete and recreate those services. @mhsh64 has your cluster been upgraded at some point from a K8s version earlier than v1.18.x? If so, I think you definitely have the same issue I'm describing here.
@kingdonb thanks for the details.
I don't recall when we were on 1.18, but it was a long time ago. You are right: these services/resources exist on the clusters from that version and we just upgraded them.
Our clusters have been running 1.21 for a while now, and I am wondering if there is no other fix in Flux itself to cover this scenario.
But I am also wondering why there was no issue with our previous Flux version. Shouldn't we upgrade the Flux version because of this issue?
I'm not aware of any better way to solve this for now, you have to delete and recreate those services.
To avoid deleting the Service, a patch can be used to remove the ClusterIP from the managed fields:
kubectl patch svc my-svc --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'
Note that the index must match the kustomize-controller manager entry; the example above uses index 0.
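To double-check that index before patching, something like the following can help. A minimal sketch, assuming jq is available; the managedFields JSON is inlined here for illustration rather than fetched from a live cluster, and the resource name and namespace are placeholders:

```shell
# In a real cluster you would start from:
#   kubectl -n <namespace> get svc <name> -o json --show-managed-fields > obj.json
# For illustration, fake a two-entry managedFields list:
cat > obj.json <<'EOF'
{"metadata":{"managedFields":[
  {"manager":"kube-controller-manager","operation":"Update"},
  {"manager":"kustomize-controller","operation":"Apply"}
]}}
EOF

# Zero-based index of the kustomize-controller entry:
idx=$(jq '.metadata.managedFields | map(.manager) | index("kustomize-controller")' obj.json)
echo "$idx"  # → 1

# That index goes into the JSON-patch path, e.g.:
#   kubectl -n <namespace> patch svc <name> --type=json \
#     -p="[{\"op\": \"remove\", \"path\": \"/metadata/managedFields/$idx/fieldsV1/f:spec/f:clusterIP\"}]"
```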
I believe I've reproduced this issue while migrating from Flux v1 to Flux v2. A Knative Serving resource manages all of its annotations, and its validating webhook prevents modification of those annotations. The YAML in git does not contain any annotations. Everything applies fine with Flux v1. When applying with Flux v2 I get the following error:
BadRequest, error: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator
Note that metadata.annotations is a managed field owned by Knative Serving. Is it fair to assume that v0.27.3 will fix this issue without a workaround? Please let me know if you need full debug info.
I'm still reproducing my issue with v0.27.3. I misread the previous comment about fixes being in 0.26.3, not 0.27.3, so this is not surprising. It's not clear to me why the patch would mutate existing annotations, as 1) our resource in git contains no annotations and 2) annotations are managed by another controller as specified in managedFields. This is a Flux v1 -> Flux v2 migration blocker for us.
I've tested using flux build kustomization... | kubectl apply -f - and I can apply the manifest without any issues. This leads me to believe that the issue is with how the kustomize controller is applying the manifests.
If Knative updates annotations in resources that Flux applies, then what do you see in managedFields before Flux overwrites it?
kubectl get [resource that errored] -oyaml --show-managed-fields
@tshak the problem is likely that Flux doesn't allow arbitrary writes from arbitrary writers to persist, that is considered drift and then reverted. This is new behavior https://fluxcd.io/docs/faq/#why-are-kubectl-edits-rolled-back-by-flux
It was required to make changes to the way that Flux does applies after server-side apply was implemented because of some issues with Kubernetes. Many users reported that Flux was allowing drift to persist in their clusters even though they had made attempts to overwrite it in git, or the section of config had already been deleted, but it still persisted in the cluster.
Flux has adjusted the approach to honor the expectations of GitOps users everywhere that expect configuration to reflect what is in git, and not some drift introduced from an unknown source. But some resources are not consistently putting their (intentional, expected, according-to-specifications) drift in places where it can work with GitOps that behaves this way.
Instead of using kubectl apply -f, use kubectl apply --server-side -f and see what happens. That's closer to what Flux is doing now than kubectl apply -f, which is a client-side apply. Also, please provide the content of the managed fields; it will help us better understand what Knative is doing and whether there is any workaround we can build into Flux to resolve this.
Thank you for the detailed explanation. Here is an example Knative Service that is failing to apply:
Git:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: emojivoto
  namespace: demo
spec:
  template:
    spec:
      containers:
      - name: emojivoto
        image: buoyantio/emojivoto-web:v6
        env:
        - name: EMOJISVC_HOST
          value: emojivoto-emoji-svc.demo.svc.cluster.local:80
        - name: VOTINGSVC_HOST
          value: emojivoto-voting-svc.demo.svc.cluster.local:80
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        - name: WEB_PORT
          value: "80"
        ports:
        - containerPort: 80
          protocol: TCP
Server (status redacted for brevity):
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  annotations:
    serving.knative.dev/creator: system:serviceaccount:flux:flux
    serving.knative.dev/lastModifier: system:serviceaccount:flux:flux
  creationTimestamp: "2021-12-09T12:21:32Z"
  generation: 5
  labels:
    kustomize.toolkit.fluxcd.io/name: apps-kubernetes-state
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  managedFields:
  - apiVersion: serving.knative.dev/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations: {}
        f:labels:
          .: {}
          f:kustomize.toolkit.fluxcd.io/name: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
      f:spec:
        .: {}
        f:template:
          .: {}
          f:spec:
            .: {}
            f:containers: {}
    manager: kustomize-controller
    operation: Apply
    time: "2021-12-09T15:49:33Z"
  - apiVersion: serving.knative.dev/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:address:
          .: {}
          f:url: {}
        f:conditions: {}
        f:latestCreatedRevisionName: {}
        f:latestReadyRevisionName: {}
        f:observedGeneration: {}
        f:traffic: {}
        f:url: {}
    manager: controller
    operation: Update
    time: "2021-12-09T15:40:00Z"
  name: emojivoto
  namespace: demo
  resourceVersion: "533389058"
  uid: d367349c-aa69-4e82-96f8-f88804108c27
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containerConcurrency: 0
      containers:
      - env:
        - name: EMOJISVC_HOST
          value: emojivoto-emoji-svc.demo.svc.cluster.local:80
        - name: VOTINGSVC_HOST
          value: emojivoto-voting-svc.demo.svc.cluster.local:80
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        - name: WEB_PORT
          value: "80"
        image: buoyantio/emojivoto-web:v6
        name: emojivoto
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          successThreshold: 1
          tcpSocket:
            port: 0
        resources:
          limits:
            cpu: "1"
            memory: 2G
          requests:
            cpu: 50m
            memory: 50M
      enableServiceLinks: false
      timeoutSeconds: 30
  traffic:
  - latestRevision: true
    percent: 100
I was able to successfully perform a server-side apply, but it required --force-conflicts due to the managed field .spec.template.spec.containers. This is expected, as Knative manages this field (although I'm not sure why). When I set force: true for the Kustomization resource, I still see the annotation error:
Service/demo/emojivoto dry-run failed, reason: BadRequest, error: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator
Interestingly, a flux diff... using the exact same arguments fails with this error as well.
@tshak try to remove the annotations with kubectl -n demo patch service emojivoto --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:metadata/f:annotations"}]'
After patching the resource I can still repro the issue. I don't believe that the root cause is due to managed fields or server-side apply behaviour. Here is a recap of what I've tried (all apply commands are using --server-side):
1) kubectl apply... the standalone YAML from the git repo that contains only the offending resource (emojivoto Knative service) ✅
2) flux build... | kubectl apply... the Kustomization which contains the offending resource ✅
3) flux diff... the Kustomization which contains the offending resource ❌
4) The Kustomization applied by the kustomize-controller ❌
In all ❌ cases, the error is the same annotation value is immutable error from the Knative admission webhook.
Flux performs a server-side apply, and since it manages f:annotations it tries to remove them, as they are not in Git. After you've run the kubectl patch, can you confirm that annotations are no longer managed by kustomize-controller?
I have confirmed that f:annotations: {} is no longer present in the offending resource. However, the error persists.
So if you do kubectl apply --server-side, does it fail the same way?
In this case it fails, but as expected:
error: Apply failed with 1 conflict: conflict with "kustomize-controller": .spec.template.spec.containers
It passes by adding --force-conflicts.
I see serving.knative.dev/creator: system:serviceaccount:flux:flux in the annotations, but there is no such service account in the cluster. Have you changed the SA?
This resource was created with FluxV1. I'm testing a FluxV1->FluxV2 migration.
@tshak there is nothing we can do in Flux about this; if Knative decided to make annotations immutable, then you can't reconcile this with anything but the flux SA. I have no clue why Knative would do such a crazy thing...
I think it's just these specific Knative annotations that are immutable. And again, even flux build... | kubectl apply --server-side... works.
Describe the bug
I updated to Flux 0.26.1 and then observed a reconciliation error in a deployment. I deleted the deployment, and now there is a problem with the PVC of that deployment.
I tried to downgrade the kustomize-controller, but that did not resolve the issue.
Steps to reproduce
Expected behavior
Should work after the update
Screenshots and recordings
No response
OS / Distro
Ubuntu 21
Flux version
flux: v0.26.1
Flux check
► checking prerequisites
✔ Kubernetes 1.20.11-eks-f17b81 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.16.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.20.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.16.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.19.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.21.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.21.1
✔ all checks passed
Git provider
Gitlab
Container Registry provider
Gitlab
Additional context
No response