What happened:
I'm running a Kubernetes cluster on AWS EKS and using spot instances for the node groups. Randomly, and not on all clusters, I see the pod that manages the CSI NFS controller go into CrashLoopBackOff and report these logs:
```
csi-snapshotter E1029 09:35:37.115611 1 leaderelection.go:340] Failed to update lock optimitically: Operation cannot be fulfilled on leases.coordination.k8s.io "external-snapshotter-leader-nfs-csi-k8s-io": the object has been modified; please apply your changes to the latest version and try again, falling back to slow path
```
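For triage, this is how I look at the contested lease (a sketch, assuming the driver is installed in the kube-system namespace; the lease name is taken from the log above):

```shell
# Inspect the leader-election Lease that csi-snapshotter fails to update.
# Assumption: the controller runs in the kube-system namespace.
kubectl get lease external-snapshotter-leader-nfs-csi-k8s-io -n kube-system -o yaml

# spec.holderIdentity and spec.renewTime show whether the lease is still held
# by a csi-snapshotter container from a pod that ran on the reclaimed spot node.
```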
If I delete the pod, everything starts again without any issue.
It seems that almost every time an EC2 instance is reclaimed and replaced by another one, the csi-nfs-controller ends up holding a lock that can only be cleared with a brutal pod delete.
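The workaround I apply today, as a sketch; it assumes the default kube-system install and the chart's app=csi-nfs-controller label, so adjust the selector if your labels differ:

```shell
# Delete the stuck controller pod; the Deployment recreates it and the new pod
# acquires the leader-election lease cleanly.
kubectl delete pod -n kube-system -l app=csi-nfs-controller
```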
What you expected to happen:
No CrashLoopBackOff status on the controller pod.
How to reproduce it:
Deploy a cluster with spot instance node groups, install csi-driver-nfs, and watch whether and when the controller pod crashes after spot instances are replaced; the issue shows up intermittently.
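A sketch of the setup I would use to reproduce it; cluster name, region, and instance types are placeholders, and the Helm commands assume the upstream csi-driver-nfs chart:

```shell
# Add a managed spot node group to an existing EKS cluster (placeholder names).
eksctl create nodegroup \
  --cluster my-cluster \
  --region eu-west-1 \
  --name spot-ng \
  --managed \
  --spot \
  --instance-types m5.large,m5a.large

# Install the NFS CSI driver from its Helm chart.
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm repo update
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --namespace kube-system

# Watch the controller pods across spot reclamation/replacement events.
kubectl get pods -n kube-system -l app=csi-nfs-controller -w
```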
Anything else we need to know?:
Environment:
Kubernetes version (use `kubectl version`): 1.31