Open evgenii-avdiukhin opened 2 months ago
By design, StatefulSet pods do not get rescheduled to a new node when the original node becomes unavailable. This is because Kubernetes cannot distinguish between a deliberate shutdown and a network partition, so it marks the pods on the down node as Unknown rather than deleting them. That is what you see when you power off or shut down a node. For a StatefulSet, this requires manual rescheduling.
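The manual rescheduling mentioned above can be sketched roughly as follows. This is a minimal sketch, not a procedure confirmed by the csi-driver maintainers: the namespace `jenkins` and pod `jenkins-0` are taken from the events in this thread, while `run`, `recover_pod` and `DRY_RUN` are hypothetical helpers introduced only for illustration. Force-deleting a StatefulSet pod is only safe once you are sure the old node is really powered off, since otherwise two copies of the pod could end up writing to the same volume.

```shell
#!/usr/bin/env bash
# Sketch: manually recover a StatefulSet pod stuck on a powered-off node.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: recover_pod <namespace> <pod>
recover_pod() {
  local namespace="$1" pod="$2"
  # Force-remove the pod object so the StatefulSet controller can recreate it.
  run kubectl delete pod "$pod" --namespace "$namespace" --grace-period=0 --force
  # List VolumeAttachment objects; a stale one may still point at the dead node.
  run kubectl get volumeattachments
  # Inspect the recreated pod and its attach/detach events.
  run kubectl describe pod "$pod" --namespace "$namespace"
}

# Example invocation (prints the commands unless DRY_RUN=0 is set).
recover_pod jenkins jenkins-0
```

Whether the volume then detaches from the dead node on its own depends on the cluster and driver version, so treat the `volumeattachments` listing as a diagnostic step rather than a guaranteed fix.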
However, if you drain or delete the node running the Jenkins pod, it all works as you would expect. Rescheduling is most responsive in those two cases. You may see a few 'exclusively attached' events on the workload, but all in all the PVC re-attaches in a reasonable time:
Normal Scheduled 22s default-scheduler Successfully assigned jenkins/jenkins-0 to dev-pool-small-static-worker2
Warning FailedAttachVolume 23s attachdetach-controller Multi-Attach error for volume "pvc-8b1a23a1-cc85-4b09-9231-2c963885e366" Volume is already exclusively attached to one node and can't be attached to another
Normal SuccessfulAttachVolume 0s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-8b1a23a1-cc85-4b09-9231-2c963885e366"
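For comparison, the drain path described above might look like the following. Again this is only a sketch under assumptions: `worker-1` is the node name from the issue text and may not match your cluster, and `run`, `drain_and_watch` and `DRY_RUN` are hypothetical helpers, not part of kubectl or the csi-driver.

```shell
#!/usr/bin/env bash
# Sketch: drain a node and watch the StatefulSet pod's volume re-attach.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: drain_and_watch <node> <namespace> <pod>
drain_and_watch() {
  local node="$1" namespace="$2" pod="$3"
  # Evict the workloads from the node; DaemonSet pods are left in place.
  run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
  # Show the events for the pod, including any Multi-Attach warnings.
  run kubectl get events --namespace "$namespace" \
    --field-selector "involvedObject.name=$pod"
}

# Example invocation (prints the commands unless DRY_RUN=0 is set).
drain_and_watch worker-1 jenkins jenkins-0
```

A short-lived Multi-Attach warning followed by SuccessfulAttachVolume, as in the events above, would be the expected pattern here.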
TL;DR
I have configured csi-driver and deployed a Jenkins StatefulSet to test it. The volume was automatically created and attached to worker-1, and the Jenkins pod was then scheduled on the same node. To test how reattachment works, I shut down the worker-1 Hetzner VM, but nothing happened: the volume is not being reattached. Since tolerations are configured, the Jenkins pod goes into Terminating and then tries to schedule on the node that has the PVC, but it can't because the PVC is still on the dead node. What am I doing wrong? Or is this behaviour not supported by csi-driver?
Expected behavior
The Hetzner volume is moved to a healthy node and the pod is scheduled successfully.
Observed behavior
The volume is not being reattached.
Minimal working example
No response
Log output
No response
Additional information
No response