Open CoreyCook8 opened 2 months ago
This is expected: on an Azure VM, the device name is not bound to the disk name. For example, disk1 is mounted as /dev/sdc; when disk1 is manually detached and disk2 is attached to the VM, disk2 is also mounted as /dev/sdc. If you then delete and recreate the first pod with the disk1 volume, disk1 would still use /dev/sdc, because at that point the CSI driver thinks disk1 is still attached to the VM and simply reuses the previous device name (/dev/sdc).
BTW, manual volume detach is not a supported CSI driver scenario; it is outside the CSI driver's control.
I understand that manual detach is outside the control of the CSI driver. But I would expect the CSI driver to ensure that a new pod is using the volume it requested and not another pod's volume. If the pod is deleted and the new pod lands on the same VM, I would expect the CSI driver to check the drive and make sure the expected volume == the actual volume.
Alternatively, when attaching the second disk at the same device name as the first disk, I would expect it to realize that a disk should already be there, or that the first disk is no longer attached.
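To make the "expected volume == actual volume" idea concrete, here is a minimal sketch of one possible check. It is not how the CSI driver works today. It assumes the filesystem UUID was recorded when the volume was first staged, that the filesystem sits directly on the device, and that udev populates /dev/disk/by-uuid on the node. The function names are hypothetical.

```python
import os

def fs_uuid_for_device(device, by_uuid_dir="/dev/disk/by-uuid"):
    """Return the filesystem UUID whose udev symlink resolves to `device`, or None."""
    target = os.path.realpath(device)
    try:
        names = os.listdir(by_uuid_dir)
    except FileNotFoundError:
        # No by-uuid directory on this host: nothing to compare against.
        return None
    for name in sorted(names):
        if os.path.realpath(os.path.join(by_uuid_dir, name)) == target:
            return name
    return None

def is_expected_volume(recorded_uuid, device, by_uuid_dir="/dev/disk/by-uuid"):
    """True only if `device` currently carries the filesystem recorded earlier.

    `recorded_uuid` is assumed to have been saved at first staging (hypothetical).
    """
    actual = fs_uuid_for_device(device, by_uuid_dir)
    return actual is not None and actual == recorded_uuid
```

With a check like this, a pod asking for disk1 would be refused if /dev/sdc now carries disk2's filesystem. It would not tell apart two disks cloned from the same snapshot, since they share a filesystem UUID.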
Due to the manual detach, the kubelet thinks that disk1 is already attached to the node, so the CSI driver is not called (no NodeStageVolume call) to verify the drive.
When attaching disk2 to the VM, reusing the same device name (/dev/sdc) is actually OK (this is also outside the CSI driver's control; it is controlled by the Linux kernel disk driver). I think the main problem is that after a manual detach you should reschedule the first pod to another node; that would work. Otherwise we don't have a solution for making this work, since it is outside the CSI driver's control.
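To decide when a pod needs rescheduling, an operator could compare what Kubernetes believes is attached with what the VM actually reports. The sketch below only does the comparison; how the two lists are gathered is left out. As an assumption, the first could come from the node's status.volumesAttached and the second from the VM's data disk list. The function name and the inputs are hypothetical.

```python
def stale_attachments(k8s_attached, vm_attached):
    """Return volumes Kubernetes still lists as attached that the VM no longer has.

    Both arguments are iterables of disk names (hypothetical inputs gathered
    elsewhere, e.g. from the Node object and from the cloud provider).
    """
    return sorted(set(k8s_attached) - set(vm_attached))

# Scenario from this thread: disk1 was detached by hand, disk2 attached afterwards.
print(stale_attachments(["disk1"], ["disk2"]))  # ['disk1']
```

Any disk returned here is one whose pod should be moved to another node rather than recreated in place.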
What happened:
After a volume was manually detached from a VM, two pods ended up mounting the same volume.
What you expected to happen:
In an AWS cluster, this same issue gives this error message:
I would expect this to be handled in a similar manner.
How to reproduce it:
Anything else we need to know?:
Environment:
kubectl version: 1.28.9
uname -a: Linux 5.4.0-1138-azure #145-Ubuntu SMP Fri Aug 30 16:04:18 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux