The logs show a /csi.v1.Node/NodeUnstageVolume call that failed because the filesystem is keeping the RBD device open. There should have been a /csi.v1.Node/NodeUnpublishVolume call somewhere too; that call triggers the unmounting of the filesystem.
Could you share the complete csi-rbdplugin container logs from the node where the problem occurred? Is this something that happened once, or does it happen constantly?
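For anyone needing to pull those two calls out of the node-plugin logs, something like this works (a sketch; the namespace and label are assumptions, adjust them to your deployment):

```sh
# Assumed namespace/label; adjust to match your ceph-csi deployment
kubectl -n ceph-csi logs -l app=csi-rbdplugin -c csi-rbdplugin --tail=-1 \
  | grep -E 'NodeUnstageVolume|NodeUnpublishVolume'
```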
Here's the csi-rbdplugin log. It happens quite frequently.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.
Any updates on this? Faced the same issue after upgrading k8s from 1.23 to 1.24.
Describe the bug
The ceph-csi node plugin fails to unmap the rbd device from the worker where the volume is unstaged.
Environment details
Mounter used for mounting PVC (for cephFS it's fuse or kernel; for rbd it's krbd or rbd-nbd): kernel

Steps to reproduce
Steps to reproduce the behavior:
Actual results
The VolumeAttachment from the previous node is still present, the PV is still mapped, and the fs is still mounted. The fs can be unmounted manually without any issue, but it never gets unmounted automatically.
The new pod is unable to attach the volume and thus unable to start.
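The stuck state can be confirmed from the CLI (a sketch with illustrative names; <pv-name> is a placeholder, and the rbd commands run on the old node):

```sh
# Lingering attachment: the PV still shows as attached to the old node
kubectl get volumeattachment | grep <pv-name>

# On the old node: the device is still mapped and the filesystem still mounted
rbd showmapped
findmnt | grep rbd
```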
Expected behavior
The PVC filesystem should be unmounted and the rbd device unmapped correctly. The previous VolumeAttachment should be deleted.
Logs
Please find an example from production. Here an Elasticsearch PV is being deleted after cluster decommissioning.
Describing the PV:
By looking at the volumeAttachment:
Looking at the node13 rbdplugin logs indeed shows the problem:
On node13, the volume is still mounted:
Unmounting the fs works fine.
rbd unmap failed with error 16, and lsof showed no usage. We had to run
rbd unmap -o force
to release the PV. This behaviour is present from Nautilus to Quincy. We browsed previous issues regarding multipathd usage and so on, with no luck.
Problems: