ceph / ceph-csi

CSI driver for Ceph
Apache License 2.0
1.19k stars 528 forks source link

e2e: corrupted mount not attempting to recover. #4633

Open iPraveenParihar opened 1 month ago

iPraveenParihar commented 1 month ago
          > > replacing, `k8s.io/mount-utils => k8s.io/mount-utils v0.29.3`

this works!, So, will add a TODO: update once fixed in mount-utils and visit later? @Madhu-1

@iPraveenParihar Thank you, lets get it merged and we will revisit mount-utils, lets check what is changed in mount-utils release notes.

Originally posted by @Madhu-1 in https://github.com/ceph/ceph-csi/issues/4614#issuecomment-2126600054

iPraveenParihar commented 1 week ago

There is a change in the IsLikelyNotMountPoint method: https://github.com/kubernetes/mount-utils/compare/v0.29.3...v0.30.2

From the method description, it is clear that IsLikelyNotMountPoint is not suitable for detecting bind mounts: https://github.com/kubernetes/mount-utils/blob/dfd453c575c09ba21aa9f45bf9a9ddb8724c5e0b/mount_linux.go#L498-L513

ceph-csi uses this method for fuse mount recovery to determine if the path is a valid MountPoint or not. https://github.com/ceph/ceph-csi/blob/202f43c82d63d37bd765dea7ecf8de4037eca7b6/internal/util/util.go#L327-L335

we need alternative method to be used? thoughts @nixpanic @Madhu-1 ?

nixpanic commented 1 week ago

Probably you can delete this function and call x.mounter.IsMountPoint(...) directly? I don't see why the indirection is beneficial for anything.

iPraveenParihar commented 1 week ago

Probably you can delete this function and call x.mounter.IsMountPoint(...) directly? I don't see why the indirection is beneficial for anything.

IsMountPoint(...) is an expensive operation as it enumerates mount points on the Node and then does the evaluate operation on each mount point. https://github.com/kubernetes/mount-utils/blob/dfd453c575c09ba21aa9f45bf9a9ddb8724c5e0b/mount.go#L72-L79

may be because of that, they chose IsLikelyNotMountPoint()?

Madhu-1 commented 1 week ago

Probably you can delete this function and call x.mounter.IsMountPoint(...) directly? I don't see why the indirection is beneficial for anything.

IsMountPoint(...) is an expensive operation as it enumerates mount points on the Node and then does the evaluate operation on each mount point. https://github.com/kubernetes/mount-utils/blob/dfd453c575c09ba21aa9f45bf9a9ddb8724c5e0b/mount.go#L72-L79

may be because of that, they chose IsLikelyNotMountPoint()?

@iPraveenParihar IsMountPoint is already used for RBD and IsLikelyNotMountPoint currently used only for cephfs. we can switch from IsLikelyNotMountPoint to IsMountPoint for cephfs as well.

Madhu-1 commented 1 week ago

@nixpanic https://github.com/ceph/ceph-csi/pull/3281 in this PR we switch to IsLikelyNotMountPoint for specific case for rbd, if we switch back to IsMountPoint wont we get into the same problem again, can you please check or we need to still use IsLikelyNotMountPoint for specific cases and switch to IsMointPoint in util function.