canonical / k8s-snap

Canonical Kubernetes is an opinionated and CNCF conformant Kubernetes operated by Snaps and Charms, which come together to bring simplified operations and an enhanced security posture on any infrastructure.
GNU General Public License v3.0

K8s snap leaks NFS volume mounts after removal #612

Closed claudiubelu closed 1 month ago

claudiubelu commented 1 month ago

Summary

If a local NFS CSI provider is used and the k8s snap is removed afterwards, the removal may leak NFS volume mounts and take significantly longer than usual.

Because of the leaked mounts, the /var/lib/kubelet folder is never cleaned up properly either. As a result, after reinstalling the snap and bootstrapping a cluster, the current node is not registered to the new cluster.

What Should Happen Instead?

Volumes should not be leaked and the /var/lib/kubelet folder should be cleaned up on snap removal.

Reproduction Steps

  1. Install k8s snap: sudo snap install k8s --channel=1.30-classic/beta --classic
  2. Deploy NFS server.
  3. Deploy CSI provider.
  4. Deploy a Pod with a PVC.
  5. Uninstall k8s snap.
  6. Run: cat /proc/mounts | grep kubelet
  7. Observe leaked NFS mount:
cat /proc/mounts  | grep kubelet
nfs-server.default.svc.cluster.local:/ /var/lib/kubelet/pods/130ef4eb-b053-4f27-9cf3-084ce668e8b9/volumes/kubernetes.io~csi/pv-nginx/mount nfs4 rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.19.198.120,local_lock=none,addr=10.152.183.186 0 0

Alternatively, the following can be run instead to replicate the issue:

# Clone repo.
git clone https://github.com/canonical/csi-driver-nfs-rocks
cd csi-driver-nfs-rocks/tests

# Rock image required for integration test.
export BUILT_ROCKS_METADATA='[{"name":"nfsplugin","version":"4.7.0","path":"nfsplugin/4.7.0","arch":"amd64","image":"ghcr.io/canonical/nfsplugin:8de13f6a861f5107fc1c15a6346b5456da9b4747f83cfd4a948c1300eee65444-amd64","rockcraft-revision":"1783","runs-on-labels":["ubuntu-22.04"]},{"name":"nfsplugin","version":"4.7.0","path":"nfsplugin/4.7.0","arch":"arm64","image":"ghcr.io/canonical/nfsplugin:8de13f6a861f5107fc1c15a6346b5456da9b4747f83cfd4a948c1300eee65444-arm64","rockcraft-revision":"1784","runs-on-labels":["self-hosted","Linux","ARM64","jammy"]}]'
tox -e integration

# Check for leaked volumes.
cat /proc/mounts | grep kubelet
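
The grep above matches every mount whose line mentions kubelet. To narrow the check to NFS mounts only, something like the following could be used. This is a minimal sketch: list_kubelet_nfs_mounts is a hypothetical helper name, not part of the k8s snap, and it assumes the usual /proc/mounts field order (device, mount point, filesystem type).

```shell
# list_kubelet_nfs_mounts is a hypothetical helper, not part of the k8s snap.
# It prints the mount point (field 2) of every mount under /var/lib/kubelet
# whose filesystem type (field 3) starts with "nfs" (nfs, nfs4).
# An optional argument selects a different mounts file, e.g. for testing.
list_kubelet_nfs_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '$2 ~ /^\/var\/lib\/kubelet\// && $3 ~ /^nfs/ { print $2 }' "$mounts_file"
}
```

Running list_kubelet_nfs_mounts with no arguments after removing the snap should print nothing if no NFS mounts were leaked.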

System information

N/A

Can you suggest a fix?

Adding the -f (force) and -l (lazy) flags to the umount commands resolves the issue.
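
As a rough illustration of that suggestion (not the snap's actual removal code, which may be structured differently), the unmount loop could look like the sketch below. unmount_leaked and the UMOUNT_CMD override are hypothetical names introduced here so the loop can be dry-run without root.

```shell
# unmount_leaked is a hypothetical helper; the snap's real removal hook may differ.
# It reads mount points from stdin, one per line, and unmounts each of them.
# Reverse sorting puts nested paths before their parents.
# UMOUNT_CMD is an illustrative override, e.g. UMOUNT_CMD="echo umount" for a dry run.
unmount_leaked() {
    sort -r | while IFS= read -r mount_point; do
        # -f forces the unmount (intended for unreachable NFS servers);
        # -l detaches the mount now and cleans up once it is no longer busy.
        ${UMOUNT_CMD:-umount} -f -l "$mount_point" || echo "failed: $mount_point" >&2
    done
}
```

For example, `grep kubelet /proc/mounts | awk '{ print $2 }' | UMOUNT_CMD="echo umount" unmount_leaked` prints the commands that would be run without unmounting anything.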

Are you interested in contributing with a fix?

Yes