kubernetes-sigs / blob-csi-driver

Azure Blob Storage CSI driver

MountVolume.SetUp failed for blobfuse volume on edge K3s #1620

Closed by Hidayathullashaik 1 month ago

Hidayathullashaik commented 1 month ago

Installed the Blobfuse2 packages and the CSI driver via Helm onto an on-premises edge Kubernetes (K3s) server. Created a secret using an Azure storage account SAS token, then a PV, PVCs, and deployments. Mounts are successful, pods are running, and data is being pushed from on-premises to the Azure Blob storage account container.
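For reference, a static blobfuse PV/PVC pair for this kind of setup generally looks like the sketch below. This is a minimal sketch, not the poster's actual manifest: the PV name pv-blob is taken from the error output further down, while the secret, account, and container names are placeholders.

```yaml
# Minimal sketch of a static blobfuse PV + PVC, assuming a SAS-token
# secret (azurestorageaccountname / azurestorageaccountsastoken keys)
# named "azure-sas-token" in kube-system. All names are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-blob
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  csi:
    driver: blob.csi.azure.com
    # volumeHandle must be unique; see the discussion below about what
    # happens when many clusters reuse the same value
    volumeHandle: mystorageaccount_mycontainer_pv-blob
    volumeAttributes:
      containerName: mycontainer
    nodeStageSecretRef:
      name: azure-sas-token
      namespace: kube-system
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-blob
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  volumeName: pv-blob
  storageClassName: ""
```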

[PROBLEM] Rolled out to 7+ individual Kubernetes (K3s) clusters with the same configuration, including the SAS token, storage account, and container. New mounts succeed, but the existing mounts fail and can no longer send data to the cloud. When I point a failed-mount cluster at a different container in the same storage account, it works fine.

Note: mounting the same container of the storage account stops working once more than 7 clusters use it.

ERROR:

Nginx pod events

MountVolume.SetUp failed for volume "pv-blob" : rpc error: code = Internal desc = Could not mount "/var/lib/kubelet/plugins/kubernetes.io/csi/blob.csi.azure.com/d15383f56406a2c08a1896bfb52c576a0500c32c4238d8ba1fcf556b4f0d4b07/globalmount" at "/var/lib/kubelet/pods/9779c0bd-100e-437c-938c-bbb3322358b0/volumes/kubernetes.io~csi/pv-blob/mount": mount failed: exit status 32
Mounting command: mount
Mounting arguments: -o bind /var/lib/kubelet/plugins/kubernetes.io/csi/blob.csi.azure.com/d15383f56406a2c08a1896bfb52c576a0500c32c4238d8ba1fcf556b4f0d4b07/globalmount /var/lib/kubelet/pods/9779c0bd-100e-437c-938c-bbb3322358b0/volumes/kubernetes.io~csi/pv-blob/mount
Output: mount: /var/lib/kubelet/pods/9779c0bd-100e-437c-938c-bbb3322358b0/volumes/kubernetes.io~csi/pv-blob/mount: special device /var/lib/kubelet/plugins/kubernetes.io/csi/blob.csi.azure.com/d15383f56406a2c08a1896bfb52c576a0500c32c4238d8ba1fcf556b4f0d4b07/globalmount does not exist.
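The bind mount fails because the staging directory (the globalmount path) no longer exists on the node, so there is nothing to bind into the pod. A quick way to check this on the affected node is sketched below; the hash-named directory differs per volume.

```sh
# On the affected K3s node: list the blobfuse staging (globalmount) paths
ls -ld /var/lib/kubelet/plugins/kubernetes.io/csi/blob.csi.azure.com/*/globalmount

# And check which blobfuse mounts are actually active
mount | grep -i blobfuse
```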

CSI Controller Logs

root@peplap11118:~# kubectl logs csi-blob-controller-798b6c9cfd-cttgz -n kube-system
Defaulted container "csi-provisioner" out of: csi-provisioner, liveness-probe, blob, csi-resizer
I0929 22:35:06.415607 1 feature_gate.go:249] feature gates: &{map[HonorPVReclaimPolicy:true]}
I0929 22:35:06.415669 1 csi-provisioner.go:154] Version: v3.5.0-0-gab68435fa
I0929 22:35:06.415673 1 csi-provisioner.go:177] Building kube configs for running in cluster...
I0929 22:35:08.467441 1 common.go:111] Probing CSI driver for readiness
I0929 22:35:08.470355 1 csi-provisioner.go:230] Detected CSI driver blob.csi.azure.com
I0929 22:35:08.471366 1 csi-provisioner.go:302] CSI driver does not support PUBLISH_UNPUBLISH_VOLUME, not watching VolumeAttachments
I0929 22:35:08.471585 1 controller.go:732] Using saving PVs to API server in background
I0929 22:35:08.472024 1 leaderelection.go:245] attempting to acquire leader lease kube-system/blob-csi-azure-com...
I0929 22:35:26.373280 1 leaderelection.go:255] successfully acquired lease kube-system/blob-csi-azure-com
I0929 22:35:26.373368 1 leader_election.go:178] became leader, starting
I0929 22:35:26.474246 1 controller.go:811] Starting provisioner controller blob.csi.azure.com_peplap11118_79071a60-4e84-4d00-b005-b9b365298e9f!
I0929 22:35:26.474273 1 volume_store.go:97] Starting save volume queue
I0929 22:35:26.575202 1 controller.go:860] Started provisioner controller blob.csi.azure.com_peplap11118_79071a60-4e84-4d00-b005-b9b365298e9f!

andyzhangx commented 1 month ago

@Hidayathullashaik do you have two PVs with the same volumeHandle value? Could be similar to this issue: https://github.com/kubernetes-sigs/blob-csi-driver/issues/762
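One way to check a cluster for duplicate volumeHandle values is a jsonpath query over all PVs, sketched here (it assumes the PVs are CSI-backed; non-CSI PVs print as empty lines):

```sh
# Print each PV's CSI volumeHandle, then show only the values that repeat
kubectl get pv -o jsonpath='{range .items[*]}{.spec.csi.volumeHandle}{"\n"}{end}' \
  | sort | uniq -d
```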

Hidayathullashaik commented 1 month ago

@andyzhangx - Yes, it seems to be a similar issue. I used the same volumeHandle in different clusters, no more than one per node, but I am not creating multiple volumeHandles within a node. It worked fine up to about 7 clusters; beyond that it started failing. What would be the best practice to proceed further?

Hidayathullashaik commented 1 month ago

@andyzhangx - I have reconfigured the PVs with unique names across the clusters, and it is working fine now.
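For anyone hitting the same symptom: assuming, per the linked issue, that the uniqueness that matters is the PV's volumeHandle, the fix amounts to embedding a per-cluster identifier in it. A hypothetical fragment:

```yaml
# Hypothetical fragment: make the volumeHandle unique per cluster so many
# clusters can mount the same storage account container independently
csi:
  driver: blob.csi.azure.com
  volumeHandle: mystorageaccount_mycontainer_cluster-03  # vary per cluster
  volumeAttributes:
    containerName: mycontainer
```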

andyzhangx commented 1 month ago

Thanks for the confirmation.