rach-sharp closed this issue 4 years ago
I have the same problem with the CSI driver in my cluster. Does anybody know of workarounds? Please let me know if more info on configs or error messages is needed.
/cc @thcyron
Could you try removing `volumeMode: Block`?
@LKaemmerling
I removed the `volumeMode: Block` and now the osd-prepare Pod does not start because of this message:
Warning FailedMount 2m5s kubelet, kubeworker1 Unable to attach or mount volumes: unmounted volumes=[set1-0-data-q5tz4], unattached volumes=[rook-binaries rook-ceph-osd-token-c9r9z rook-ceph-log set1-0-data-q5tz4 udev set1-0-data-q5tz4-bridge rook-data rook-ceph-crash ceph-conf-emptydir devices]: volume set1-0-data-q5tz4 has volumeMode Filesystem, but is specified in volumeDevices
@code-chris Okay, then leave it in there. Which csi-driver version are you running? Could you try the `latest` tag? (https://github.com/hetznercloud/csi-driver/blob/master/deploy/kubernetes/hcloud-csi-master.yml)
I used this one: https://github.com/hetznercloud/csi-driver/blob/master/deploy/kubernetes/hcloud-csi.yml. Looks almost the same...
They are almost the same, except for the CSI driver image used (hcloud-csi.yml: `image: hetznercloud/hcloud-csi-driver:1.2.2`; hcloud-csi-master.yml: `image: hetznercloud/hcloud-csi-driver:latest`).
Ah you're right. I will try this one in the evening and give feedback!
> I removed the `volumeMode: Block` and now the osd-prepare Pod does not start because of this message: Warning FailedMount 2m5s kubelet, kubeworker1 Unable to attach or mount volumes: unmounted volumes=[set1-0-data-q5tz4], unattached volumes=[rook-binaries rook-ceph-osd-token-c9r9z rook-ceph-log set1-0-data-q5tz4 udev set1-0-data-q5tz4-bridge rook-data rook-ceph-crash ceph-conf-emptydir devices]: volume set1-0-data-q5tz4 has volumeMode Filesystem, but is specified in volumeDevices
Rook storageClassDeviceSets work only with volumeMode: Block
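For reference, a `storageClassDeviceSets` entry that requests raw block PVCs would look roughly like the sketch below. This is only an illustration; the set name, count, size, and the StorageClass name `hcloud-volumes` are placeholders, not taken from this thread:

```yaml
# Sketch of a Rook CephCluster "storage" section (names and sizes are placeholders)
storage:
  storageClassDeviceSets:
    - name: set1
      count: 3
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            storageClassName: hcloud-volumes  # assumed CSI StorageClass name
            volumeMode: Block                 # required for storageClassDeviceSets
```

With `volumeMode: Filesystem` instead, the kubelet refuses to map the PVC into `volumeDevices`, which is exactly the FailedMount error quoted above.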
@LKaemmerling No, that doesn't work. Then the original error appears again:
MapVolume.SetUpDevice failed for volume "pvc-92672abb-11c7-46e7-a6cd-ca847a603e7f" : kubernetes.io/csi: blockMapper.stageVolumeForBlock failed: rpc error: code = InvalidArgument desc = no mount capability
This issue has been marked as stale because it has not had recent activity. The bot will close the issue if no further action occurs.
@LKaemmerling Any news for this issue? The problem still persists....
I'm also experiencing the same issue when setting up Ceph via hcloud-volumes (using the templates of hetznercloud csi-driver 1.2.3).
Same issue. Any hope we can get an update soon?
I worked around it by deploying Rancher's Longhorn to my k8s cluster (it doesn't use a hcloud volume, but since it replicates across 3 nodes I'm fine with that; I guess it should be possible to use hcloud volumes for it, but I didn't dig into that). Then I set the storage class to longhorn in cluster-on-pvc.yaml and was very surprised to see everything working perfectly.
I'd still love for it to work with hcloud-volumes directly, but at least I got rook-cephfs running in my cloud, and it performs so much better than nfs-provisioner in case you need RWX volumes for your pods ♥
@LKaemmerling Maybe this should be reopened, because there are enough folks who have the same problem...
Just encountered the same while trying to set up a rook ceph cluster. I realized rook requires block devices (not formatted filesystems):
https://rook.io/docs/rook/v1.3/ceph-cluster-crd.html#storage-class-device-sets
So if volumeMode is not set to block, then the error is instead:
volume set1-data-0-vx7h5 has volumeMode Filesystem, but is specified in volumeDevices
Which I believe hcloud doesn't support.
Any update on this? Why is it not possible to get a raw block device?
I'm facing the same open issue when trying to use a Rook-Ceph cluster on PVC.
Our provider does not support getting raw block devices at the moment. We may look into this in the future.
I've added support for block mode in my branch: https://github.com/ahilsend/csi-driver/tree/volumemode-block
Didn't have time to add tests yet, but I have been running it successfully for 2 weeks to provision both block & fs volumes on k8s 1.18+.
I'm hoping I'll find some time to add those tests next week, and open a PR.
@ahilsend Unfortunately I wasn't able to get it to work with rook-ceph using a cluster on PVC. I see the volumes coming up and being assigned to one of my three storage nodes, which host the ceph monitors. But then, for some reason, they are being detached and attached again to the wrong servers; it often ended with multiple volumes on one server, and the "osd-prepare" pod never finishes, with a lot of errors during initialization.
Let me know if some of the logs may help you or if I should dig deeper.
> often it ended with multiple volumes on one server

Multiple pods on the same node can happen; have you configured podAntiAffinities on your cluster?
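As a sketch of what such a rule could look like (the label key/values are assumptions based on rook's usual pod labels, not taken from this thread), an anti-affinity entry in the CephCluster `placement` section can keep OSD pods on separate nodes:

```yaml
# Hypothetical placement rule for a Rook CephCluster; label values are assumed
placement:
  osd:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - rook-ceph-osd
                  - rook-ceph-osd-prepare
          topologyKey: kubernetes.io/hostname  # one OSD pod per node
```

Note that anti-affinity constrains pod scheduling, not volume attachment: the CSI controller attaches a volume to whichever node the pod lands on, so pod placement is what has to be spread out.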
> the "osd-prepare" pod never finishes, with a lot of errors while initializing.

Is it something with rook itself or the CSI driver? I'm no expert with rook, so I'm not sure how much help I can be. Check the rook osd-prepare and operator logs.
If the CSI driver is not doing what it should, logs would help.
I have AntiAffinity rules that prevent multiple mons and OSDs on the same node. At first I can see that each of my storage nodes gets one volume attached as soon as they are requested. But with the new csi-driver image that includes your code changes, the volumes were detached and reattached, sometimes multiple volumes to the same node, even though the volumes run in different DCs. That's pretty weird, since I can't do that using the web frontend.
This probably has nothing to do with rook; I'm using almost the same setup on AWS and Azure, of course with their CSI drivers for the PVs.
I already took a look at the csi-driver logs and compared them to the ones from before your changes; there were some warnings, but nothing meaningful. Let me run a clean install for fresh logs.
Just for the record, I'm using v1.3.1/deploy/kubernetes/hcloud-csi.yml for the deployment, only replacing the hcloud-csi-driver images. Hope that's correct.
Many thanks in advance, I really appreciate your work!
I'm pretty much a noob when it comes to storage in k8s and I've also struggled a lot in the past with setting up rook in my rancher cluster deployed on hetzner. Is there a reason for using a ceph cluster when you can just use the csi to satisfy pod PVCs? Other than data replication and integrity that ceph offers?
> Is there a reason for using a ceph cluster when you can just use the csi to satisfy pod PVCs? Other than data replication and integrity that ceph offers?
The hetzner volumes are RWO - they can only be mounted once. I have a use case for RWX - multiple pods accessing the same volume.
For that I use rook to set up CephFS, which supports it. CephFS itself runs on top of hetzner CSI block volumes.
For all other RWO cases, I use the hetzner CSI directly.
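To illustrate the split: once CephFS is up, a workload can claim shared storage with `ReadWriteMany` through a CephFS-backed StorageClass. The class name `rook-cephfs` below is a hypothetical example, not something confirmed in this thread:

```yaml
# Sketch: an RWX claim against a CephFS-backed StorageClass (name is assumed)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany          # multiple pods may mount this volume
  resources:
    requests:
      storage: 5Gi
  storageClassName: rook-cephfs
```

The underlying hetzner volumes stay RWO; only the filesystem layered on top of them (CephFS) provides the multi-writer semantics.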
Ahhh gotcha, with CephFS it totally makes sense (other options I think would be NFS or Gluster). That was my worry: that there is some extra benefit I was not aware of, and that I was happy for nothing about being able to just use the csi driver and drop rook for the time being :)
Same use case here. I need ReadWriteMany access to the persistent volumes. Of course I can attach a volume manually to a storage node and use it unformatted as a block device via rook-ceph's discover feature, but with a cluster on PVC I can disable auto-discovery and let the monitor pods automatically request volumes using the StorageClass provided by the CSI driver.
Thanks for the clarification guys, I was actually doing the manual approach @mbuelte :D so now using the csi driver feels great. Will probably get back to Ceph when needing RWX.
@ahilsend Here are some fresh logs from the csi-driver pods and from one of the test storage nodes directly.
controller-hcloud_csi_driver:
level=debug ts=2020-05-20T05:40:31.010067655Z component=grpc-server msg="handling request" req="volume_id:\"5514604\" node_id:\"5924240\" volume_capability:<block:<> access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"XXXXXXXXX-8081-csi.hetzner.cloud\" > "
level=info ts=2020-05-20T05:40:31.010166189Z component=api-volume-service msg="attaching volume" volume-id=5514604 server-id=5924240
level=debug ts=2020-05-20T05:40:31.387845488Z component=grpc-server msg="handling request" req="volume_id:\"5514605\" node_id:\"5924242\" volume_capability:<block:<> access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:\"storage.kubernetes.io/csiProvisionerIdentity\" value:\"XXXXXXXX-8081-csi.hetzner.cloud\" > "
level=info ts=2020-05-20T05:40:31.387959992Z component=api-volume-service msg="attaching volume" volume-id=5514605 server-id=5924242
level=info ts=2020-05-20T05:40:32.25457159Z component=api-volume-service msg="failed to attach volume" volume-id=5514605 server-id=5924242 err="cannot perform operation because server is locked (locked)"
level=error ts=2020-05-20T05:40:32.313396856Z component=grpc-server msg="handler failed" err="rpc error: code = Unavailable desc = failed to publish volume: server is locked"
node-hcloud_csi_driver:
level=debug ts=2020-05-20T05:40:46.19469452Z component=linux-mount-service msg="publishing block volume" volume-name=pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb target-path=/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/publish/pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb/aceebf44-4254-4384-9c8f-4b6cf0a8f8a7 volume-path=/dev/disk/by-id/scsi-0HC_Volume_5514603 readonly=false additional-mount-options="unsupported value type"
storage node #2 syslogs:
May 20 07:40:31 minion-2 k3s[6779]: E0520 07:40:31.911067 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514603 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:33.911005509 +0200 CEST m=+186.284088056 (durationBeforeRetry 2s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514603\") pod \"rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9\" (UID: \"aceebf44-4254-4384-9c8f-4b6cf0a8f8a7\") "
May 20 07:40:32 minion-2 k3s[6779]: I0520 07:40:32.011683 6779 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514605") pod "rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q" (UID: "f88db6ef-ad24-44e5-9da0-5a279979aa2c")
May 20 07:40:32 minion-2 k3s[6779]: E0520 07:40:32.011924 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514605 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:33.011863809 +0200 CEST m=+185.384946326 (durationBeforeRetry 1s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514605\") pod \"rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q\" (UID: \"f88db6ef-ad24-44e5-9da0-5a279979aa2c\") "
May 20 07:40:32 minion-2 kernel: scsi 3:0:0:1: Direct-Access HC Volume 2.5+ PQ: 0 ANSI: 5
May 20 07:40:32 minion-2 kernel: sd 3:0:0:1: Power-on or device reset occurred
May 20 07:40:32 minion-2 kernel: sd 3:0:0:1: Attached scsi generic sg2 type 0
May 20 07:40:32 minion-2 kernel: sd 3:0:0:1: [sdb] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
May 20 07:40:32 minion-2 kernel: sd 3:0:0:1: [sdb] Write Protect is off
May 20 07:40:32 minion-2 kernel: sd 3:0:0:1: [sdb] Mode Sense: 63 00 00 08
May 20 07:40:32 minion-2 kernel: sd 3:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 20 07:40:32 minion-2 kernel: sd 3:0:0:1: [sdb] Attached SCSI disk
May 20 07:40:33 minion-2 k3s[6779]: I0520 07:40:33.017176 6779 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514605") pod "rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q" (UID: "f88db6ef-ad24-44e5-9da0-5a279979aa2c")
May 20 07:40:33 minion-2 k3s[6779]: E0520 07:40:33.017356 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514605 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:35.017318419 +0200 CEST m=+187.390400946 (durationBeforeRetry 2s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514605\") pod \"rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q\" (UID: \"f88db6ef-ad24-44e5-9da0-5a279979aa2c\") "
May 20 07:40:33 minion-2 k3s[6779]: I0520 07:40:33.922013 6779 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514603") pod "rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9" (UID: "aceebf44-4254-4384-9c8f-4b6cf0a8f8a7")
May 20 07:40:33 minion-2 k3s[6779]: E0520 07:40:33.922180 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514603 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:37.922136657 +0200 CEST m=+190.295219174 (durationBeforeRetry 4s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514603\") pod \"rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9\" (UID: \"aceebf44-4254-4384-9c8f-4b6cf0a8f8a7\") "
May 20 07:40:35 minion-2 k3s[6779]: I0520 07:40:35.028035 6779 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514605") pod "rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q" (UID: "f88db6ef-ad24-44e5-9da0-5a279979aa2c")
May 20 07:40:35 minion-2 k3s[6779]: E0520 07:40:35.028247 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514605 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:39.028211068 +0200 CEST m=+191.401293536 (durationBeforeRetry 4s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514605\") pod \"rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q\" (UID: \"f88db6ef-ad24-44e5-9da0-5a279979aa2c\") "
May 20 07:40:35 minion-2 kernel: scsi 3:0:0:2: Direct-Access HC Volume 2.5+ PQ: 0 ANSI: 5
May 20 07:40:35 minion-2 kernel: sd 3:0:0:2: Power-on or device reset occurred
May 20 07:40:35 minion-2 kernel: sd 3:0:0:2: Attached scsi generic sg3 type 0
May 20 07:40:35 minion-2 kernel: sd 3:0:0:2: [sdc] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
May 20 07:40:35 minion-2 kernel: sd 3:0:0:2: [sdc] Write Protect is off
May 20 07:40:35 minion-2 kernel: sd 3:0:0:2: [sdc] Mode Sense: 63 00 00 08
May 20 07:40:35 minion-2 kernel: sd 3:0:0:2: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
May 20 07:40:35 minion-2 kernel: sd 3:0:0:2: [sdc] Attached SCSI disk
May 20 07:40:37 minion-2 k3s[6779]: I0520 07:40:37.944334 6779 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514603") pod "rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9" (UID: "aceebf44-4254-4384-9c8f-4b6cf0a8f8a7")
May 20 07:40:37 minion-2 k3s[6779]: E0520 07:40:37.944491 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514603 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:45.944451656 +0200 CEST m=+198.317534123 (durationBeforeRetry 8s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514603\") pod \"rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9\" (UID: \"aceebf44-4254-4384-9c8f-4b6cf0a8f8a7\") "
May 20 07:40:39 minion-2 k3s[6779]: I0520 07:40:39.049843 6779 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514605") pod "rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q" (UID: "f88db6ef-ad24-44e5-9da0-5a279979aa2c")
May 20 07:40:39 minion-2 k3s[6779]: E0520 07:40:39.050001 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514605 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:47.049955312 +0200 CEST m=+199.423037829 (durationBeforeRetry 8s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-75e31e5b-9826-41a2-80e0-d6722a1760f8\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514605\") pod \"rook-ceph-osd-prepare-set1-data-2-l2mk4-4jk6q\" (UID: \"f88db6ef-ad24-44e5-9da0-5a279979aa2c\") "
May 20 07:40:40 minion-2 k3s[6779]: I0520 07:40:40.958764 6779 reconciler.go:303] Volume detached for volume "rook-ceph-crash-collector-keyring" (UniqueName: "kubernetes.io/secret/1ba18009-61a2-404a-aa00-5f92d3e20bfe-rook-ceph-crash-collector-keyring") on node "minion-2" DevicePath ""
May 20 07:40:45 minion-2 k3s[6779]: I0520 07:40:45.981544 6779 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514603") pod "rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9" (UID: "aceebf44-4254-4384-9c8f-4b6cf0a8f8a7")
May 20 07:40:45 minion-2 k3s[6779]: I0520 07:40:45.986797 6779 operation_generator.go:1245] Controller attach succeeded for volume "pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514603") pod "rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9" (UID: "aceebf44-4254-4384-9c8f-4b6cf0a8f8a7") device path: ""
May 20 07:40:46 minion-2 k3s[6779]: I0520 07:40:46.082417 6779 operation_generator.go:881] MapVolume.WaitForAttach entering for volume "pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514603") pod "rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9" (UID: "aceebf44-4254-4384-9c8f-4b6cf0a8f8a7") DevicePath ""
May 20 07:40:46 minion-2 k3s[6779]: I0520 07:40:46.086750 6779 operation_generator.go:890] MapVolume.WaitForAttach succeeded for volume "pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb" (UniqueName: "kubernetes.io/csi/csi.hetzner.cloud^5514603") pod "rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9" (UID: "aceebf44-4254-4384-9c8f-4b6cf0a8f8a7") DevicePath "csi-111daa4b77421b11c26fb03add0e9b0e5eab1ca1fbab3a755a3d4a3765ae750e"
May 20 07:40:46 minion-2 systemd[1]: Started Kubernetes systemd probe.
May 20 07:40:46 minion-2 systemd[1]: run-ra3c46bb0a3cc4c77bf22af970150100a.scope: Succeeded.
May 20 07:40:46 minion-2 systemd[1]: Started Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb/dev/aceebf44-4254-4384-9c8f-4b6cf0a8f8a7.
May 20 07:40:46 minion-2 systemd[1]: Started Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb/dev/aceebf44-4254-4384-9c8f-4b6cf0a8f8a7.
May 20 07:40:46 minion-2 k3s[6779]: E0520 07:40:46.261184 6779 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/csi.hetzner.cloud^5514603 podName: nodeName:}" failed. No retries permitted until 2020-05-20 07:40:46.761129727 +0200 CEST m=+199.134212184 (durationBeforeRetry 500ms). Error: "MapVolume.MapBlockVolume failed for volume \"pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb\" (UniqueName: \"kubernetes.io/csi/csi.hetzner.cloud^5514603\") pod \"rook-ceph-osd-prepare-set1-data-0-n22jc-4n7l9\" (UID: \"aceebf44-4254-4384-9c8f-4b6cf0a8f8a7\") : blkUtil.AttachFileDevice failed. globalMapPath:/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb/dev, podUID: aceebf44-4254-4384-9c8f-4b6cf0a8f8a7: GetLoopDevice failed for path /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb/dev/aceebf44-4254-4384-9c8f-4b6cf0a8f8a7: losetup -j /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-28ee81f4-dd4c-4cc4-9731-4cfb5c561ddb/dev/aceebf44-4254-4384-9c8f-4b6cf0a8f8a7 failed: exit status 1"
I also noticed the volumes changing the node they were previously attached to. In my tests I was using three storage nodes, each running a ceph monitor pod that requests a volume. A few seconds later, each of them had its own volume attached. But then something happened that detached the volume and reattached it on another node. This probably leads to the problem I see here.
When I'm using the auto-discover feature of rook-ceph instead of storageClassDeviceSets, the newly attached volumes are found, but ceph is unable to use them as raw block devices, since they are formatted with ext4. When I manually wipe the fs, ceph uses them after a while and forms the cluster, but this setup does not differ from my previous one, where I was attaching the volumes manually, wiping them and so on. I don't want to use auto-discover, and there should be no need for manual intervention.
I'm happy to announce that we just released v1.4.0 of our CSI driver which includes this. The new container should be available in a couple of minutes.
Does this mean RWO is no longer an issue? I used Hetzner drivers for many deployments that scaled to more than one pod, and if only one pod can access the storage, that is a huge issue for me.
@hadifarnoud You still need a storage provider like rook-ceph that supports RWX. But you can now use hetzner volumes to deploy it. The volumes themselves are still RWO afaik.
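For completeness, the StorageClass that the driver's deployment manifests register looks roughly like this. Treat it as a sketch and check the manifest you actually deploy for the exact fields:

```yaml
# Sketch of the StorageClass backing hcloud volumes (verify against the deployed manifest)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hcloud-volumes
provisioner: csi.hetzner.cloud
volumeBindingMode: WaitForFirstConsumer  # defer provisioning until a pod is scheduled
allowVolumeExpansion: true
```

`WaitForFirstConsumer` matters for setups like Rook on PVC: the volume is created in the datacenter of the node the consuming pod is scheduled to, rather than provisioned up front.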
I'm trying to set up Ceph via Rook, using PVCs with a StorageClass powered by hetznercloud/csi-driver, but Volumes get stuck between being attached and being mounted to a Pod.

When I describe one of these pods, there are the following events:

After a little digging, `no mount capability` comes from the file https://github.com/hetznercloud/csi-driver/blob/master/driver/node.go

My hetzner cloud k8s manifests installed are just the ones from the README; my CephCluster manifest is:

Involved PVs

Involved PVCs

Any ideas why the volume would have no mount capability?