Closed JohnKiller closed 1 year ago
[🎩︎]mrajanna@fedora kubernetes-csi-addons $]cat log | grep -i "NodeStageVolume"
I1109 15:45:37.762508 1 utils.go:195] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC call: /csi.v1.Node/NodeStageVolume
I1109 15:45:37.763027 1 utils.go:195] ID: 11 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC call: /csi.v1.Node/NodeStageVolume
[🎩︎]mrajanna@fedora kubernetes-csi-addons $]cat log | grep -i "ID: 10"
I1109 15:45:37.762508 1 utils.go:195] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC call: /csi.v1.Node/NodeStageVolume
I1109 15:45:37.762839 1 utils.go:206] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC request: {"secrets":"***stripped***","staging_target_path":"/local/csi/staging/lagrondaia/rw-file-system-multi-node-multi-writer","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"clusterID":"7955282e-39e6-4db4-87d0-153b0ce2f37c","fsName":"cephfs","mounter":"fuse","subvolumeName":"csi-vol-80def62f-30ec-11ed-901a-0242ac110003","subvolumePath":"/volumes/csi/csi-vol-80def62f-30ec-11ed-901a-0242ac110003/1b116438-9f4d-4d34-a0e8-cc280c29a024"},"volume_id":"0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003"}
I1109 15:45:37.802858 1 omap.go:88] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 got omap values: (pool="cephfs_metadata", namespace="csi", name="csi.volume.80def62f-30ec-11ed-901a-0242ac110003"): map[csi.imagename:csi-vol-80def62f-30ec-11ed-901a-0242ac110003 csi.volname:lagrondaia]
I1109 15:45:37.822515 1 nodeserver.go:247] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 cephfs: mounting volume 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 with Ceph FUSE driver
I1109 15:45:37.883819 1 cephcmds.go:105] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 command succeeded: ceph-fuse [/local/csi/staging/lagrondaia/rw-file-system-multi-node-multi-writer -m 192.168.110.101,192.168.110.102,192.168.110.103 -c /etc/ceph/ceph.conf -n client.admin --keyfile=***stripped*** -r /volumes/csi/csi-vol-80def62f-30ec-11ed-901a-0242ac110003/1b116438-9f4d-4d34-a0e8-cc280c29a024 -o nonempty --client_mds_namespace=cephfs]
I1109 15:45:37.883934 1 nodeserver.go:206] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 cephfs: successfully mounted volume 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 to /local/csi/staging/lagrondaia/rw-file-system-multi-node-multi-writer
I1109 15:45:37.884122 1 utils.go:212] ID: 10 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC response: {}
[🎩︎]mrajanna@fedora kubernetes-csi-addons $]cat log | grep -i "ID: 11"
I1109 15:45:37.763027 1 utils.go:195] ID: 11 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC call: /csi.v1.Node/NodeStageVolume
I1109 15:45:37.763187 1 utils.go:206] ID: 11 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC request: {"secrets":"***stripped***","staging_target_path":"/local/csi/staging/lagrondaia/rw-file-system-multi-node-multi-writer","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"clusterID":"7955282e-39e6-4db4-87d0-153b0ce2f37c","fsName":"cephfs","mounter":"fuse","subvolumeName":"csi-vol-80def62f-30ec-11ed-901a-0242ac110003","subvolumePath":"/volumes/csi/csi-vol-80def62f-30ec-11ed-901a-0242ac110003/1b116438-9f4d-4d34-a0e8-cc280c29a024"},"volume_id":"0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003"}
E1109 15:45:37.763238 1 nodeserver.go:136] ID: 11 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.763271 1 utils.go:210] ID: 11 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
You can see the CO is sending two NodeStageVolume requests for the same volume at the same time (the only difference is a few milliseconds). Per the CSI specification, NodeStageVolume must ensure the volume is staged on the given node only once; if it is already staged, the driver should return success: https://github.com/container-storage-interface/spec/blob/master/spec.md#nodestagevolume. To avoid the consequences of mounting the same volume twice, the CSI driver holds a lock per volume, so you will see the "operation already exists" error until the ongoing first request completes.
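The per-volume lock described above can be sketched roughly like this (a minimal illustration, not the actual ceph-csi code; the type and method names here are made up). The key point is that a second request for a busy volume fails fast instead of blocking, which is what produces the Aborted error seen in the logs:

```go
package main

import (
	"fmt"
	"sync"
)

// volumeLocks tracks in-flight operations per volume ID. A concurrent
// request for the same volume is rejected rather than queued, so the
// driver can return gRPC Aborted ("operation ... already exists").
type volumeLocks struct {
	mu    sync.Mutex
	locks map[string]struct{}
}

func newVolumeLocks() *volumeLocks {
	return &volumeLocks{locks: make(map[string]struct{})}
}

// tryAcquire returns false if an operation for volumeID is already running.
func (vl *volumeLocks) tryAcquire(volumeID string) bool {
	vl.mu.Lock()
	defer vl.mu.Unlock()
	if _, busy := vl.locks[volumeID]; busy {
		return false
	}
	vl.locks[volumeID] = struct{}{}
	return true
}

// release marks the operation for volumeID as finished.
func (vl *volumeLocks) release(volumeID string) {
	vl.mu.Lock()
	defer vl.mu.Unlock()
	delete(vl.locks, volumeID)
}

func main() {
	vl := newVolumeLocks()
	fmt.Println(vl.tryAcquire("vol-1")) // first NodeStageVolume: proceeds
	fmt.Println(vl.tryAcquire("vol-1")) // concurrent second request: rejected
	vl.release("vol-1")
	fmt.Println(vl.tryAcquire("vol-1")) // after the first completes: proceeds
}
```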
[🎩︎]mrajanna@fedora kubernetes-csi-addons $]cat log | grep -i "operation"
E1109 15:45:37.763238 1 nodeserver.go:136] ID: 11 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.763271 1 utils.go:210] ID: 11 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.772242 1 nodeserver.go:540] ID: 13 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.772266 1 utils.go:210] ID: 13 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.779016 1 nodeserver.go:540] ID: 15 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.779035 1 utils.go:210] ID: 15 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
[🎩︎]mrajanna@fedora kubernetes-csi-addons $]cat log | grep -i "ID: 13"
I1109 15:45:37.770267 1 utils.go:195] ID: 13 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC call: /csi.v1.Node/NodeUnstageVolume
I1109 15:45:37.772198 1 utils.go:206] ID: 13 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC request: {"staging_target_path":"/local/csi/staging/lagrondaia/rw-file-system-multi-node-multi-writer","volume_id":"0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003"}
E1109 15:45:37.772242 1 nodeserver.go:540] ID: 13 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.772266 1 utils.go:210] ID: 13 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
[🎩︎]mrajanna@fedora kubernetes-csi-addons $]cat log | grep -i "ID: 15"
I1109 15:45:37.778899 1 utils.go:195] ID: 15 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC call: /csi.v1.Node/NodeUnstageVolume
I1109 15:45:37.778985 1 utils.go:206] ID: 15 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC request: {"staging_target_path":"/local/csi/staging/lagrondaia/rw-file-system-multi-node-multi-writer","volume_id":"0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003"}
E1109 15:45:37.779016 1 nodeserver.go:540] ID: 15 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
E1109 15:45:37.779035 1 utils.go:210] ID: 15 Req-ID: 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 GRPC error: rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0024-7955282e-39e6-4db4-87d0-153b0ce2f37c-0000000000000001-80def62f-30ec-11ed-901a-0242ac110003 already exists
Same explanation as above, but here it's the NodeUnstageVolume request.
Thanks for your response. So it's a problem with Nomad that should have waited for the first allocation to complete before trying to mount it again? I'll open an issue there
IMO it's more of an enhancement request than a problem.
Eh, it causes downtime during node draining, when multiple tasks are being rescheduled at once.
Describe the bug
When the container orchestrator launches multiple instances on the same host that all need the same CephFS volume, and they are started too close together, the node plugin errors with
an operation with the given Volume ID already exists
. When the orchestrator tries again after some time, it then works. Since CephFS supports multiple writers, it shouldn't matter that the node is asked to mount the same volume multiple times concurrently.
Environment details
Mounter used for mounting the volume (for cephfs it's fuse or kernel; for rbd it's krbd or rbd-nbd): fuse
Steps to reproduce
Steps to reproduce the behavior:
Launch multiple instances that need to mount the same CephFS volume on the same host. If they start too close together, some will fail to mount.
Actual results
Some allocations won't mount the volume.
Expected behavior
All allocations should mount the volume.
Logs