How to limit the nodes that csi-driver tries to provision a volume on?

slamdev commented 2 years ago

I have different node types: some with disks eligible for CSI and some without. The disks for CSI have a distinct name, so I am using this config:

devicePattern: /dev/nvme[1-3]n1

That worked well while 90% of my nodes had NVMe disks. Now the ratio is 50/50, and I see a lot of errors like:

failed to provision volume with StorageClass "csi-driver-lvm-linear": rpc error: code = ResourceExhausted desc = volume creation failed

In the plugin logs I see:

csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:37.151506       1 controllerserver.go:126] creating volume pvc-facd5e9e-dda2-4bb0-94a5-0ec724e823a6 on node: worker-h1-1081261-183
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:37.151530       1 lvm.go:243] start provisionerPod with args:[createlv --lvsize 1073741824 --devices /dev/nvme[1-3]n1 --lvmtype linear --lvname pvc-facd5e9e-dda2-4bb0-94a5-0ec724e823a6 --vgname csi-lvm]
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:37.178143       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:38.184603       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:39.192834       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:40.199854       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:41.206348       1 lvm.go:395] provisioner pod status:Running
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:42.212716       1 lvm.go:395] provisioner pod status:Running
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:43.220177       1 lvm.go:385] provisioner pod terminated with failure
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin E1017 09:13:43.235743       1 controllerserver.go:142] error creating provisioner pod :rpc error: code = ResourceExhausted desc = volume creation failed
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin E1017 09:13:43.235783       1 server.go:114] GRPC error: rpc error: code = ResourceExhausted desc = volume creation failed

Node worker-h1-1081261-183 doesn't have NVMe disks, so it makes sense that provisioning fails there.

It looks like the plugin is jumping from node to node trying to provision the volume, and that takes time. It would be nice if I could limit it to nodes that have a certain label, so it does not waste time on nodes where provisioning will definitely fail.

PS: I've tried scheduling the plugin DaemonSet only on the nodes with NVMe disks, but that doesn't help. The plugin still tries all the available nodes.

majst01 commented 2 years ago

You must prevent your pods from being scheduled onto nodes that lack the required type of disk by adding a node affinity for nodes with a specific label. Documentation can be found here: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
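A minimal sketch of such an affinity, not a definitive setup. It assumes the NVMe nodes carry a hypothetical label `example.com/has-nvme=true` and that the volume is provisioned on whichever node the consuming pod lands on. The claim name, pod name, and image are made up for illustration; only the StorageClass name and the 1Gi size come from the error message and logs above.

```yaml
# Assumption: NVMe nodes were labelled beforehand, for example:
#   kubectl label nodes <node-name> example.com/has-nvme=true
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lvm-data                            # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: csi-driver-lvm-linear   # StorageClass from the error message
  resources:
    requests:
      storage: 1Gi                          # same size as --lvsize 1073741824
---
apiVersion: v1
kind: Pod
metadata:
  name: lvm-consumer                        # hypothetical pod name
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: the scheduler only considers nodes with the label.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: example.com/has-nvme   # hypothetical label key
                operator: In
                values:
                  - "true"
  containers:
    - name: app
      image: busybox:1.36                   # placeholder image
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: lvm-data
```

For workloads managed by a Deployment or StatefulSet, the same `affinity` block would go under `spec.template.spec`.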

slamdev commented 2 years ago

Got it, that's what I was afraid of. I was hoping there was a way to solve it at the CSI controller level rather than by modifying the pods. Thanks for clarifying that!

metal-stack / csi-driver-lvm

How to limit the nodes that csi-driver tries to provision a volume on? #61