Closed · slamdev closed this issue 2 years ago
You must prevent your pods from being scheduled onto nodes where the required disk type is not present by adding a node affinity for nodes with a specific label. Documentation can be found here: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
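For example, assuming the NVMe-equipped nodes carry a label such as `disk-type=nvme` (the label key and value here are illustrative, not taken from this thread), the pod spec could look like:

```yaml
# Hypothetical pod snippet: schedule only onto nodes labeled disk-type=nvme.
# Replace the label key/value with whatever label your nodes actually carry.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-nvme-volume
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disk-type
                operator: In
                values:
                  - nvme
  containers:
    - name: app
      image: nginx
```

With `requiredDuringSchedulingIgnoredDuringExecution`, the scheduler will never place the pod on a node without the label, so provisioning is only ever attempted where the right disks exist.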
Got it, that's what I was afraid of. I was hoping there was a way to solve it at the CSI controller level rather than by modifying the pods. Thanks for clarifying!
I have different node types: some with disks eligible for CSI and some without. The disks eligible for CSI have a different name, so I am using this config:
That worked well while 90% of my nodes had NVMe disks, but now the ratio is 50/50 and I see a lot of errors like:
In the plugin logs I see:
Node worker-h1-1081261-183 doesn't have NVMe disks, so it makes sense that provisioning fails there.
It looks like the plugin jumps from node to node trying to provision the volume, and that takes time. It would be nice if I could limit it to nodes with a certain label, so it doesn't waste time on nodes that will definitely fail.
PS: I've tried scheduling the plugin DaemonSet only on the nodes with NVMe disks, but it doesn't help - the plugin still tries all the available nodes.
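One mitigation worth noting (assuming the CSI driver supports topology-aware provisioning, which this thread does not confirm for this particular driver) is a StorageClass with `volumeBindingMode: WaitForFirstConsumer`. That defers provisioning until a consuming pod is scheduled, so the volume is created on the node the scheduler actually picked instead of the provisioner trying nodes on its own:

```yaml
# Hypothetical StorageClass sketch: delay volume provisioning until a pod
# using the PVC is scheduled, so placement follows pod scheduling (and any
# node affinity the pod carries).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nvme-local                # illustrative name
provisioner: example.csi.driver   # replace with the actual CSI driver name
volumeBindingMode: WaitForFirstConsumer
```

Restricting the DaemonSet alone is not enough, because scheduling of the pods that consume the volumes is what ultimately decides where provisioning is attempted.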