Closed mixitgit closed 2 years ago
@mixitgit thanks for creating this issue! I'm not sure I understand this correctly: do you want vcluster to automatically set tolerations on pods that are synced from the vcluster to the host cluster? You can already set tolerations on pods within the vcluster, and they are then synced to the host cluster. Since scheduling happens in the host cluster, those tolerations will be considered during scheduling.
I want tolerations to be set automatically. This would make it possible to create dedicated nodes for vclusters.
For example:
If we set --enforce-node-selector --node-selector=foo=bar
Currently it will automatically add a node selector to all pods:
nodeSelector:
  foo: bar
If we also added a toleration:
tolerations:
- key: "foo"
  operator: "Equal"
  value: "bar"
  effect: "NoSchedule"
We could then create a node with the label foo=bar and the taint foo=bar:NoSchedule, and this node would be dedicated to this vcluster (no other vcluster would schedule its pods there).
This is extremely helpful when multiple teams share the same host cluster and some of them should have their own dedicated nodes.
This toleration won't affect other use cases (unless someone wants to label a node and put the same taint on it at the same time, so that pods can't be scheduled there, but that case seems a bit odd).
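To make the proposal concrete, here is a minimal sketch of the mutation being asked for, written against a plain dictionary rather than real Kubernetes objects. The function name `enforce_dedicated_node` and the default effect are assumptions made for illustration; this is not how vcluster implements it.

```python
import copy


def enforce_dedicated_node(pod_spec, key, value, effect="NoSchedule"):
    """Return a copy of pod_spec with a node selector and a matching toleration.

    Hypothetical helper: it mirrors the idea in this thread, not vcluster code.
    """
    spec = copy.deepcopy(pod_spec)

    # Pin the pod to nodes labelled key=value.
    spec.setdefault("nodeSelector", {})[key] = value

    # Let the pod tolerate the taint key=value:effect on those nodes.
    toleration = {
        "key": key,
        "operator": "Equal",
        "value": value,
        "effect": effect,
    }
    tolerations = spec.setdefault("tolerations", [])
    if toleration not in tolerations:  # avoid duplicates on repeated syncs
        tolerations.append(toleration)
    return spec


pod = {"containers": [{"name": "app", "image": "nginx"}]}
patched = enforce_dedicated_node(pod, "foo", "bar")
```

The original `pod` dictionary is left untouched, and applying the helper twice does not add a second copy of the toleration.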
@mixitgit thanks for the explanation! Ah I see, so you want to dedicate the node completely to a vcluster, that makes sense. Yes I guess we could add a flag for this.
Maybe it would be more convenient to build this functionality into the enforce-node-selector flag? It effectively binds these nodes to the vcluster anyway. Otherwise it would be a little hard to set through a flag, because a toleration has four fields (key, operator, value, effect), so the flag would either have to be opinionated (i.e. always set operator="Equal", effect="NoSchedule") or might be hard to understand. I'm not sure, though.
@mixitgit there is an official notation for tolerations, like this: key1=value1:NoSchedule, so I don't think this will be a problem (see the Kubernetes docs). In general, I guess we shouldn't mix those with enforce-node-selector and should rather add a new flag to avoid confusion.
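As a rough illustration of how the compact notation covers all four toleration fields, the sketch below parses a string such as key1=value1:NoSchedule into a dictionary. The function name `parse_toleration` is hypothetical, and mapping a value-less `key:Effect` string to operator "Exists" is an assumption of this sketch, not something stated in this thread.

```python
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


def parse_toleration(text):
    """Parse 'key=value:Effect' (or 'key:Effect') into a toleration dict.

    Hypothetical helper for illustration only.
    """
    # The effect is whatever follows the last colon.
    head, sep, effect = text.rpartition(":")
    if not sep or not head:
        raise ValueError(f"expected key[=value]:effect, got {text!r}")
    if effect not in VALID_EFFECTS:
        raise ValueError(f"unknown effect {effect!r}")

    key, eq, value = head.partition("=")
    if not key:
        raise ValueError(f"missing key in {text!r}")

    if eq:
        # A value was given, so match the taint's value exactly.
        return {"key": key, "operator": "Equal", "value": value, "effect": effect}
    # Assumption: no value means "tolerate any value for this key".
    return {"key": key, "operator": "Exists", "effect": effect}


toleration = parse_toleration("key1=value1:NoSchedule")
```

With this shape the operator never has to be spelled out on the command line; it follows from whether a value is present.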
@FabianKramm ok, sure, with this notation it seems perfect
Closing as #347 was merged
Why?
Currently it is possible to set a node selector to ensure that all pods created in a vcluster are scheduled on some subset of nodes. But in scenarios with multiple vclusters, some of which use fake nodes and some of which should use dedicated nodes, this doesn't guarantee that pods from vclusters that use fake nodes won't be scheduled on the dedicated nodes.
How?
This could be solved by setting tolerations on all pods in vclusters that use dedicated nodes, and corresponding taints on those nodes. Combined with the node selector, this guarantees that pods are scheduled on a given node if and only if they belong to a certain vcluster. This could be done either by allowing tolerations to be set via a separate flag, or just via --enforce-node-selector, because with dedicated nodes there is no case in which the tolerations and the node selector should differ.
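The argument above can be checked with a small, deliberately simplified model of the scheduler's two filters (node selector match and taint toleration). All names here are hypothetical, and the model ignores affinity, resources, and everything else the real Kubernetes scheduler considers.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty or missing effect on the toleration matches any effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with an empty key matches every taint.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


def can_schedule(pod, node):
    """Simplified check: labels satisfy the selector and hard taints are tolerated."""
    labels = node.get("labels", {})
    selector_ok = all(
        labels.get(k) == v for k, v in pod.get("nodeSelector", {}).items()
    )
    taints_ok = all(
        any(tolerates(t, taint) for t in pod.get("tolerations", []))
        for taint in node.get("taints", [])
        if taint["effect"] in ("NoSchedule", "NoExecute")
    )
    return selector_ok and taints_ok


dedicated_node = {
    "labels": {"foo": "bar"},
    "taints": [{"key": "foo", "value": "bar", "effect": "NoSchedule"}],
}
labelled_only_node = {"labels": {"foo": "bar"}, "taints": []}
shared_node = {"labels": {}, "taints": []}

dedicated_pod = {
    "nodeSelector": {"foo": "bar"},
    "tolerations": [
        {"key": "foo", "operator": "Equal", "value": "bar", "effect": "NoSchedule"}
    ],
}
other_pod = {}  # pod from a vcluster without a selector or tolerations
```

In this model `other_pod` can still land on `labelled_only_node`, which is the gap described under "Why?", while the taint on `dedicated_node` keeps it out, and the node selector keeps `dedicated_pod` off `shared_node`.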