kubernetes-sigs / descheduler

Descheduler for Kubernetes
https://sigs.k8s.io/descheduler
Apache License 2.0
4.23k stars 645 forks source link

Add option to apply RemovePodsViolatingNodeTaints only for explicitly included taints #1353

Closed etoster closed 3 months ago

etoster commented 4 months ago

Is your feature request related to a problem? Please describe.

We want to control the taints for which the RemovePodsViolatingNodeTaints strategy gets applied.

While the excludedTaints option offers some control, it only allows configuration of well-known and expected cases. It doesn't work for unforeseen situations where unpredictable taints may appear.

For example during an incident, an engineer might want to manually cordon the node while keeping all the workloads on it in order to not cause further disruption. This engineer might be missing the institutional knowledge of which taints are excluded to achieve this and thus might unknowingly cause the pods to get evicted by the RemovePodsViolatingNodeTaints strategy.

On the other hand it's impossible for the infrastructure owners to anticipate all possible taints which they would need to exclude.

Describe the solution you'd like

We would like to have the option to apply the RemovePodsViolatingNodeTaints strategy only for an explicitly allowlisted set of taints.

If and only if a taint is in a list of includedTaints will the pods on the node be considered candidates for descheduling.

If includedTaints is empty, all taints will be included by default, compatible with the current behavior.

excludedTaints still applies to all taints that are includedTaints. Meaning a taint that is in both will effectively be excluded.

Describe alternatives you've considered

  1. One alternative would be to limit the scope of evictions through node labels. But this would require setting/removing node labels together with taints. This is not supported by most tools designed for node lifecycle management (like node-problem-detector)

  2. Limit the applicable taints through policy enforcement with e.g. Kyverno. But this would effectively only implement the proposed solution with another tool, which makes maintaining the whole setup a lot more difficult.

What version of descheduler are you using?

descheduler version: v0.29.0

Additional context

damemi commented 4 months ago

This seems reasonable, and yeah I think it could be fine to allow both to be set. But it might be simpler to only allow setting either included or excluded, like we do with namespaces.

etoster commented 4 months ago

I agree and I cannot see a use case where one would need to set both included and excluded. Only in my mind it seemed maybe easier to implement the included list next to the excluded one, instead of a strict "either or". I'll check out how you do it with namespaces though!