rancher / system-upgrade-controller

In your Kubernetes, upgrading your nodes
Apache License 2.0

Some way to prevent concurrent plans on the cluster #225

Open davidcassany opened 1 year ago

davidcassany commented 1 year ago

Is your feature request related to a problem? Please describe. We have not found a proper way to prevent two different upgrade plans from being executed concurrently on the same cluster, or even on the same node. In Elemental we found that nothing prevents multiple upgrade plans from running concurrently on the same node (https://github.com/rancher/elemental-operator/issues/364). This has the potential to corrupt the node OS.

Describe the solution you'd like Some sort of configuration to prevent plans from being executed simultaneously. Once a plan has started in a cluster, another one cannot be scheduled until the ongoing one has finished.
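To make the requested behavior concrete, here is a minimal in-memory sketch of a cluster-wide gate that admits one plan at a time. It is not part of system-upgrade-controller: the class `ClusterPlanGate`, its methods, and the plan names are all hypothetical, and a real implementation would need state shared through the Kubernetes API rather than a process-local lock.

```python
import threading


class ClusterPlanGate:
    """Hypothetical gate: at most one upgrade plan is active cluster-wide."""

    def __init__(self):
        self._lock = threading.Lock()  # protects _active
        self._active = None            # name of the running plan, or None

    def try_start(self, plan):
        """Return True if `plan` may run, False if another plan is ongoing."""
        with self._lock:
            if self._active is None:
                self._active = plan
                return True
            # The plan that already holds the gate may continue on more nodes.
            return self._active == plan

    def finish(self, plan):
        """Release the gate once `plan` has ended on the whole cluster."""
        with self._lock:
            if self._active == plan:
                self._active = None


gate = ClusterPlanGate()
first = gate.try_start("os-upgrade")        # hypothetical plan name
second = gate.try_start("k3s-upgrade")      # refused while os-upgrade runs
gate.finish("os-upgrade")
retry = gate.try_start("k3s-upgrade")       # admitted after the first ended
```

In this sketch the second plan is simply refused and has to retry; queueing refused plans in arrival order would be another reasonable design.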

Describe alternatives you've considered I considered node-level workarounds, such as setting taints in the prepare step and then using tolerations to prevent multiple plans from accessing the same node simultaneously. After thinking about it and digging further, I believe this is not the right approach, as I need something at cluster level. Preventing concurrent plans only per node can lead to scenarios where the plans are executed in a different order depending on the node. I think there should be some way to prevent that from happening.
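The ordering concern can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the node and plan names are made up, a node-level lock is modeled as "run plans in the order they reach that node", and a cluster-level lock is modeled as "one plan finishes everywhere before the next starts".

```python
def per_node_order(arrivals):
    """Node-level locking: each node serializes plans in its own arrival order."""
    return {node: list(plans) for node, plans in arrivals.items()}


def cluster_order(arrivals, global_order):
    """Cluster-level locking: every node applies plans in one shared order."""
    return {
        node: [plan for plan in global_order if plan in plans]
        for node, plans in arrivals.items()
    }


# Hypothetical scenario: two plans reach two nodes in opposite order.
arrivals = {
    "node-a": ["plan-1", "plan-2"],
    "node-b": ["plan-2", "plan-1"],
}

node_level = per_node_order(arrivals)
cluster_level = cluster_order(arrivals, ["plan-1", "plan-2"])

# With node-level locks the two nodes end up with different plan orders.
orders_differ = node_level["node-a"] != node_level["node-b"]
# With a cluster-level lock both nodes see the same order.
orders_match = cluster_level["node-a"] == cluster_level["node-b"]
```

Neither plan ever overlaps with another on a given node in the first case, yet the nodes still diverge, which is why a node-level workaround does not cover this use case.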

Additional context This issue feels like it could be related to #163, but since the use case is different I opened a new issue. I may have missed something, as I am still learning how system-upgrade-controller works.