coreos / container-linux-update-operator

A Kubernetes operator to manage updates of Container Linux by CoreOS
Apache License 2.0

reboot groups #17

Open mischief opened 8 years ago

mischief commented 8 years ago

this issue is for discussion on implementing reboot groups, similar to what locksmith has today.

currently my idea is to create a kubernetes TPR (ThirdPartyResource) that describes which nodes are in which groups, and how many nodes in each group to reboot at once.
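
A rough sketch of the data such a resource might carry and how it could gate reboots. This is only an illustration under assumptions: the `RebootGroup` type, its fields, and `PickReboots` are hypothetical names, not part of the operator or of any Kubernetes API.

```go
package main

import (
	"fmt"
	"sort"
)

// RebootGroup is a hypothetical shape for the proposed resource: a named
// set of nodes plus a cap on how many of them may reboot at the same time.
type RebootGroup struct {
	Name          string
	Nodes         []string
	MaxConcurrent int
}

// PickReboots returns the nodes in g that may start rebooting now, given
// which nodes want a reboot and which are already rebooting.
func PickReboots(g RebootGroup, wants, rebooting map[string]bool) []string {
	// Count group members that are already rebooting.
	inFlight := 0
	for _, n := range g.Nodes {
		if rebooting[n] {
			inFlight++
		}
	}

	// Collect group members that want a reboot and have not started one.
	var candidates []string
	for _, n := range g.Nodes {
		if wants[n] && !rebooting[n] {
			candidates = append(candidates, n)
		}
	}
	sort.Strings(candidates) // deterministic order for the example

	// Only hand out as many slots as the group's cap still allows.
	free := g.MaxConcurrent - inFlight
	if free <= 0 {
		return nil
	}
	if free > len(candidates) {
		free = len(candidates)
	}
	return candidates[:free]
}

func main() {
	g := RebootGroup{
		Name:          "workers",
		Nodes:         []string{"node-a", "node-b", "node-c"},
		MaxConcurrent: 2,
	}
	wants := map[string]bool{"node-a": true, "node-b": true, "node-c": true}
	rebooting := map[string]bool{"node-b": true}

	// One slot is taken by node-b, so only one more node is admitted.
	fmt.Println(PickReboots(g, wants, rebooting))
}
```

With `MaxConcurrent` set to 1 this reduces to one node at a time within the group.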

ghost commented 6 years ago

is it possible to simulate reboot groups by running several sets of agents and operators, each in a different namespace? e.g. group1 would have its agents and operator in namespace group1, and group2 would have its agents and operator in namespace group2.

sdemos commented 6 years ago

It seems like that would be possible, as long as you restrict the nodes that the agent can run on to only nodes in your "group", but I've never tried to do anything like that before. However, currently cluo only reboots one node at a time anyway, so the main reason that reboot groups exist in locksmith isn't actually relevant here unless we provide the ability to configure the number of machines rebooting at once.

ghost commented 6 years ago

yes, the agents would be restricted to particular nodes using a node selector, so two nodes in different groups could effectively reboot at the same time.
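
A small model of that behaviour, assuming each per-namespace operator still reboots one node at a time and knows nothing about the other groups. `NextPerGroup` and the node and group names are made up for illustration; this does not call any operator or Kubernetes code.

```go
package main

import (
	"fmt"
	"sort"
)

// NextPerGroup models one independent operator per group. pending maps a
// node name to the group it belongs to (for example, the value a node
// selector would match on). Each group admits at most one node, so the
// result maps group name to the single node that group reboots next.
func NextPerGroup(pending map[string]string) map[string]string {
	// Sort node names so the choice within a group is deterministic.
	nodes := make([]string, 0, len(pending))
	for n := range pending {
		nodes = append(nodes, n)
	}
	sort.Strings(nodes)

	next := make(map[string]string)
	for _, n := range nodes {
		g := pending[n]
		if _, taken := next[g]; !taken {
			next[g] = n // this group's single reboot slot is now used
		}
	}
	return next
}

func main() {
	pending := map[string]string{
		"node-1": "group1",
		"node-2": "group1",
		"node-3": "group2",
	}
	// group1 and group2 each pick a node, so two nodes reboot concurrently.
	fmt.Println(NextPerGroup(pending))
}
```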