coreos / container-linux-update-operator

A Kubernetes operator to manage updates of Container Linux by CoreOS
Apache License 2.0
209 stars 49 forks source link

update-agent: Added reboot-wait parameter #192

Open johannwagner opened 5 years ago

johannwagner commented 5 years ago

This adds an reboot-wait parameter, which waits, after the last pod was terminated, an fixed amount of time to finalize operations before reboot. This solves some problems this storage provisioners like rook.

/cc: @martin31821

coreosbot commented 5 years ago

Can one of the admins verify this patch?

lucab commented 5 years ago

@johannwagner thanks for the PR! However this lacks quite a bit of accompanying context, and on the surface it seems to just be trying to hide an existing race.

What is the problem this is trying to solve related to rook? How do you determine what is a good reboot-wait time (and is there some upper-bound guarantee on the value)? Why isn't pod draining sufficient here?

embik commented 5 years ago

Hi @lucab, not the author but hope I can provide some context as well.

This seems to be useful because of situations like #191 - beyond pod termination there are some downstream operations (e.g., unmounting a volume because it is going to be attached to another node - this can take a short period of time; I suppose that's why it's useful for rook) that are not fully grasped by CLUO at the moment.

Implementing the reboot-wait parameter would be a bit hacky, but it's a nice way around implementing checks for all possible downstream operations (which even might be proprietary).

johannwagner commented 5 years ago

Jeep, that’s the reason. We want to be able to shutdown our CEPH Cluster properly, which only works, if everything shutdowns properly after the last Pod was terminated. @embik assumes the correct context.

dghubble commented 5 years ago

If a drain isn't sufficient for your situation, CLUO supports before-reboot (and after-reboot) required annotations. A custom DaemonSet can run before/after a reboot to perform custom cleanup/setup actions (e.g. waiting a fixed period, implement Ceph shutdown, etc.) and then add the required annotation, which CLUO awaits.

https://github.com/coreos/container-linux-update-operator/blob/master/doc/before-after-reboot-checks.md

martin31821 commented 5 years ago

@dghubble We are already using custom wait parameters to cleanly unmount and reboot nodes, but at least the rbd driver does some internal network related tasks which we can't wait for at the moment.

cgwalters commented 5 years ago

I think this would probably be better done as a systemd unit with an ExecStop= that would run at shutdown time and call some API that waits for whatever Rook/Ceph needs to do.

martin31821 commented 5 years ago

If you follow these instructions https://github.com/rook/rook/blob/master/Documentation/container-linux.md on a CoreOS cluster, it will cause the nodes to hang during shutdown because the network is already brought down by systemd before the rbd module is able to complete its unmount and unlocking the objects/drive, that's why our initial implementation used a delay of approx. 30s to ensure this will be finished.

Basically the before-reboot hook runs before the update operator drains the node and maybe it's good to have the same hook mechanic being executed after-drain-before-reboot.

WDYT?

embik commented 5 years ago

@dghubble in that case the custom DaemonSet would be required to implement draining the node, because in the scenario described we need to drain -> wait -> reboot, not wait -> drain -> reboot. Which means it will reimplement logic already present in CLUO. The workflow would be drain (custom logic) -> wait (custom logic) -> already cordoned, no drain (CLUO) -> reboot (CLUO).

Honestly, if I need to write custom logic and deploy a custom DaemonSet and all that, I'd probably just fork CLUO and apply this patch because it solves the problem well enough. I could do that, no problem, I just think you should be aware this patch could be helpful for production users.