aiidalab / aiidalab-launch

Tool to launch AiiDAlab on a local workstation.
MIT License
5 stars 3 forks source link

Shutdown verdi daemon gracefully upon container shutdown #180

Open danielhollas opened 1 year ago

danielhollas commented 1 year ago

Due to an change in behaviour of aiida-core==2.3.0 we've discovered that we probably do not shutdown the verdi daemon gracefully and it leaves its PID file behind.

We should try to modify the aiidalab-launch stop command to first execute verdi daemon stop before it shutdowns the container.

unkcpz commented 1 year ago

Rather than put the elegant shutdown in Aiidalab-launch, it makes more sense to handle this by the image itself. When the container stop, a signal is send to the container and we can use that to stop the daemon. As show https://medium.com/@gchudnov/trapping-signals-in-docker-containers-7a57fdda7d86#.6b04xvnr8, it deserves a try to see if it is easy to add directly.

There is also https://github.com/just-containers/s6-overlay which designed to manage the services for containers and provide exit point for what we need here to elegantly shutdown with stoping the daemon.

danielhollas commented 1 year ago

I agree it would be better to handle this at the image level.

When the container stop, a signal is send to the container and we can use that to stop the daemon. As show https://medium.com/@gchudnov/trapping-signals-in-docker-containers-7a57fdda7d86#.6b04xvnr8, it deserves a try to see if it is easy to add directly.

Nice find! I think the first thing here would be to determine if the SIGTERM signal is sent to all processes in the container. If that is the case, we could add the SIGTERM handler to the daemon in aiida-core.

When the container stop, a signal is send to the container and we can use that to stop the daemon. As show https://medium.com/@gchudnov/trapping-signals-in-docker-containers-7a57fdda7d86#.6b04xvnr8, it deserves a try to see if it is easy to add directly.

I would be against this unless we have a very strong reason. Changing the Init process is a non-trivial change and potentially openning a whole new can of worms.

unkcpz commented 1 year ago

I'll remove this from milestone, since the better way is to use the new aiida-core image I prepared which handle the gracefully shutdown of services, see https://github.com/aiidateam/aiida-core/pull/6080, using s6-overlay as mentioned in this OP.

danielhollas commented 1 year ago

But switching to this new image would mean that we would no longer be based on the Jupyter images?

unkcpz commented 1 year ago

Not explicitly based on it but I actually borrowed > 90% of https://github.com/jupyter/docker-stacks/tree/main/docker-stacks-foundation include this fix-permissions script. The different is I use s6-overlay to manage the services rather than start a jupyter server which is not needed in aiida-core image. When we use it, by using s6-overlay, it is very straightforward to make jupyter server the main longrun service.