Open danielhollas opened 1 year ago
Rather than put the elegant shutdown in Aiidalab-launch, it makes more sense to handle this by the image itself. When the container stop, a signal is send to the container and we can use that to stop the daemon. As show https://medium.com/@gchudnov/trapping-signals-in-docker-containers-7a57fdda7d86#.6b04xvnr8, it deserves a try to see if it is easy to add directly.
There is also https://github.com/just-containers/s6-overlay which designed to manage the services for containers and provide exit point for what we need here to elegantly shutdown with stoping the daemon.
I agree it would be better to handle this at the image level.
When the container stop, a signal is send to the container and we can use that to stop the daemon. As show https://medium.com/@gchudnov/trapping-signals-in-docker-containers-7a57fdda7d86#.6b04xvnr8, it deserves a try to see if it is easy to add directly.
Nice find! I think the first thing here would be to determine if the SIGTERM signal is sent to all processes in the container. If that is the case, we could add the SIGTERM handler to the daemon in aiida-core
.
When the container stop, a signal is send to the container and we can use that to stop the daemon. As show https://medium.com/@gchudnov/trapping-signals-in-docker-containers-7a57fdda7d86#.6b04xvnr8, it deserves a try to see if it is easy to add directly.
I would be against this unless we have a very strong reason. Changing the Init process is a non-trivial change and potentially openning a whole new can of worms.
I'll remove this from milestone, since the better way is to use the new aiida-core image I prepared which handle the gracefully shutdown of services, see https://github.com/aiidateam/aiida-core/pull/6080, using s6-overlay
as mentioned in this OP.
But switching to this new image would mean that we would no longer be based on the Jupyter images?
Not explicitly based on it but I actually borrowed > 90% of https://github.com/jupyter/docker-stacks/tree/main/docker-stacks-foundation include this fix-permissions
script. The different is I use s6-overlay to manage the services rather than start a jupyter server which is not needed in aiida-core image. When we use it, by using s6-overlay, it is very straightforward to make jupyter server the main longrun service.
Due to an change in behaviour of
aiida-core==2.3.0
we've discovered that we probably do not shutdown the verdi daemon gracefully and it leaves its PID file behind.We should try to modify the
aiidalab-launch stop
command to first executeverdi daemon stop
before it shutdowns the container.