Our entrypoint script doesn't handle SIGTERM properly. When we run docker stop, the script receives a SIGTERM and disregards it, so Docker waits out the timeout (10 seconds by default) and then sends SIGKILL.
Workaround: we can run docker stop -t 0, which sends SIGKILL immediately. That's fine for now since we don't have any cleanup, but we should probably do this correctly...
Most people seem to solve this problem by (1) using the exec form when starting their service (we already do that) and (2) if the entrypoint is a shell script, using exec to start the service from inside it. This, however, only works if you are starting a single service. We have several.
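To make point (2) concrete, here is a small sketch of what exec changes. It is not our entrypoint: `sh -c` stands in for both the wrapper script and the service. It only shows that with exec the service takes over the wrapper's PID (PID 1 in a container), which is why the SIGTERM from docker stop reaches it directly.

```shell
#!/bin/sh
# `sh -c` stands in for both the entrypoint wrapper and the service.

# Without exec: the wrapper forks a child, so the "service" gets a new PID
# and the wrapper stays alive in front of it. The trailing `:` keeps the
# shell from exec-ing its last command as an optimization.
without_exec=$(sh -c 'echo "wrapper=$$"; sh -c "echo service=\$\$"; :')

# With exec: the wrapper is replaced by the "service", which keeps the PID.
with_exec=$(sh -c 'echo "wrapper=$$"; exec sh -c "echo service=\$\$"')

printf 'without exec:\n%s\nwith exec:\n%s\n' "$without_exec" "$with_exec"
```

In a real single-service entrypoint the last line would simply be exec followed by the service command. Since exec replaces the shell, nothing after it runs, so it cannot start a second service.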
I suspect this means we need to actually trap the signal in our shell script.
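A rough sketch of what trapping could look like, assuming the entrypoint is a POSIX shell script that starts each service in the background. The two `sleep` commands are placeholders for our real services, and the line that sends SIGTERM to the script itself only simulates docker stop for the demo. None of this is tested against our actual setup.

```shell
#!/bin/sh
# Sketch: start several services, forward SIGTERM to all of them.
# The `sleep` commands are stand-ins for the real services.

sleep 60 &
pid_a=$!
sleep 60 &
pid_b=$!

forwarded=0
forward_term() {
    forwarded=1
    echo "caught SIGTERM, forwarding to services"
    kill -TERM "$pid_a" "$pid_b" 2>/dev/null
}
trap forward_term TERM

# Demo only: simulate `docker stop` by signalling this script after 1 second.
# Remove this line in a real entrypoint.
( sleep 1; kill -TERM $$ ) &

# A trapped signal makes `wait` return early (status > 128), after which
# the handler runs. The services are still shutting down at that point.
wait "$pid_a" "$pid_b" || true

# Wait again to reap each service and record how it exited.
status_a=0; wait "$pid_a" || status_a=$?
status_b=0; wait "$pid_b" || status_b=$?
echo "services stopped with statuses $status_a and $status_b"
```

The second round of `wait` matters: if the script exits right after the handler runs, PID 1 goes away and the container is torn down before the services finish shutting down. SIGKILL cannot be trapped, so this only helps within the docker stop timeout.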
Lots of info about this:
The original discussion on github: https://github.com/docker/docker/pull/3240
Docker and PID 1: http://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
Exec v. shell form (we are using exec form): https://docs.docker.com/reference/builder/#entrypoint
Other projects that ran into this, and their solutions:
https://github.com/docker/docker/issues/2436
https://gist.github.com/zba/27d9e54e7293c1bb2da4
http://veithen.github.io/2014/11/16/sigterm-propagation.html (<-----we may need this one)
https://github.com/docker/docker/issues/3766