nicolas-van / multirun

A minimalist init process designed for Docker
https://nicolas-van.github.io/multirun/
MIT License

CTRL + C or docker stop <container-name> does not kill all the processes #14

Closed guoard closed 2 years ago

guoard commented 2 years ago

I run two processes in a container (Gunicorn and Nginx). When I kill one of the processes in the container, the others stop automatically. Logs:

time="2022-02-09T13:54:52+03:30" level=info msg="received terminated, shutting down"
time="2022-02-09T13:54:52+03:30" level=info msg="waiting for jobs to finish"
time="2022-02-09T13:54:52+03:30" level=info msg=exiting
[2022-02-09 13:54:52 +0330] [18] [INFO] Handling signal: term
[2022-02-09 10:24:52 +0000] [31] [INFO] Worker exiting (pid: 31)
[2022-02-09 10:24:52 +0000] [30] [INFO] Worker exiting (pid: 30)
[2022-02-09 10:24:52 +0000] [32] [INFO] Worker exiting (pid: 32)
[2022-02-09 13:54:52 +0330] [18] [INFO] Shutting down: Master
2022-02-09 13:54:52,981 WARN received SIGTERM indicating exit request
2022-02-09 13:54:52,981 WARN received SIGTERM indicating exit request
2022-02-09 13:54:52,981 INFO waiting for celery-worker_00 to die
2022-02-09 13:54:52,981 INFO waiting for celery-worker_00 to die

worker: Warm shutdown (MainProcess)

But when I try to stop the container with CTRL + C, nothing happens. And when I try to stop the container with the docker stop <container-name> command, it takes about 10 seconds before the container stops, and there is no log about a signal being sent to the subprocesses.

nicolas-van commented 2 years ago

The behavior of multirun should be as follows:

If you have problems with it and want to know how it behaves, the first thing to do is launch it in verbose mode (just add the -v flag). It will then start explaining what's going on.
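For example, in a Dockerfile that calls multirun directly as its entrypoint, the flag can be added like this (the supervised commands below are placeholders; check multirun's README for the exact argument handling):

```dockerfile
# Hypothetical excerpt: enable multirun's verbose logging with -v.
# Each quoted argument is one command for multirun to supervise.
ENTRYPOINT ["multirun", "-v", "nginx -g 'daemon off;'", "gunicorn app.wsgi"]
```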

As far as I can see from your explanation, I can make the following guesses:

guoard commented 2 years ago

@nicolas-van thanks for your time and explanation


I tested this situation with the -v flag.

When running the container with the docker run -ti <image_name> command:

nicolas-van commented 2 years ago

Hmmm...

Could you show your Dockerfile? If the entrypoint is a script instead of a direct call to multirun, please also post it.

guoard commented 2 years ago

A Django project with Celery, Supercronic, Nginx, and Gunicorn: repository

If you want to run it locally, you just need to change the CELERY_BROKER_URL variable in the config/settings.py file.

nicolas-van commented 2 years ago

Can you replace the last line of multirun.sh with this line and test again?

call multirun -v "${commands[@]}"

guoard commented 2 years ago

Got this error:

/usr/src/app/multirun.sh: line 30: call: command not found

nicolas-van commented 2 years ago

Yeah, my mistake. This one should do:

exec multirun -v "${commands[@]}"

guoard commented 2 years ago

@nicolas-van thank you, it works now.

I suggest adding this to the readme file and explaining why we need exec.

nicolas-van commented 2 years ago

Well, it's not that simple because it's not specific to multirun.

You see, whenever you use Docker, the process that receives signals on docker stop is always the first one that was launched, meaning the one in your entrypoint (it is considered PID 1 inside your container).

Now in your case the entrypoint is a bash script, which means your PID 1 process is actually bash. And bash, like any other shell, does what it wants with signals, which may not be what you desire (namely, forwarding them to multirun).
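This non-forwarding behavior can be sketched without Docker. In the sketch below, a backgrounded wrapper script plays the role of the PID 1 entrypoint; paths and timings are arbitrary:

```shell
#!/bin/sh
# A wrapper that runs its child WITHOUT exec, mimicking an entrypoint
# script whose last line is a plain command.
cat > /tmp/no_exec_wrapper.sh <<'EOF'
#!/bin/sh
sleep 31
EOF
chmod +x /tmp/no_exec_wrapper.sh

/tmp/no_exec_wrapper.sh &   # this wrapper shell stands in for PID 1
wrapper=$!
sleep 1                     # give it time to start its child

kill -TERM "$wrapper"       # what `docker stop` does: SIGTERM to PID 1
wait "$wrapper" 2>/dev/null # the wrapper shell dies almost immediately...

# ...but its child was never signaled and keeps running.
if pgrep -f 'sleep 31' >/dev/null; then child_alive=yes; else child_alive=no; fi
echo "child still alive after wrapper died: $child_alive"
pkill -f 'sleep 31'         # clean up the orphaned child
```

Had the wrapper's last line been `exec sleep 31`, the SIGTERM would have reached the long-running process directly.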

To solve this problem you must get rid of the bash process once it has finished its work and replace it with multirun (or whatever other process you want to run after your script), while keeping the same PID. This is done with the shell's exec builtin.

What's annoying is that some shell implementations call exec implicitly on the last command and some don't. That's why it's good practice to state explicitly that you want to replace your shell process by using exec on the last line. This advice applies to almost all Docker images that use a script as their entrypoint, including those that don't use a process manager at all.
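The PID-preserving effect of exec is easy to observe outside Docker; a minimal sketch using plain sh:

```shell
# Without exec: the inner shell is a child of the outer one, so the two
# printed PIDs differ.
no_exec=$(sh -c 'echo $$; sh -c '\''echo $$'\''')

# With exec: the inner shell replaces the outer one and inherits its PID,
# so the same PID is printed twice -- this is what the last line of an
# entrypoint script should achieve.
with_exec=$(sh -c 'echo $$; exec sh -c '\''echo $$'\''')

echo "without exec: $(echo $no_exec)"   # two different PIDs
echo "with exec:    $(echo $with_exec)" # the same PID twice
```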

nicolas-van commented 2 years ago

On second thought, I added it to the readme, as this may be a cause of frustration even if it's not really related to multirun.

guoard commented 2 years ago

I think it's a good guide; I also saw exec used in the s6-overlay repository.