Open vipyoung opened 4 years ago
Hi, there is indeed an issue with the SIGHUP not being propagated inside the docker container. I've tried to trap the signal the way you drafted and it indeed does not work as expected. The issue, however, is not in the tileserver-gl code itself, but rather in the concept of the run.sh
script.
The SIGHUP signal works:
The Dockerfile for tileserver-gl uses the run.sh an it is a little more complicated because of the xvfb init and clean restarting of the process. (There needs to be xvfb cleanup, otherwise docker restart
does not work correctly.)
The issue is that the run.sh launches the node
as a subprocess, gets the PID and waits for the result before cleanup. It also sets up the traps (TERM/INT + your HUP) so that when the signal is received, it propagates to the process.
However, in the bash script, the trap interrupts the wait:
When Bash is waiting for an asynchronous command via the wait built-in, the reception of a signal for which a trap has been set will cause the wait built-in to return immediately with an exit status greater than 128, immediately after which the trap is executed.
So, when the HUP signal is sent, the wait
ends, the trap
fires, the signal is received by the node process, but shortly after that the whole docker container exits. (Because the run.sh script is done executing.)
One of the "solutions" is to double-wait the PID (see http://veithen.io/2014/11/16/sigterm-propagation.html), but that would not work correctly for SIGHUP (it would work only once). There might be an outline of a solution in this thread: https://stackoverflow.com/q/57120038 but I'm not a bash master and it seems to be getting quite complicated with process groups etc.
So I'm afraid that the SIGHUP does not currently work when running the full tileserver-gl in a docker, but PR definitely welcome if you have time to look more into this. (In the meantime I'll add a note to the docs so people know about the issue.)
Just fort the record, another ref maybe for a PR: https://unix.stackexchange.com/questions/146756/forward-sigterm-to-child-in-bash/444676#444676.
HTH, Matteo
If all you are looking for is the ability to send the signal from the command prompt you may use
docker exec tileserver-gl bash -c 'kill -HUP $(ps -x | sed -n "/\snode/s/ *\([0-9]\+\) .*/\1/p")'
If all you are looking for is the ability to send the signal from the command prompt you may use
docker exec tileserver-gl bash -c 'kill -HUP $(ps -x | sed -n "/\snode/s/ *\([0-9]\+\) .*/\1/p")'
it worked when i change the config.json and send the command.
If all you are looking for is the ability to send the signal from the command prompt you may use
docker exec tileserver-gl bash -c 'kill -HUP $(ps -x | sed -n "/\snode/s/ *\([0-9]\+\) .*/\1/p")'
Recentlty, the docker image no longer has the ps
command, and the above fails.
Now, you need to use a different command:
docker exec tileserver-gl bash -c 'kill -HUP $(ls -l /proc/*/exe | sed -n "/\/node$/s/.*proc\/\([0-9]\+\)\/exe .*/\1/p")'
I'm trying to use SIGHUP to reload the configuration file without success. Looking at code (run.sh), it seems to me that SIGHUP is not forwarded to the server. Thus I've added the followings:
_reload() { echo "Caught signal, reloading gracefully" kill -HUP "$child" 2>/dev/null } trap _reload SIGHUP
Now, runningdocker kill --signal=SIGHUP ${containerID}
shows the following logs before the server quits without any further messages.I tried to debug the code (although I'm completely new to node/js) and it seems to me that the problem comes from serve_style.js when the styleFile is validated:
let validationErrors = validate(styleFileData);
I have printed the value of
styleFile
when I first launch the docker instance, then when I send SIGHUP signal, and the value is exactly the same/data/styles/klokantech-basic-gl-style-gh-pages/style-local.json
which makes me think the problem is related to reading a file that is already open? Or may be something to do withlet
?I can prepare a PR for this issue, but I'd need some help with this particular issue.