bryanlatten commented 5 years ago

Hi @skarnet, trying to understand some strange shutdown behavior that is related to other topics (similar to https://github.com/just-containers/s6-overlay/issues/41, https://github.com/just-containers/s6-overlay/issues/141).

Goal:

On shutdown, intercept SIGTERM/SIGUP and convert to a more graceful signal, or in some cases, block or ignore signal until another action completes. SIGTERM is the default the container will receive during a docker stop, either by the orchestrator or through admin action. As developer running the container live would use CTRL+C to terminate a container running in foreground.

Test case:

Using trap and term / hup, catch signals before they reaches the associated program. Suppress by taking no action on them.

trap
{
  term { echo [outer] term }
  hup { echo [outer] hup }
}
program

Result:

We still see term and hup signals propagated into program, even though based on the trap documentation, it should no longer forward these signals.

Repro: To try to isolate the behavior, I created the following hack to "trap" a trap service:

Dockerfile

FROM busybox
ADD https://github.com/just-containers/s6-overlay/releases/download/v1.21.8.0/s6-overlay-amd64.tar.gz /tmp/
RUN gunzip -c /tmp/s6-overlay-amd64.tar.gz | tar -xf - -C /
ENTRYPOINT ["/init"]

RUN mkdir -p /etc/services.d/myapp
COPY run /etc/services.d/myapp/run

/etc/services.d/myapp/run

#!/usr/bin/execlineb -P

trap
{
  term { echo [outer] term }
  hup { echo [outer] hup }
}

trap
{
  term { echo [inner] term }
  hup { echo [inner] hup }
}

loopwhilex true

While running under Docker (with -it flags) and giving a CTRL+C (sighup):

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] done.
[services.d] starting services
[services.d] done.
^C[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
s6-svwait: fatal: timed out
[s6-finish] syncing disks.
[s6-finish] sending all processes the TERM signal.
[outer] term
[inner] term
[s6-finish] sending all processes the KILL signal and exiting.

Obviously the loop timed out, but you can see the double trap revealed our strange behavior -- the SIGHUP is being passed through to its final destination (both outer and inner logs are present).

Am I misunderstanding the behavior of trap?

skarnet commented 5 years ago

@bryanlatten Unless I'm mistaken and Docker does something special with ^C, ^C generates a SIGINT, not a SIGHUP...

bryanlatten commented 5 years ago

@skarnet would you expect the signals to pass through the first trap if they are explicitly "handled"? trap -x seems to not change any behavior also. Doing a docker stop (term) also has similar behavior.

skarnet commented 5 years ago

@bryanlatten Okay, if you send a SIGTERM and trap forwards it, then there's a problem indeed. I'm not a Docker user, so I can't exactly try and reproduce, but can you make a strace -f of your programs (one with just a trap, another with the two nested traps) and paste the output somewhere? It will help me figure out exactly what's happening.

And just to compare, does a shell script using the trap builtin work as you expect in the same circumstances?

bossjones commented 5 years ago

Hi @skarnet and @bryanlatten,

I tried to collect a strace for this situation, on CTL+C action from inside of a docker container. Just in case the details matter, I used a vagrant guest ( Ubuntu Bionic ).

Docker version

vagrant@devtop-bionic:~/dev/s6-signal$ docker --version
Docker version 18.09.5, build e8ff056

Dockerfile(Modified)

I used ubuntu for my dockerfile since i'm more familiar with that over busybox.

FROM ubuntu:bionic
RUN apt-get update; apt-get install -y curl bash git strace
ADD https://github.com/just-containers/s6-overlay/releases/download/v1.21.8.0/s6-overlay-amd64.tar.gz /tmp/
RUN gunzip -c /tmp/s6-overlay-amd64.tar.gz | tar -xf - -C /
ENTRYPOINT ["/init"]

RUN mkdir -p /etc/services.d/myapp
COPY run /etc/services.d/myapp/run

Build

docker build -t bossjones/s6-test:latest .

Run(Bash login prompt)

docker run -i -t --rm --privileged --cap-add=ALL --entrypoint "bash" -v $(pwd):/app:rw -w /app bossjones/s6-test:latest -l

Strace command

strace -s 8192 -f -o /app/strace.out /init

strace output is attached.

strace.txt

skarnet commented 5 years ago

Oh man, @bossjones, if you want a command that does nothing, please use s6-pause or a long sleep, not loopwhilex true ! The latter will busyloop, heating your CPU and spewing lots of garbage to the strace. Sorry, but that trace is unreadable to me.

bossjones commented 5 years ago

Sorry about that @skarnet ! Didn't know about s6-pause Hopefully this is much more readable ! Only 225 lines this time.

strace2.txt

skarnet commented 5 years ago

Thanks @bossjones. So, what's happening, @bryanlatten, is:

^C is sending a SIGINT to s6-svscan, which catches it and starts its shutdown procedure.
The SIGINT is sent to all the processes in s6-svscan's session, which means the s6-supervise processes also get the SIGINT and die instantly. That is working as intended: you're not supposed to have s6-svscan running in a controlling terminal and sending it ^C, and if you do, it is assumed that you want to kill the whole supervision tree.
The shutdown procedure calls s6-nuke -th, which sends a SIGHUP and a SIGTERM to all the processes at the same time. So, no matter how many traps you set, they will all get a SIGHUP and a SIGTERM, and print the message you set, and the end program under the traps will also get the signals.
For some reason, s6-svscan isn't running as process 1 (is it a proper container?), so the shutdown procedure isn't running as process 1 either, so it gets killed by the s6-nuke signals too!

Conclusion: trap is working as intended, but the way the container is run is out-of-spec. You can't interrupt s6-svscan with ^C without throwing a serious wrench into everything. And s6-svscan really needs to run as pid 1 in the container.

bryanlatten commented 5 years ago

@bossjones looks like you overrode the entrypoint during your strace to bash

bryanlatten commented 5 years ago

@skarnet really appreciate your help on all of this. To summarize what we know so far:

^C is translated to SIGINT by Docker
SIGINT received by s6-svcscan will drag down the whole supervision tree via s6-nuke -th which sends SIGHUP and SIGTERM to all processes, simultaneously. Amounts to a "known behavior" for SIGINT?
@bossjones and I need to correct his stacktrace, but we are still seeing a SIGHUP during a non-interactive, non-^C event, like docker stop.

skarnet commented 5 years ago

The whole supervision tree (the s6-svscan and s6-supervise processes) is killed by the SIGINT. That is a known behaviour when you run a supervision tree with a controlling terminal (typically from your command line). In a container, s6-svscan should not have a controlling terminal, it should not be interruptible by SIGINT. Also, when s6-svscan dies (because of the SIGINT), it doesn't really die (because it's supposed to run as process 1) but instead executes into a shutdown sequence. That shutdown sequence calls s6-nuke -th which kills all your processes, simultaneously - that is the intended behaviour on docker stop, but ^C triggers it too early.
I'd need to see a strace of what happens with a docker stop event instead of a ^C. I'm positive there's also an explanation for the behaviour you're seeing that doesn't involve trap not working. :-)

bossjones commented 5 years ago

Apologies for the delay!

Two more strace options.

First

Ran strace directly from docker, then ran docker stop. I have a feeling this strace is too verbose, but wanted to include options.

strace -s 8192 -f -o ./strace2.out docker run -d -i -t --rm --privileged --cap-add=ALL -v $(pwd):/app:rw -w /app bossjones/s6-test:latest

strace2.txt

Second

Started docker with docker run -d -i -t --rm --privileged --cap-add=ALL -v $(pwd):/app:rw -w /app bossjones/s6-test:latest

Entered the container with docker exec, then ran strace -p 1 -s 8192 -f -o /app/strace3.out.

strace3.txt

skarnet commented 5 years ago

Thanks @bossjones.

I have no use of the strace2.txt file, since it straces the calls in the process subtree created by docker stop, and it doesn't show at all what s6 is doing.
The strace3.txt file, however, confirms what I said: everything is working as expected. @bryanlatten: trap is only meant to protect your program against a SIGTERM sent to the pid of one supervised process, i.e. if you perform a s6-svc -t /var/run/s6/services/services/myapp, the trap command will divert it, and your "myapp" process will not get the SIGTERM. However, trap cannot protect "myapp" against the container shutdown procedure, which sends a SIGTERM and a SIGHUP, and then a SIGKILL, to all processes in the container at once. It's literally a kill -TERM -1. There is no diverting that SIGTERM; this is the one you're seeing your "inner trap" receive.

Normally, in the s6-overlay shutdown procedure, running services are stopped first via var/run/s6/etc/cont-finish.d scripts, likely calling s6-svc -d on those services. That way, when the final kill arrives, services like "myapp" have already had time to shut down properly (they have at least $S6_SERVICES_GRACETIME milliseconds to do so). Here, the grace time expires because the services are still running and apparently have not been instructed to exit. So, naturally, the nuke at the end catches them pants down.

Stop your services properly in cont-finish.d, so they're already down when the container is about to close and slaughters everything at once.

bryanlatten commented 5 years ago

@skarnet super helpful explanation - thank you! I've been trying to find a clearer explanation for the use of cont-finish.d (task finalization), but can't quite locate one - other than deep conversation in issues. Am I missing something?

Reference: https://github.com/just-containers/s6-overlay/issues/41

Back to the original question: can I intercept an administratively-generated SIGTERM (docker stop) to prep a supervised service for graceful shutdown? It does not appear that cont-finish.d runs before the distributed SIGTERM

skarnet commented 5 years ago

It should. The s6-nuke -th instruction, which sends the global SIGTERM (and also SIGHUP, because processes such as interactive shells may block or ignore SIGTERM) normally runs after the cont-finish.d scripts. It's been a long time since I've been directly involved with the s6-overlay structure itself (which is honestly quite old and would benefit from a serious makeover, but the current one works and should do the right thing); so at this point I prefer to defer to @glerchundi and @jprjr.

bryanlatten commented 5 years ago

@skarnet maybe that is the disconnect. I am clearly seeing cont-finish.d execute after the supervised tasks get their SIGTERM. In this example, "service trapped term" is clearly visible before "cont-finish.d" which are echo'd from their respective places.

[services.d] starting services
[services.d] done.
<><><> service trapped term <><><>
[cont-finish.d] executing container finish scripts...
[cont-finish.d] abc.sh: executing...
<><><> cont-finish.d <><><>
[cont-finish.d] abc.sh: exited 0.
[cont-finish.d] done.
[s6-finish] syncing disks.
[s6-finish] sending all processes the TERM signal.
<><><> service trapped hup <><><>
<><><> service trapped term <><><>
[s6-finish] sending all processes the KILL signal and exiting.

@glerchundi @jprjr any ideas on what I'm doing wrong?

skarnet commented 5 years ago

@bryanlatten When you print <><><> service trapped term <><><>, does it mean that the trap process (the "outer" one) is receiving a SIGTERM, or your application itself (or an "inner" trap), normally protected by a trap command, is still receiving a SIGTERM?

bryanlatten commented 5 years ago

@skarnet ah, that's the key. combining the two techniques (cont-finish.d + run-line trap) yields the desired behavior:

[services.d] done.
^C<><><> [outer] term <><><>
[cont-finish.d] executing container finish scripts...
[cont-finish.d] shutdown.sh: executing...
<><><> cont-finish.d <><><>
[cont-finish.d] shutdown.sh: exited 0.
[cont-finish.d] done.
[s6-finish] syncing disks.
[s6-finish] sending all processes the TERM signal.
<><><> [outer] term <><><>
<><><> [inner] term <><><>

Inner TERM is seem after the cont-finish.d line, meaning we were able to trap the initial TERM broadcast!

Again, thank you for all the help. On closing, can we add a line in the s6 docs around cont-finish.d usage?

skarnet commented 5 years ago

@bryanlatten cont-finish.d is a policy specific to s6-overlay, a way to provide user-configurable container initialization and shutdown. It has nothing to do with s6 itself, which only provides mechanism.

If anything, you could ask @glerchundi to clarify the s6-overlay documentation about cont-finish.d. :-)

just-containers / s6-overlay

Execlineb Trap: always forwards signals? #264

Docker version

Dockerfile(Modified)

Build

Run(Bash login prompt)

Strace command

First

Second