--die-with-parent does not kill spawned processes unless --unshare-pid is set

kindrowboat commented 2 years ago

Hello, I've been playing around with bubblewrap, and something tripped me up: when I ran a program in bwrap with --die-with-parent that spawned one or more sub-processes, those sub-processes weren't getting killed. However, when I added --unshare-pid, the child processes were properly killed. Note that both --die-with-parent and --unshare-pid had to be specified for bwrap to get the kernel to properly kill off spawned child processes.

You can see the tests I was running at https://git.kindrobot.ca/kindrobot/bwrap-dwp-test.

Is this the intended behaviour? If so, should something be added to the documentation under --die-with-parent?

rusty-snake commented 2 years ago

> Is this the intended behaviour?

I think so, because that's how PR_SET_PDEATHSIG and pid namespaces work.

> If so, should something be added to the documentation under --die-with-parent?

The documentation is a bit confusing and can be improved IMHO.

kindrowboat commented 2 years ago

I see, thank you! I'd be happy to make a PR to update the docs.

Does it make sense to either error out or show a warning when --die-with-parent is specified and --unshare-pid is not? I'd lean towards erroring out, but I am just a beginner with Linux namespaces, so I'm not sure if there's another way to set it up so that --die-with-parent would work.

smcv commented 2 years ago

bubblewrap is more mechanism than policy, so in general its approach to unexpected interactions between different kernel features tends to be "you asked for it, you got it" rather than second-guessing whether you intended to have a particular behaviour.

--die-with-parent means something relatively precise and well-defined: it uses PR_SET_PDEATHSIG to arrange for the kernel to send SIGKILL to bubblewrap's immediate child process (the COMMAND you specified on the bwrap command-line), and also to processes that are part of bubblewrap itself, when bubblewrap's parent process exits. The exact implications of that are not necessarily obvious, but that's more like a fact about the kernel rather than a fact about bubblewrap: bubblewrap is really just giving you access to a Linux kernel feature here.

In general, when not using --unshare-pid, as far as I'm aware there is no way to set up a similar parent-death signal for child processes of your COMMAND, because the parent-death signal is cleared for the child of a fork() (see prctl(2)), and therefore it doesn't inherit any further than bwrap's immediate child (the COMMAND) unless the COMMAND takes its own steps to propagate that setting. This project has a policy of not doing impossible things :-)

If your COMMAND wants its immediate child processes to die when the COMMAND dies, it can do the same thing bubblewrap does: fork, then PR_SET_PDEATHSIG, then exec. It doesn't even have to use SIGKILL; it could use a graceful termination signal like SIGTERM. Or, if you prefer, the child processes could do a PR_SET_PDEATHSIG operation on startup; or even both. Either way, if the child processes want to disable the parent-death signal and continue to run, then they can use PR_SET_PDEATHSIG themselves to set the parent-death signal to 0, which means you can't rely on this as a security mechanism.
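As a rough illustration of the fork, PR_SET_PDEATHSIG, exec sequence for a COMMAND written in Python, the sketch below uses `subprocess.Popen` with a `preexec_fn`, which runs in the child after the fork and before the exec. It assumes Linux with glibc and a child executable that is not setuid/setgid and has no file capabilities (the setting is otherwise preserved across execve). The function names are hypothetical, and it does not handle the race where the parent dies before the child reaches prctl().

```python
import ctypes
import os
import signal
import subprocess

PR_SET_PDEATHSIG = 1  # from <linux/prctl.h>

_libc = ctypes.CDLL(None, use_errno=True)


def set_parent_death_signal(sig=signal.SIGTERM):
    """Ask the kernel to send `sig` to this process when its parent dies.

    Passing 0 clears the setting, which is how a child can opt out again.
    """
    if _libc.prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(int(sig)), 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def spawn_with_pdeathsig(argv, sig=signal.SIGTERM, **popen_kwargs):
    """Start argv so that it receives `sig` when the spawning process dies."""
    return subprocess.Popen(
        argv,
        # Runs in the forked child, just before exec.
        preexec_fn=lambda: set_parent_death_signal(sig),
        **popen_kwargs,
    )


if __name__ == "__main__":
    child = spawn_with_pdeathsig(["sleep", "1000"])
    print("spawned", child.pid)
    child.terminate()
    child.wait()
```

SIGTERM is used here only to show that the signal is a free choice. Since the spawned program can call `set_parent_death_signal(0)` itself, this is a convenience and not a containment mechanism.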

If you use --unshare-pid, then --die-with-parent results in process 1 of the pid namespace being killed: depending on whether you used --as-pid-1, process 1 could be part of bubblewrap, or it could be the COMMAND, but either way, --die-with-parent will kill it. As documented in pid_namespaces(7), killing process 1 of a pid namespace results in all other processes in the namespace receiving SIGKILL, as you've observed.

If you want the kernel to clean up your namespaced processes automatically when bubblewrap's parent process dies, without giving processes inside your namespace the opportunity to escape from that cleanup, then combining --unshare-pid with --die-with-parent is probably the way to do it.
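For a wrapper script that launches bwrap, the combination could be assembled roughly as follows. This is only a sketch: `bwrap_argv` is a hypothetical helper, and the filesystem options (`--ro-bind / /`, `--dev /dev`, `--proc /proc`) are an assumed minimal setup, not something this issue prescribes.

```python
import shlex


def bwrap_argv(command, as_pid_1=False):
    """Build a bwrap command line that combines --unshare-pid with
    --die-with-parent, so that killing process 1 of the new pid namespace
    takes every other process in that namespace with it."""
    argv = [
        "bwrap",
        # Assumed filesystem setup; adjust for the real sandbox.
        "--ro-bind", "/", "/",
        "--dev", "/dev",
        "--proc", "/proc",
        "--unshare-pid",
        "--die-with-parent",
    ]
    if as_pid_1:
        # Make COMMAND itself process 1 instead of a bubblewrap process.
        argv.append("--as-pid-1")
    return argv + list(command)


if __name__ == "__main__":
    print(shlex.join(bwrap_argv(["sleep", "1000"])))
    # bwrap --ro-bind / / --dev /dev --proc /proc --unshare-pid --die-with-parent sleep 1000
```

The resulting list can be passed to `subprocess.Popen` as is.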

Again, bubblewrap is mechanism rather than policy; how best to use that mechanism is a question for some larger framework. Flatpak is a typical example of a larger framework that imposes more policy, which it implements by running bwrap.

containers / bubblewrap

--die-with-parent does not kill spawned processes unless --unshare-pid is set #529