Open benweint opened 11 months ago
Turns out I had misdiagnosed this!
nf really was delivering SIGINT to all direct children, but because it doesn't use process groups for each spawned child, if the child processes spawned their own children and didn't respond to SIGINT by exiting or forwarding to their children, then nf would just hang when one child exited.
In the repro case that I gave, the process tree looks like this after process a exits:
❯ pstree -s nf.js
... snip ...
\-+= 83622 ben node nf.js start
\-+- 83628 ben /bin/bash ./wait-for-sigint.sh
\--- 83632 ben sleep 1000000
The bash process (pid 83628) actually has received the SIGINT, but per the bash manual:
When Bash receives a signal for which a trap has been set while waiting for a command to complete, the trap will not be executed until the command completes.
So in this example:
1. Process a exited.
2. nf sent SIGINT to the direct child process for b (bash, pid=83628).
3. bash got the SIGINT, but was waiting to invoke the trap handler until the sleep command (pid=83632) exited.
4. The sleep command itself never received the SIGINT.
The way that goreman solves this is by creating a process group for each spawned child, and then delivering the SIGINT signals to the group, rather than the direct child.
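For anyone who wants to see the idea in Node terms, here is a minimal sketch of the process-group approach. It is not necessarily what the linked commit or goreman does internally. The helper names spawnInGroup and signalGroup are made up for illustration, and it assumes a POSIX system, where spawning with detached: true makes the child the leader of a new process group and a negative pid passed to process.kill targets that whole group.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: start a command as the leader of its own process group.
// On POSIX, `detached: true` puts the child in a new process group (and session),
// so anything it spawns later inherits that group by default.
function spawnInGroup(command, args) {
  return spawn(command, args, { detached: true, stdio: 'inherit' });
}

// Hypothetical helper: deliver a signal to every process in the child's group.
// A negative pid tells kill(2) to signal the process group with that id.
// Returns false if the group no longer exists.
function signalGroup(child, signal) {
  try {
    process.kill(-child.pid, signal);
    return true;
  } catch (err) {
    if (err.code === 'ESRCH') return false; // the whole group has already exited
    throw err;
  }
}
```

With this, something like signalGroup(child, 'SIGINT') would reach both the bash wrapper and the sleep it is waiting on. One trade-off to be aware of: a detached child no longer shares the parent's process group, so a Ctrl-C in the terminal is not delivered to it directly and the parent has to forward the signal itself.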
I've implemented support for using process groups in my fork (https://github.com/benweint/node-foreman/commit/5cb9ee5009772fce10eb1cafd9ffa00b7d780102) and can PR it if there's interest, but it looks like this project might be dead.
The README says:
nf does seem to detect the exit of a single child process, and claims to be sending a SIGINT to all children in response to it, but in fact will not deliver the SIGINT in all cases.

Here's a simple repro case:
Observations
If I modify wait-for-sigint.sh to emit a constant stream of output while it is waiting, then the test case works as expected:

Comparison to other implementations
foreman (Ruby)

goreman (Go)

goreman has different default behavior wrt a single child process exiting:

... but with -exit-on-error ('Exit goreman if a subprocess quits with a nonzero return code'):