Reduce delays in detecting child exit

In brief, this fixes most instances of #166 through the combination of [SIGCHLD=sub{}], [low-select-timeout-w32], and [waitpid-blocking]

Issue #166 reported delays of up to 0.5s in detecting child exit. When IPC::Run is interacting with the child via a file descriptor (FD), the delay likely doesn't happen. Child exit will make FDs ready, which select() will notice. With zero such FDs, _select_loop() alternates between waitpid($pid, WNOHANG) and timeout-only select(undef, undef, undef, $timeout). That limits detection latency to $timeout. A number of tactics can help, and we can combine tactics to reduce delays in more cases. I'm choosing each tactic in bold:

Tactic	Race-free	Helps if SIGCHLD blocked	Helps Windows	Helps if timeout active	CPU cost	Complexity
SIGCHLD=sub{}	:x:	:x:	:x:	:white_check_mark:	+	+
self-pipe	:white_check_mark:	:x:	:x:	:white_check_mark:	+	+++
low-select-timeout-w32	:white_check_mark:	:white_check_mark:	:white_check_mark:	:white_check_mark:	+++	+
waitpid-blocking	:white_check_mark:	:white_check_mark:	:white_check_mark:	:x:	-	++
Windows-wait	:white_check_mark:	:white_check_mark:	:white_check_mark:	:white_check_mark:	-	++++
helper-pid	:white_check_mark:	:white_check_mark:	:white_check_mark:	:white_check_mark:	++++	++++

[SIGCHLD=sub{}] makes Perl terminate select() in response to that signal.

[self-pipe] uses https://cr.yp.to/docs/selfpipe.html to fix a race condition in [SIGCHLD=sub{}], when SIGCHLD arrives just before select() begins.

[low-select-timeout-w32] caps the select() timeout at something less than today's 0.5s, in Windows zero-FD cases. This gains responsiveness, but it loses CPU efficiency.

[waitpid-blocking] uses waitpid($pid, 0) when we have no timeouts or FDs to interact with a child. This works with SIGCHLD unavailable (Windows) or blocked, but not working with timeouts is a major limitation.

[helper-pid] uses a separate process or thread of the current process to handle one of the tasks. For example, have the main process select(), and have a second process write to a pipe when it detects a child exit. The CPU overhead makes this a loss outside academic scenarios. It could win if SIGCHLD is blocked, but an application blocking SIGCHLD announces its disinterest in fast child exit detection. It could win on Windows with sufficiently-long child runtimes and timeouts, making the continuous cost of select() wake-ups exceed the fixed cost of creating the helper.

[Windows-wait] uses a C-language module to issue Windows API calls OpenProcess(), WSAEventSelect(), and WaitForMultipleObjects(). Together, they enable one thread to wait for child exit, FD readiness, and timeout. This may allow removing the existing Win32Helper.pm process, reducing today's overhead on Windows; search the source tree for WaitForMultipleObjects(). No CPAN module covers all those functions. This might pay off if Win32::API suffices or if bypassing Win32Helper.pm gains considerable efficiency. Otherwise, it might pay off on a long time horizon, if we can get its C code in the default installation of Windows Perl.

Are there other tactics that might outperform these? Are there other key considerations for evaluating the tactics listed?

I feel the combination of [SIGCHLD=sub{}], [low-select-timeout-w32], and [waitpid-blocking] has only minor drawbacks, hence that stopping point. The delay remains when SIGCHLD is blocked or arrives just before select(). Windows incurs CPU overhead via shorter select() timeouts, only when there's a harness timeout and zero FDs. The most-plausible alternative was to replace [SIGCHLD=sub{}] with [self-pipe], accepting a bit of CPU overhead and developer-facing complexity to remove the "just before select()" race condition. [Windows-wait] is a more-speculative alternative.

Since tests can't assume much about the passage of real time, I'm not adding test cases. Existing tests caught bugs in earlier versions of the waitpid-related commits. You can see the responsiveness of several cases via the output of temporary tests I had in t/eintr.t here: https://github.com/nmisch/IPC-Run/actions/runs/8219780209/job/22478157192.

cpan-authors / IPC-Run

Reduce delays in detecting child exit #172