cpan-authors / IPC-Run

https://metacpan.org/pod/IPC::Run
Other
21 stars 38 forks source link

On Windows, avoid hang when closing read end of pipe. #156

Closed nmisch closed 2 years ago

nmisch commented 2 years ago

perl/perl5#19963 explains #77. As seen in the perl/perl5#19963 test program, a workaround is to use dup2() to close the read end of the pipe, replacing it with a non-pipe FD. (Closing the non-pipe FD is trivial.) This PR implements and tests that. As I mention in PR-added code comments, this works in my tests but relies on undefined behavior of CloseHandle(). I considered some alternatives to avoid that:

  1. Keep a hash of which FDs are sockets, to skip the dup2() for them. The potential socket sources are IPC::Run::Win32IO::_socket, filehandles the user provides, and IPC::Run's dup()/dup2() calls. Use getsockopt() to determine whether user filehandles are sockets. Assume FDs 0,1,2 are not sockets; while Perl assigns FDs to sockets, that relies on undefined CRT behavior.

  2. Use getsockopt() to determine whether a to-be-closed FD is a socket. If so, skip the dup2(). This did not work. _close() typically receives a FD, but Perl getsockopt() needs a Perl filehandle. Getting a filehandle with open($FH, "<&=$FD") hangs in the same situations where #77 hangs. (While the previous alternative would be better, one could salvage this alternative via making the socket filehandle available to _close() or via an XS module that calls getsockopt(FD).)

  3. Use CREATE_SUSPENDED to spawn the user-defined kid, and resume the kid after closes. This suffices for pipes IPC::Run has opened, but it doesn't help for pipes acquired from user filehandles or from FD 0,1,2. Unlike the other options, an EMFILE failure can't disrupt the close attempt.

I think the best option is either this PR or alternative (1). This PR has the advantage of isolating the workaround code; (1) would change multiple areas of platform-independent IPC::Run code. (1) does not rely on undefined behavior. Overall, I lean toward this PR. It survived 450 windows GitHub Actions runs.

dup2() needs some other FD as its first argument, of course. This PR issues open('NUL') to get one. That's attractively predictable, except that it could fail with EMFILE. I considered using some already-available FD. In most cases, FD=2 works. If we arranged to know the write end of the pipe and close the read end first, that could use dup2(write_end, read_end).

A brief benchmark of t/run.t confirmed this PR's small effect on performance. One series of twenty runs showed +2.6% median runtime, while another series showed -0.2%.

For anyone experimenting with this, beware of heisenbugs when using IPCRUNDEBUG. The POSIX::close() calls in _map_fds can hang the same way. I will propose a fix for that separately.

toddr commented 2 years ago

Does this allow us to turn windows CI back on?

nmisch commented 2 years ago

Does this allow us to turn windows CI back on?

I don't expect changes there: