Closed nmisch closed 2 years ago
Does this allow us to turn windows CI back on?
Does this allow us to turn windows CI back on?
I don't expect changes there:
windows
workflow is on now. It's been flaky; this makes it stable.windows
workflow skips some tests using GITHUB_WINDOWS_TESTING
. Those tests would otherwise fail, not hang, so I think those bugs are separate.Windows_Installation
workflow will remain off.
perl/perl5#19963 explains #77. As seen in the perl/perl5#19963 test program, a workaround is to use dup2() to close the read end of the pipe, replacing it with a non-pipe FD. (Closing the non-pipe FD is trivial.) This PR implements and tests that. As I mention in PR-added code comments, this works in my tests but relies on undefined behavior of CloseHandle(). I considered some alternatives to avoid that:
Keep a hash of which FDs are sockets, to skip the dup2() for them. The potential socket sources are IPC::Run::Win32IO::_socket, filehandles the user provides, and IPC::Run's dup()/dup2() calls. Use getsockopt() to determine whether user filehandles are sockets. Assume FDs 0,1,2 are not sockets; while Perl assigns FDs to sockets, that relies on undefined CRT behavior.
Use getsockopt() to determine whether a to-be-closed FD is a socket. If so, skip the dup2(). This did not work. _close() typically receives a FD, but Perl getsockopt() needs a Perl filehandle. Getting a filehandle with
open($FH, "<&=$FD")
hangs in the same situations where #77 hangs. (While the previous alternative would be better, one could salvage this alternative via making the socket filehandle available to _close() or via an XS module that calls getsockopt(FD).)Use
CREATE_SUSPENDED
to spawn the user-defined kid, and resume the kid after closes. This suffices for pipes IPC::Run has opened, but it doesn't help for pipes acquired from user filehandles or from FD 0,1,2. Unlike the other options, anEMFILE
failure can't disrupt the close attempt.I think the best option is either this PR or alternative (1). This PR has the advantage of isolating the workaround code; (1) would change multiple areas of platform-independent IPC::Run code. (1) does not rely on undefined behavior. Overall, I lean toward this PR. It survived 450 windows GitHub Actions runs.
dup2() needs some other FD as its first argument, of course. This PR issues open('NUL') to get one. That's attractively predictable, except that it could fail with
EMFILE
. I considered using some already-available FD. In most cases, FD=2
works. If we arranged to know the write end of the pipe and close the read end first, that could usedup2(write_end, read_end)
.A brief benchmark of
t/run.t
confirmed this PR's small effect on performance. One series of twenty runs showed +2.6% median runtime, while another series showed -0.2%.For anyone experimenting with this, beware of heisenbugs when using
IPCRUNDEBUG
. The POSIX::close() calls in _map_fds can hang the same way. I will propose a fix for that separately.