Closed jeremyd2019 closed 6 days ago
This sadly results in lots of errors being printed (w11 26100):
user@desktop MSYS ~
$ pacman -Su
0 [waitproc] pacman 1159 proc_waiter: error on read of child wait pipe 0x58C, Win32 error 995
0 [waitproc] pacman 1167 proc_waiter: error on read of child wait pipe 0x590, Win32 error 995
0 [waitproc] pacman 1169 proc_waiter: error on read of child wait pipe 0x5C0, Win32 error 995
0 [waitproc] pacman 1171 proc_waiter: error on read of child wait pipe 0x5E4, Win32 error 995
0 [waitproc] pacman 1173 proc_waiter: error on read of child wait pipe 0x5F0, Win32 error 995
0 [waitproc] pacman 1175 proc_waiter: error on read of child wait pipe 0x5E4, Win32 error 995
0 [waitproc] pacman 1177 proc_waiter: error on read of child wait pipe 0x5F4, Win32 error 995
0 [waitproc] pacman 1179 proc_waiter: error on read of child wait pipe 0x5EC, Win32 error 995
0 [waitproc] pacman 1181 proc_waiter: error on read of child wait pipe 0x5F4, Win32 error 995
0 [waitproc] pacman 1183 proc_waiter: error on read of child wait pipe 0xB0, Win32 error 995
0 [waitproc] pacman 1187 proc_waiter: error on read of child wait pipe 0x5EC, Win32 error 995
0 [waitproc] pacman 1189 proc_waiter: error on read of child wait pipe 0x5E4, Win32 error 995
0 [waitproc] pacman 1193 proc_waiter: error on read of child wait pipe 0x5F4, Win32 error 995
0 [waitproc] pacman 1195 proc_waiter: error on read of child wait pipe 0x5F4, Win32 error 995
:: Starting core system upgrade...
there is nothing to do
:: Starting full system upgrade...
there is nothing to do
D'oh. That error is expected due to CancelSynchronousIo
C:\>net helpmsg 995
The I/O operation has been aborted because of either a thread exit or an application request.
I wonder why I didn't see that when I tested this on ARM64
I think that should have only shown up when strace is enabled.
This addresses an extremely difficult to debug deadlock when running under emulation on ARM64.
A relatively easy way to trigger this bug is to call
fork()
, then within the child process immediately call anotherfork()
and thenexit()
the intermediate process.It would seem that there is a "code emulation" lock on the wait thread at this stage, and if the thread is terminated too early, that lock still exists albeit without a thread, and nothing moves forward.
It seems that a
SuspendThread()
combined with aGetThreadContext()
(to force the thread to actually be suspended, for more details see https://devblogs.microsoft.com/oldnewthing/20150205-00/?p=44743) makes sure the thread is "booted" from emulation before it is suspended.Hopefully this means it won't be holding any locks or otherwise leave emulation in a bad state when the thread is terminated.
Also, attempt to use
CancelSynchonousIo()
(as seen inflock.cc
) to avoid the need forTerminateThread()
altogether. This doesn't always work, however, so was not a complete fix for the deadlock issue.Addresses: https://cygwin.com/pipermail/cygwin-developers/2024-May/012694.html
Fixes msys2/msys2-autobuild#62, fixes #228 (and some other issues scattered about I don't remember off-hand)
Awaiting review on cygwin-patches mailing list: https://inbox.sourceware.org/cygwin-patches/2c68d6fe-5493-b7e0-6335-de5a68d3cd3f@jdrake.com/T/#u