Open z2oh opened 2 months ago
I guess there is some flaw in muxnote management which makes whole URLSession a bit unstable on non-Darwin platforms (not only Windows). For sockets, it is possible to over-release a muxnote object under heavy usage, because register and unregister code is running on different threads. Perhaps this also affects pipes. Would be nice to investigate this further though.
Upon upgrading our Azure CI machines to use the new Azure Cobalt ARM64 processors, we started seeing frequent compiler crashes when building a large Swift project. After some investigation, the culprit appears to be a lifecycle violation in libdispatch in the Windows pipe handling code.
The crashing line: https://github.com/apple/swift-corelibs-libdispatch/blob/e85f6a0d5c9ea1f32f5013c3fa34e4fc146cd0eb/src/event/event_windows.c#L240
And the stack trace:
I suspect this is not a Cobalt/ARM64-specific issue, but more likely a long-standing bug that has become common on this particular line of CPUs due to some scheduling or timing change.
The interesting section is here: https://github.com/apple/swift-corelibs-libdispatch/blob/e85f6a0d5c9ea1f32f5013c3fa34e4fc146cd0eb/src/event/event_windows.c#L667-L669
The event set here is used to synchronize with the pipe monitoring thread, which itself calls `_dispatch_muxnote_retain`. Perhaps a change in timing affected the typical order of operations here, although I haven't been able to prove this yet. I'm trying to reproduce the crash under `LIBDISPATCH_LOG` to get some more information.
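To make the suspected hazard concrete, here is a minimal, portable sketch of the ownership pattern that avoids this class of bug. It is not libdispatch code: `note_t`, `note_retain`, `note_release`, `note_start_monitor`, and `demo_register_unregister` are hypothetical names, and it uses pthreads and C11 atomics rather than the Windows APIs in `event_windows.c`. The idea it illustrates is that the reference held on behalf of a worker thread is taken by the registering thread before the worker starts, so a release on the unregister path can never run ahead of the worker's retain.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a muxnote: a heap object with an atomic refcount. */
typedef struct note {
    atomic_int refs;
    atomic_int *freed; /* set to 1 on destruction so a caller can observe it */
} note_t;

static note_t *note_create(atomic_int *freed) {
    note_t *n = malloc(sizeof *n);
    if (!n) abort();
    atomic_init(&n->refs, 1); /* the creator owns the first reference */
    n->freed = freed;
    return n;
}

static void note_retain(note_t *n) {
    atomic_fetch_add_explicit(&n->refs, 1, memory_order_relaxed);
}

static void note_release(note_t *n) {
    int prev = atomic_fetch_sub_explicit(&n->refs, 1, memory_order_acq_rel);
    assert(prev > 0); /* prev <= 0 would mean an over-release */
    if (prev == 1) {
        atomic_store(n->freed, 1);
        free(n);
    }
}

/* Worker: it only drops the reference that was taken for it. */
static void *monitor_thread(void *arg) {
    note_t *n = arg;
    /* ... a real monitor would wait on the pipe here ... */
    note_release(n);
    return NULL;
}

/* Register path: retain BEFORE the worker exists, not inside the worker. */
static int note_start_monitor(note_t *n, pthread_t *tid) {
    note_retain(n);
    int rc = pthread_create(tid, NULL, monitor_thread, n);
    if (rc != 0) note_release(n); /* undo the retain if no thread was started */
    return rc;
}

/* Register, then immediately unregister from the calling thread.
 * Returns 1 if the object was destroyed, 0 if it leaked, -1 on error. */
static int demo_register_unregister(void) {
    atomic_int freed;
    atomic_init(&freed, 0);
    note_t *n = note_create(&freed);
    pthread_t tid;
    if (note_start_monitor(n, &tid) != 0) {
        note_release(n);
        return -1;
    }
    note_release(n); /* "unregister": drop the owner's reference */
    pthread_join(tid, NULL);
    return atomic_load(&freed);
}
```

Whichever thread releases last destroys the object, and the count never reaches zero while the worker still holds its pointer. If the worker took its own reference after starting instead, the result would depend on scheduling, which would be consistent with a latent bug surfacing only on hardware with different timing. Whether that is what happens in the pipe handling code is still unconfirmed.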