lhmouse / mcfgthread

Cornerstone of the MOST efficient std::thread on Windows for mingw-w64
https://gcc-mcf.lhmouse.com/

Deadlock in conditional variable? #86

Closed by Trzik 1 year ago

Trzik commented 1 year ago

I have a test that makes heavy use of condition variables, and it started deadlocking when I switched to the mcfgthread implementation. With a little digging, I found that eventually one thread gets stuck in _MCF_cond_wait and the other in _MCF_cond_signal_some_slow.

The signaling thread goes into __MCF_batch_release_common, calls __MCF_keyed_event_signal, and ends up in NtReleaseKeyedEvent with a NULL (infinite) timeout.

The waiting thread gets stuck in this __MCF_keyed_event_signal call, which ends up in the same NtReleaseKeyedEvent call with zero timeout: https://github.com/lhmouse/mcfgthread/blob/ff795e30999ebed91be6bb3bd9cab2ffebec3b61/mcfgthread/cond.c#L83

I don't understand the code that well, but isn't this line a bug? Shouldn't __MCF_keyed_event_wait be invoked instead? I did an experimental rebuild of the library with this change, and it fixed the deadlock for me. But since I see a similar line in event.c, this may be intentional, in which case I don't know where the real root cause lies.
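To illustrate why two threads that both land in NtReleaseKeyedEvent can block each other, here is a minimal sketch in portable C. It assumes the keyed-event pairing rule that a release call returns only once a matching waiter has arrived, and vice versa, and it models that rule with POSIX threads rather than the NT API. The names rendezvous, rv_meet, and run_pair are hypothetical and are not part of mcfgthread. The 500 ms timeout exists only so that the demo terminates instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical model of a keyed-event style rendezvous (not mcfgthread code). */
enum { SIDE_WAIT = 0, SIDE_RELEASE = 1 };

typedef struct {
  pthread_mutex_t mtx;
  pthread_cond_t cv;
  int parked[2];  /* threads blocked on each side, not yet paired */
  int tokens[2];  /* pairings granted to each side, not yet consumed */
} rendezvous;

static rendezvous g_rv = {
  PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, {0, 0}, {0, 0}
};

/* Blocks until a thread on the opposite side arrives or the timeout expires.
   Returns 0 when paired, -1 on timeout. */
static int rv_meet(rendezvous *rv, int side, int timeout_ms)
{
  int other = 1 - side, rc = 0;
  struct timespec dl;
  clock_gettime(CLOCK_REALTIME, &dl);
  dl.tv_sec += timeout_ms / 1000;
  dl.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
  if (dl.tv_nsec >= 1000000000L) { dl.tv_sec++; dl.tv_nsec -= 1000000000L; }

  pthread_mutex_lock(&rv->mtx);
  if (rv->parked[other] > 0) {
    /* A partner is already parked: pair with it and wake it up. */
    rv->parked[other]--;
    rv->tokens[other]++;
    pthread_cond_broadcast(&rv->cv);
  } else {
    /* Nobody on the other side yet: park until paired or timed out. */
    rv->parked[side]++;
    while (rv->tokens[side] == 0 && rc != ETIMEDOUT)
      rc = pthread_cond_timedwait(&rv->cv, &rv->mtx, &dl);
    if (rv->tokens[side] > 0) { rv->tokens[side]--; rc = 0; }
    else { rv->parked[side]--; rc = -1; }
  }
  pthread_mutex_unlock(&rv->mtx);
  return rc;
}

static void *side_thread(void *arg)
{
  int side = *(int *)arg;
  return (void *)(intptr_t)rv_meet(&g_rv, side, 500);
}

/* Runs two threads on the given sides; returns how many of them paired up. */
int run_pair(int side_a, int side_b)
{
  pthread_t t[2];
  int sides[2] = { side_a, side_b };
  int paired = 0;
  for (int i = 0; i < 2; i++) {
    int err = pthread_create(&t[i], NULL, side_thread, &sides[i]);
    assert(err == 0);
    (void)err;
  }
  for (int i = 0; i < 2; i++) {
    void *res = NULL;
    pthread_join(t[i], &res);
    if ((intptr_t)res == 0) paired++;
  }
  return paired;
}
```

In this model, run_pair(SIDE_WAIT, SIDE_RELEASE) should pair both threads, while run_pair(SIDE_RELEASE, SIDE_RELEASE) should leave both sides timing out. With an infinite timeout, that second case would correspond to the two stuck threads described above.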

lhmouse commented 1 year ago

Yes, that's a reasonable explanation. I'm preparing a test case now.

lhmouse commented 1 year ago

This can be reproduced with https://github.com/lhmouse/mcfgthread/blob/d283e3f095c5c4e6a4f1c2d9ae74415cc8df85e9/test/cond_multi_wait.c.

Thanks for the report.
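For readers who want a feel for a multi-waiter condition-variable check without a Windows toolchain, the sketch below may help. It is not the contents of the linked cond_multi_wait.c, and it uses POSIX threads as a stand-in for the mcfgthread API. All names (mw_waiter, run_multi_wait, NUM_WAITERS) are hypothetical. Several waiters block on one condition variable while a single thread hands out tickets one signal at a time. An implementation that loses or mishandles a wake-up would either hang or consume the wrong number of tickets.

```c
#include <assert.h>
#include <pthread.h>

enum { NUM_WAITERS = 4 };  /* assumed waiter count, chosen for illustration */

static pthread_mutex_t mw_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t mw_cv = PTHREAD_COND_INITIALIZER;
static int mw_tickets, mw_consumed, mw_done;

/* Each waiter takes tickets until the producer is done and none remain. */
static void *mw_waiter(void *arg)
{
  (void)arg;
  pthread_mutex_lock(&mw_mtx);
  for (;;) {
    while (mw_tickets == 0 && !mw_done)
      pthread_cond_wait(&mw_cv, &mw_mtx);
    if (mw_tickets == 0)
      break;  /* producer finished and every ticket was drained */
    mw_tickets--;
    mw_consumed++;
  }
  pthread_mutex_unlock(&mw_mtx);
  return NULL;
}

/* Issues `total` tickets, one signal each; returns how many were consumed. */
int run_multi_wait(int total)
{
  pthread_t th[NUM_WAITERS];
  mw_tickets = mw_consumed = mw_done = 0;

  for (int i = 0; i < NUM_WAITERS; i++) {
    int err = pthread_create(&th[i], NULL, mw_waiter, NULL);
    assert(err == 0);
    (void)err;
  }
  for (int i = 0; i < total; i++) {
    pthread_mutex_lock(&mw_mtx);
    mw_tickets++;
    pthread_mutex_unlock(&mw_mtx);
    pthread_cond_signal(&mw_cv);  /* wake at least one waiter */
  }
  pthread_mutex_lock(&mw_mtx);
  mw_done = 1;
  pthread_mutex_unlock(&mw_mtx);
  pthread_cond_broadcast(&mw_cv);  /* let every waiter observe completion */

  for (int i = 0; i < NUM_WAITERS; i++)
    pthread_join(th[i], NULL);
  return mw_consumed;
}
```

A call such as run_multi_wait(10000) should return 10000 on a correct implementation. A run that never returns points to a lost or mismatched wake-up of the kind reported here.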