Closed notgull closed 6 months ago
r? @zeenix, as this is the conclusion of the deep dive I've been doing for the past couple weeks, as mentioned here: https://github.com/smol-rs/event-listener/pull/130#issuecomment-2094853124
r? @fogti, as per the C memory model I'm relatively sure these optimizations should have any effect on the order of events, but I'm not sure.
I've tested async-lock
with a patched version of event-listener
with this PR and the MIRI tests pass, even with the previously ignored tests un-ignored. In addition I'm relatively sure it passes Loom tests (https://github.com/smol-rs/async-lock/pull/86). However I can't be 100% sure; my max-preemptions count is set very low as I do not have access to hardware that can conduct full loom tests at the moment.
I'm unclear why this bug is manifesting now; pretty much all versions of event-listener
contain this code and v5 didn't really touch it. Maybe it just offset the timing just right for the race condition to start showing up in normal code.
No, these optimizations were there to add fast paths that didn't lock the inner mutex and traverse the linked list. This operation is somewhat expensive so this optimization avoided doing that if it could. Removing it won't wake up any additional tasks.
The notify() function contains two optimizations:
inner
field is not yet initialized (i.e. no listeners have been created yet), it returns0
from the function early.notified
field indicates that the current notification would do nothing, it returns0
early as well.loom
testing has exposed race conditions in these optimizations that can cause notifications to be missed, leading to deadlocks. This commit removes these optimizations in the hope of preventing these deadlocks. Ideally we can restore them in the future.Closes #138 cc #130 and #133