Open oowekyala opened 1 year ago
I think reassigning the tag of the new event is the only reasonable option. I think we should think of it as a transaction. If the race occurs and the tag of the new event is wrong, we roll back, get a new tag, and attempt inserting it again.
Note that whatever tag is obtained for the scheduled physical action is uncertain, anyway.
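A minimal sketch of this roll-back-and-retry idea, not the runtime's actual code. Tag, EventQueue, try_insert, schedule_physical and the max_attempts bound are all hypothetical names picked for illustration, and the clock is passed in as a closure so that the race can be simulated deterministically.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Simplified tag (assumption): time plus microstep, ordered lexicographically.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Tag {
    pub time_ns: u64,
    pub microstep: u32,
}

struct State {
    /// Latest tag the scheduler has already processed.
    latest_processed: Tag,
    /// Pending events, sorted by tag.
    queue: BTreeMap<Tag, Vec<&'static str>>,
}

/// Hypothetical stand-in for the scheduler's event queue.
pub struct EventQueue {
    state: Mutex<State>,
}

impl EventQueue {
    pub fn new(latest_processed: Tag) -> Self {
        EventQueue {
            state: Mutex::new(State { latest_processed, queue: BTreeMap::new() }),
        }
    }

    /// "Commit" step: the insertion succeeds only if the tag is still
    /// strictly after the latest processed tag.
    fn try_insert(&self, tag: Tag, event: &'static str) -> bool {
        let mut st = self.state.lock().unwrap();
        if tag > st.latest_processed {
            st.queue.entry(tag).or_default().push(event);
            true
        } else {
            false
        }
    }

    /// Reads the clock, tries to insert, and on failure rolls back by
    /// reading the clock again. Gives up after `max_attempts` tries so
    /// that a stale clock cannot make this loop forever.
    pub fn schedule_physical(
        &self,
        mut read_clock_ns: impl FnMut() -> u64,
        event: &'static str,
        max_attempts: usize,
    ) -> Option<Tag> {
        for _ in 0..max_attempts {
            let tag = Tag { time_ns: read_clock_ns(), microstep: 0 };
            if self.try_insert(tag, event) {
                return Some(tag);
            }
        }
        None
    }

    /// Tags of all pending events, in increasing order.
    pub fn pending_tags(&self) -> Vec<Tag> {
        self.state.lock().unwrap().queue.keys().copied().collect()
    }
}
```

With a queue whose latest processed tag is at 100 ns and a clock that reads 50 and then 150, the first attempt is rejected and the second one commits the event at 150 ns. Only the check-and-insert step holds the lock here; the clock reading stays outside of it, which is why the retry is needed.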
Ok, I'll implement this.
For the record, I could not reproduce the bug without adding a thread::sleep in the middle of the critical section, in the code of the runtime (not of the LF program). I suspect this bug is mostly theoretical...
These kinds of bugs are load dependent and might only surface rarely, yet I wouldn't call them theoretical because that wrongfully suggests that they cannot really happen in deployment.
There was a bug in the C++ runtime, and in theory it can also happen in Rust (I wasn't able to reproduce it with an unmodified runtime; it depends on thread interleaving).
Possible faulty execution
C++ fix
In C++ there is a global event queue and a global mutex protecting it. The fix is to put the time reading and the pushing of the event in the same critical section.
Rust
In Rust the event queue is split: a Sender is used to push events to the scheduler asynchronously. The Receiver end maintains an unsorted buffer of events that is periodically flushed into the main queue by the scheduler thread. Events pushed through the Sender have already been assigned a tag.
We can assume Sender/Receiver communicate atomically.
Possible solutions for the Rust runtime
Global mutex
We could reproduce the C++ solution by introducing a mutex to guard the receiver and sender. This would however defeat part of the purpose of using channels, which is that we don't need to block the async sender thread when sending something.
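A rough sketch of what the global-mutex variant could look like, under heavy simplification: tags are plain Duration values with no microsteps, and Shared, schedule_physical and process_next are hypothetical names, not the runtime's API. The point is only that the clock is read while the lock is held.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// State guarded by the single global mutex (hypothetical layout).
pub struct Shared {
    start: Instant,
    /// Time of the latest event the scheduler has processed.
    latest_processed: Duration,
    /// Min-heap of pending event times.
    queue: BinaryHeap<Reverse<Duration>>,
}

impl Shared {
    pub fn new() -> Self {
        Shared {
            start: Instant::now(),
            latest_processed: Duration::ZERO,
            queue: BinaryHeap::new(),
        }
    }
}

/// Called from asynchronous threads. The time reading and the push
/// happen in the same critical section, so the scheduler cannot
/// advance between the two steps.
pub fn schedule_physical(shared: &Mutex<Shared>) -> Duration {
    let mut guard = shared.lock().unwrap();
    let tag = guard.start.elapsed(); // clock is read under the lock
    guard.queue.push(Reverse(tag));
    tag
}

/// Called from the scheduler thread: pops the earliest event and
/// records it as processed.
pub fn process_next(shared: &Mutex<Shared>) -> Option<Duration> {
    let mut guard = shared.lock().unwrap();
    let Reverse(tag) = guard.queue.pop()?;
    guard.latest_processed = tag;
    Some(tag)
}
```

The cost mentioned above is visible in schedule_physical: the asynchronous thread blocks on lock() whenever the scheduler is inside process_next.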
Let the scheduler assign tags
Another solution would be to let the scheduler thread assign tags to asynchronous events. There are several possible problems with this:
Mixed solution
We could use the asynchronously assigned tag as long as it is greater than the latest processed tag. If it isn't, then we're in the problematic situation described above. Then, we can do something else:
None of these look super appealing in the general case - maybe it should be selectable
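To make the "selectable" idea concrete, here is a hedged sketch of the mixed solution with the check done on the scheduler side, when the receiver buffer is flushed. Everything here is an assumption for illustration: the Tag layout, the Scheduler struct, and the two LatePolicy variants, which are hypothetical fallbacks and not necessarily the ones considered above. It uses a std::sync::mpsc channel as a stand-in for the runtime's Sender/Receiver pair.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Simplified tag (assumption): time plus microstep, ordered lexicographically.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Tag {
    pub time_ns: u64,
    pub microstep: u32,
}

/// Hypothetical, user-selectable handling of events whose tag is not
/// greater than the latest processed tag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LatePolicy {
    /// Move the event to the microstep right after the latest processed tag.
    Reassign,
    /// Discard the event.
    Drop,
}

pub struct Scheduler {
    rx: Receiver<(Tag, &'static str)>,
    latest_processed: Tag,
    queue: BTreeMap<Tag, Vec<&'static str>>,
    policy: LatePolicy,
}

impl Scheduler {
    pub fn new(latest_processed: Tag, policy: LatePolicy) -> (Self, Sender<(Tag, &'static str)>) {
        let (tx, rx) = channel();
        (Scheduler { rx, latest_processed, queue: BTreeMap::new(), policy }, tx)
    }

    /// Flushes the receiver buffer into the main queue and returns the
    /// number of events that arrived with a tag in the past.
    pub fn flush(&mut self) -> usize {
        let incoming: Vec<(Tag, &'static str)> = self.rx.try_iter().collect();
        let mut late = 0;
        for (tag, event) in incoming {
            if tag > self.latest_processed {
                // Common case: the asynchronously assigned tag is usable.
                self.queue.entry(tag).or_default().push(event);
            } else {
                late += 1;
                if self.policy == LatePolicy::Reassign {
                    let new_tag = Tag {
                        time_ns: self.latest_processed.time_ns,
                        microstep: self.latest_processed.microstep + 1,
                    };
                    self.queue.entry(new_tag).or_default().push(event);
                }
            }
        }
        late
    }

    /// Tags of all pending events, in increasing order.
    pub fn pending_tags(&self) -> Vec<Tag> {
        self.queue.keys().copied().collect()
    }
}
```

The sender side stays non-blocking in this sketch, since only the scheduler thread touches the main queue and the latest processed tag.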