tokio-rs / tokio-uring

An io_uring backed runtime for Rust
MIT License
1.11k stars 117 forks source link

Bug: Race condition in next_when_notified #280

Open ollie-etl opened 1 year ago

ollie-etl commented 1 year ago

I believe the following fixes a race condition in FixedBufPool. We were observing a situation where, without resorting to timeouts to force retries, multiple concurrent calls to next would eventually all stall.

The fix (setting enable) is take directly from https://docs.rs/tokio/latest/tokio/sync/futures/struct.Notified.html#method.enable

The call to enable is important because otherwise if you have two calls to recv and two calls to send in parallel, the following could happen:

  1. Both calls to try_recv return None.
  2. Both new elements are added to the vector.
  3. The notify_one method is called twice, adding only a single permit to the Notify.
  4. Both calls to recv reach the Notified future. One of them consumes the permit, and the other sleeps forever.

By adding the Notified futures to the list by calling enable before try_recv, the notify_one calls in step three would remove the futures from the list and mark them notified instead of adding a permit to the Notify. This ensures that both futures are woken.

I believe this is happening in the presence of only concurrency, not parallelism (single threaded)

ollie-etl commented 1 year ago

@mzabaluev you're probably best placed for an opinion