oscourse-tsinghua / rcore_plus

Rust version of THU uCore OS. Linux compatible.
MIT License

Fix sys_futex race condition #32

Open wangrunji0408 opened 5 years ago

wangrunji0408 commented 5 years ago

According to the specification:

FUTEX_WAIT (since Linux 2.6.0) This operation tests that the value at the futex word pointed to by the address uaddr still contains the expected value val, and if so, then sleeps waiting for a FUTEX_WAKE operation on the futex word. The load of the value of the futex word is an atomic memory access (i.e., using atomic machine instructions of the respective architecture). This load, the comparison with the expected value, and starting to sleep are performed atomically and totally ordered with respect to other futex operations on the same futex word. If the thread starts to sleep, it is considered a waiter on this futex word. If the futex value does not match val, then the call fails immediately with the error EAGAIN.

Currently, our sys_futex does not perform the load and the sleep as one atomic operation on the futex.

So suppose thread A wants to wait on the futex and thread B wants to wake it, and events happen in the following order:

  1. Thread A loads the word in kernel space.
  2. Thread B stores to the word in user space.
  3. Thread B wakes up the queue in kernel space.
  4. Thread A goes to sleep on the queue in kernel space.

Then thread A may sleep forever, because the wake-up in step 3 ran before A was on the queue and was lost. GG.
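
One possible way to close this window is to hold the wait queue's lock across the load, the comparison, and the enqueue, so that a waker can only run before the load or after the waiter is already queued. The following is a minimal user-space sketch of that idea, not rCore's actual code: `FutexQueue`, `FutexError`, and the ticket scheme are hypothetical names introduced for illustration, and `std::sync::Mutex` and `Condvar` stand in for whatever kernel primitives would really be used.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Condvar, Mutex};

/// Queue bookkeeping, protected by the queue lock (hypothetical layout).
#[derive(Default)]
struct QueueState {
    next_ticket: usize, // ticket handed to the next waiter
    served: usize,      // every ticket below this value has been woken
}

/// One wait queue per futex word (assumption of this sketch).
#[derive(Default)]
pub struct FutexQueue {
    state: Mutex<QueueState>,
    cond: Condvar,
}

#[derive(Debug, PartialEq)]
pub enum FutexError {
    Again, // corresponds to EAGAIN: the word did not hold the expected value
}

impl FutexQueue {
    /// FUTEX_WAIT-like: load, compare, and enqueue all happen under the queue lock.
    pub fn wait(&self, word: &AtomicU32, expected: u32) -> Result<(), FutexError> {
        let mut state = self.state.lock().unwrap();
        // The load happens while the lock is held, so a concurrent `wake`
        // runs either entirely before this point or after we are enqueued.
        if word.load(Ordering::SeqCst) != expected {
            return Err(FutexError::Again);
        }
        // Enqueue by taking a ticket; we are now a waiter on this futex.
        let ticket = state.next_ticket;
        state.next_ticket += 1;
        // `Condvar::wait` releases the lock and blocks in one step; the loop
        // also filters out spurious wake-ups.
        while ticket >= state.served {
            state = self.cond.wait(state).unwrap();
        }
        Ok(())
    }

    /// FUTEX_WAKE-like: wakes at most `max` waiters, returns how many were woken.
    pub fn wake(&self, max: usize) -> usize {
        let mut state = self.state.lock().unwrap();
        let sleeping = state.next_ticket - state.served;
        let woken = sleeping.min(max);
        state.served += woken;
        // Each waiter re-checks its own ticket, so waking all of them is safe.
        self.cond.notify_all();
        woken
    }
}
```

With this shape, if thread B stores the new value and then calls `wake`, thread A either observes the new value and returns `FutexError::Again`, or is already holding a ticket when `wake` takes the lock and gets woken. The interleaving in steps 1 to 4 cannot occur because steps 1 and 4 are no longer separable.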

jiegec commented 5 years ago

The current Condvar facility is incomplete and prone to races. We need to rethink how Condvar works and how sys_{poll, select} use it.