rayon-rs / rayon

Rayon: A data parallelism library for Rust
Apache License 2.0
11.08k stars 501 forks source link

Yield locks #1181

Closed HyperCodec closed 4 months ago

HyperCodec commented 4 months ago

If a task enounters a lock that another task is currently mutating, the entire thread sleeps and cannot perform other tasks. This can deadlock a thread if a task is waiting on a task in another thread that is waiting on a task later in the queue on the original thread. If there are other threads available, as long as the other thread doesn't also encounter this issue, it only decreases performance as the other threads can still steal tasks and eventually get the work done. However, if this type of situation happens across all available threads, the entire thing deadlocks.

What I recommend as the best solution to this problem is introducing custom "lock" types that simply call yield_now or something if locked. When the thread resumes the task, if it is still locked, it keeps calling yield_now, and otherwise proceeds with the task as usual. This way, even if the situation mentioned above happens, it doesn't really matter because the threads will still be able to get the work done.

cuviper commented 4 months ago

I'm going to close this as a duplicate of #592.

However, just yielding is not sufficient. If the task that is holding the lock is earlier on the same thread stack as the task that is now trying to obtain the lock, then no amount of yielding can resolve that. It's essentially an attempt at recursive locking, even thought that may be unintentionally induced by rayon's work-stealing.

There's a proposal for "blocking install" that I still need to get around to. This would give a way to better isolate work across threadpools, without work-stealing while you wait, so for example you could safely hold a lock in the caller.