ReactiveX / RxCpp

Reactive Extensions for C++
Apache License 2.0
3.05k stars 395 forks source link

Potential deadlock in rx-runloop #487

Open iam opened 5 years ago

iam commented 5 years ago

While inspecting rx-runloop.hpp I noticed there might be a potential deadlock in the code:

https://github.com/ReactiveX/RxCpp/blob/aac2fc97bc5fe680446afb5ae81bef0a9c0fbf8a/Rx/v2/src/rxcpp/schedulers/rx-runloop.hpp#L77-L88

Last two lines appear to be flipped, guard.unlock() is a no-op if it's at the end of a scope. During notify_earlier_wakeup it could call back into the runloop and take another lock which would hang since these locks aren't recursive.

/***/           if (need_earlier_wakeup_notification) st->notify_earlier_wakeup(when);
/***/           guard.unlock(); // So we can't get attempt to recursively lock the state
kirkshoop commented 5 years ago

I believe that this was intentional. the only intended use case for notify_earlier_wakeup is to set a timer that will call dispatch() at when and cancel any previous timer. Doing this under the lock also allows calls to empty() and peek() to be coherent while trying to schedule the next call to dispatch(). unlocking first, in this case, would cause races to the notify_earlier_wakeup() that would be hard to resolve in that method (it would have to reimplement the 'is this earlier than the current timer?').

iam commented 5 years ago

Doing this under the lock also allows calls to empty() and peek() to be coherent while trying to schedule the next call to dispatch().

Where does this dispatch scheduling happen exactly? Nothing in rxcpp seems to call dispatch(). This is the only (external) example I could find:

https://github.com/ReactiveX/RxCpp/pull/154

kirkshoop commented 5 years ago

Yes, this was built to support integration of rxcpp into existing event loops like SDL and Qt. Dispatch is intended to be called from those loops and empty and peek are intended to minimize calls to dispatch. Dispatch is where synchronization matters.

iam commented 5 years ago

Doing this under the lock also allows calls to empty() and peek() to be coherent while trying to schedule the next call to dispatch(). unlocking first, in this case, would cause races to the notify_earlier_wakeup() that would be hard to resolve in that method (it would have to reimplement the 'is this earlier than the current timer?').

Is there any value to being able to call empty and peek from the thread of notify_earlier_wakeup? If notify_earlier_wakeup is called then the user already knows that something earlier was scheduled (thus the queue is definitely non-empty).

It should then be up to the user code to properly synchronize how they set up their timers to do an earlier wakeup. For example if the same QTimer is used by multiple run_loop_scheduler instances then the user needs to do their own synchronization regardless, having the lock held under callback does not help.


Yes, this was built to support integration of rxcpp into existing event loops like SDL and Qt. Dispatch is intended to be called from those loops and empty and peek are intended to minimize calls to dispatch. Dispatch is where synchronization matters.

Thanks, I took a look these other event loop architectures and it makes more sense now.

kirkshoop commented 5 years ago

yes, I think that you are correct about the call to notify_earlier_wakeup. it does not need to access those methods. notify_earlier_wakeup can be called from dispatch() and from schedule() and those may be on different threads. if that call is moved outside the lock then implementations will be exposed to overlapped calls to notify_earlier_wakeup when they were not before. it would need a big breaking-change warning and a way to revert to the old behavior.

iam commented 5 years ago

if that call is moved outside the lock then implementations will be exposed to overlapped calls to notify_earlier_wakeup when they were not before. it would need a big breaking-change warning and a way to revert to the old behavior.

This seems like a breaking change that was just merged recently: https://github.com/ReactiveX/RxCpp/pull/486 ?

With this change, doesn't it mean that peek() or empty() from the notify_earlier_wakeup callback will immediately get a deadlock ?

notify_earlier_wakeup can be called from dispatch() and from schedule() and those may be on different threads.

Just to clarify my understanding: I don't see dispatch calling notify_earlier_wakeup, are you referring to schedule being called indirectly via the light-weight tail recursion mechanism?

kirkshoop commented 5 years ago

Here is the PR adding the notify, there is some discussion. https://github.com/ReactiveX/RxCpp/pull/337

I assumed that dispatch would call notify as well when time moves into the future, but you are right that the notify comes from schedule.

In looking at the canonical usage: https://github.com/anatoly-spb/qtimageviewer/blob/rxcpp/rxeventloop.h

I see that empty and peek are called from the timer, but not from the notify. I think that even that usage would go away if dispatch called notify..