realm / realm-core

Core database component for the Realm Mobile Database SDKs
https://realm.io
Apache License 2.0
1.02k stars 165 forks source link

Mutexes should not be implemented with dispatch semaphores due to risk of priority inversions #5697

Open hamzasohail opened 2 years ago

hamzasohail commented 2 years ago

There was a report from a user who observed a "Hang Risk" runtime issue in Xcode when running their app when using the Realm SDK.

https://www.mongodb.com/community/forums/t/hang-risk-thread-warning-on-xcode/177869

There is explicit guidance by Apple about the risk associated with this primitive. Upon examination, it seems realm implements mutexes (realm-core/src/realm/util/interpocess_mutex.cpp) with dispatch semaphores. This is risky -- dispatch semaphores do not provide priority inversion avoidance since the system doesn't know which thread will eventually signal the waiting thread.

Instead, I propose that for mutexes (and other locking primitives you provide), you should use OS unfair locks instead which provide priority inversion avoidance (this explicit guidance is available in this talk: https://developer.apple.com/videos/play/wwdc2017/706/)

tgoyne commented 2 years ago

We do wait on a mutex where possible. There's two scenarios where we don't:

  1. InterprocessMutex, as the name suggests, implements a mutex which is shared between all processes which open a given Realm file. Apple's implementation of process-shared mutexes doesn't actually work (FB5978435), so we have to emulate it with IPC mechanisms that do work.
  2. Async writes let you acquire the write lock asynchronously and be notified when it's acquired, which is inherently a pattern which can't be implemented via a mutex and is where we use semaphores instead. Here we probably need to be using manual QoS overrides.
hamzasohail commented 2 years ago

Hmmm, if you look at this report here, you seem to be causing an inversion which the app developer doesn't have a way to avoid. I'd suggest looking at ways to avoid such inversions. At the end of the day, the users of the app may see UI responsiveness issues because of them.

https://github.com/realm/realm-swift/issues/7902

tgoyne commented 2 years ago

https://github.com/realm/realm-core/pull/5799 doesn't fully fix this, but it will eliminate the most common case where the main thread waits on a condition variable rather than a mutex.

jedelbo commented 1 year ago

@tgoyne @hamzasohail can this be closed?

tgoyne commented 1 year ago

No, this is still a problem.