Prior to this PR, Pedalboard allowed multiple threads to call methods on AudioFile simultaneously. AudioFile objects included an objectLock, intended to serialize this access to ensure "thread safety" (although this was vacuous, as how meaningful are the results returned by a file-like object being manipulated by multiple threads simultaneously?).
This was not a problem when reading files from disk. However, AudioFile permits the caller to provide a file-like object (io.BytesIO, etc) which can be implemented in Python. In this case, concurrent access to the same AudioFile object caused hard deadlocks in Python.
Consider the following example, in which the thread holding the GIL is annotated with 🐟, and the thread holding the AudioFile object's lock is annotated with 🔒:
Thread A
Thread B
🐟 Call AudioFile.read(...)
🐟 Acquire the AudioFile's objectLock 🔒
🔒 🐟 Release Python's GIL
🔒
🐟 Call AudioFile.read(...)
🔒
🐟 Wait for AudioFile's objectLock 🔒
🔒 Call .read(...) on the file-like object
🐟
🔒 Wait for Python's GIL (to call back into Python)
🐟
🔒 ⏳ deadlocked
🐟 ⏳ deadlocked
This situation resulted in an uninterruptible Python interpreter (i.e.: Ctrl-C would not work), as the GIL was held by a thread that was in C++ code, and no Python signal handlers could run.
This PR makes significant changes to how AudioFile handles locking:
AudioFile objects now have read-write locks, instead of instance-wide locks. This allows us to have finer-grained control over locking and - say - avoid locking the entire object if two threads do want to read const properties of the object simultaneously.
AudioFile objects now use a helper class, ScopedDowngradeToReadLockWithGIL, which:
Downgrades a held write lock to a read lock upon construction
Attempts to re-acquire the write lock upon destruction
If the write lock is held buy another thread, it tries to release the GIL (if held by the current thread) to allow the other thread to make progress
If two threads do try to read from or write to an AudioFile object concurrently, Pedalboard can now detect this and throws an error, so that users will be alerted that the results of each read() call would have been non-deterministic:
RuntimeError: Another thread is currently reading from this AudioFile.
Note that using multiple concurrent readers on the same AudioFile
object will produce nondeterministic results.
This PR also includes some new and comprehensive testing around locking; however, these tests are beasts. Again, sorry for the complexity there.
This one's a big one. Sorry in advance.
Prior to this PR, Pedalboard allowed multiple threads to call methods on
AudioFile
simultaneously.AudioFile
objects included anobjectLock
, intended to serialize this access to ensure "thread safety" (although this was vacuous, as how meaningful are the results returned by a file-like object being manipulated by multiple threads simultaneously?).This was not a problem when reading files from disk. However,
AudioFile
permits the caller to provide a file-like object (io.BytesIO
, etc) which can be implemented in Python. In this case, concurrent access to the sameAudioFile
object caused hard deadlocks in Python.Consider the following example, in which the thread holding the GIL is annotated with 🐟, and the thread holding the
AudioFile
object's lock is annotated with 🔒:AudioFile.read(...)
AudioFile
'sobjectLock 🔒
AudioFile.read(...)
AudioFile
'sobjectLock 🔒
.read(...)
on the file-like objectThis situation resulted in an uninterruptible Python interpreter (i.e.:
Ctrl-C
would not work), as the GIL was held by a thread that was in C++ code, and no Python signal handlers could run.This PR makes significant changes to how
AudioFile
handles locking:AudioFile
objects now have read-write locks, instead of instance-wide locks. This allows us to have finer-grained control over locking and - say - avoid locking the entire object if two threads do want to readconst
properties of the object simultaneously.AudioFile
objects now use a helper class,ScopedDowngradeToReadLockWithGIL
, which:AudioFile
object concurrently, Pedalboard can now detect this and throws an error, so that users will be alerted that the results of eachread()
call would have been non-deterministic:This PR also includes some new and comprehensive testing around locking; however, these tests are beasts. Again, sorry for the complexity there.