Closed gthiemonge closed 2 years ago
Thanks for a great bug report! I'm unfamiliar with this part of the codebase, so it will take some time. @obondarev, maybe you would like to take a look?
Sure, I think read lock should be allowed if current thread already has a write lock. This means line #184 in [1] should be "elif (self._writer == me) or not self.has_pending_writers:" Can you please try that @gthiemonge ?
[1] https://github.com/harlowja/fasteners/commit/75cf59a7246383d44f617215fbdf818d2e88a399
I gave it a go, all tests pass, including the reproducer by @gthiemonge. Thanks :+1:
I will take another carefull look at that part of the code and will release the fix soonish.
Thanks, @psarka !
Sure, I think read lock should be allowed if current thread already has a write lock. This means line #184 in [1] should be "elif (self._writer == me) or not self.has_pending_writers:" Can you please try that @gthiemonge ?
[1] 75cf59a
Yes, it fixes my reproducer and our tests in TaskFlow. Thanks Folks!
Made a new release (0.17.3) with the fix.
We are still seeing deadlocks with 0.17.3. Will try to come up with a reproducer.
This behavior has been observed in the OpenStack TaskFlow project (https://opendev.org/openstack/taskflow)
After updating from fasteners 0.14.1 to 0.16.3, we have observed some issues with the TaskFlow CI, some tests are blocked with a deadlock.
I wrote a simple reproducer in the fasteners test: https://github.com/gthiemonge/fasteners/commit/6d40296baa2c870a1ea8746bc7ad0f581c6ec4c5
Basically, a first thread requests a writer lock, then a reentrant reader lock, while a second thread requests only the writer lock. The test deadlocks after a few iterations:
The following commit https://github.com/harlowja/fasteners/commit/75cf59a7246383d44f617215fbdf818d2e88a399 triggers this behavior, reverting the commit seems to fix the issue.
Any ideas on how to fix it?