Closed eindenbom closed 4 months ago
I had to repeat it a few times to trigger this behavior, but I see it.
Standalone reproducer for a different race condition:
import threading
import time
import lazy_loader as lazy

def repeat_lazy():
    time.sleep(0.5)
    np = lazy.load('numpy')

for _ in range(50):
    threading.Thread(target=repeat_lazy).start()
@eindenbom Can you give #90 a try?
I do not think #90 fixes the same race condition as reported in this issue. You are fixing a possible race (I am not sure that there is any) in lazy_loader.load(), while in librosa it is executed at librosa load time and is therefore protected by the module load lock.
The race I have reported is in importlib.util._LazyModule.__getattribute__(), line 228: self.__class__ = types.ModuleType.
The __class__ meta attribute is replaced BEFORE the actual module is loaded. This creates a race condition if a module attribute is requested concurrently from another thread.
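To make the ordering problem concrete, here is a minimal toy model of the pattern described above. ToyLazyModule, its `answer` attribute, and the sleep durations are all hypothetical stand-ins; this is not the CPython code, only a sketch that swaps `__class__` first and fills in the contents afterwards.

```python
import threading
import time
import types


class ToyLazyModule(types.ModuleType):
    """Hypothetical stand-in for a lazy module; NOT the CPython implementation."""

    def __getattribute__(self, name):
        # Step 1: drop the lazy class first, before any contents exist.
        self.__class__ = types.ModuleType
        # Step 2: simulate a slow module body that defines `answer`.
        time.sleep(0.2)
        self.__dict__["answer"] = 42
        return getattr(self, name)


def demo_race():
    """Access the toy module from two threads and collect AttributeErrors."""
    mod = ToyLazyModule("toy")
    errors = []

    def access():
        try:
            mod.answer
        except AttributeError as exc:
            errors.append(str(exc))

    first = threading.Thread(target=access)
    first.start()
    time.sleep(0.05)  # give the first thread time to swap __class__
    second = threading.Thread(target=access)
    second.start()
    first.join()
    second.join()
    return errors


if __name__ == "__main__":
    print(demo_race())
```

The second thread sees a plain module object whose contents are not there yet, so its lookup fails even though the first thread eventually succeeds.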
The tests confirm that the other race condition exists, but you are right that #90 does not address this issue.
Agreed, this is a CPython bug:
import importlib.util
import threading
import sys

# Set up a lazily loaded 'http' module.
spec = importlib.util.find_spec('http')
module = importlib.util.module_from_spec(spec)
http = sys.modules['http'] = module
loader = importlib.util.LazyLoader(spec.loader)
loader.exec_module(module)

def check():
    # The first attribute access triggers the real load.
    return http.HTTPStatus.ACCEPTED == 202

for _ in range(5):
    threading.Thread(target=check).start()
This looks painful to fix, since attempting to attach a lock to the _LazyModule class or instance would result in a recursive call to __getattribute__. Avoiding that is why self.__class__ is replaced before doing anything else. A global lock or a dict[ModuleType, Lock] table are the only things I can think of.
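As a rough illustration of the lock-table idea, the sketch below keeps one lock per module in a side table and swaps `__class__` only after the contents are in place. Everything here (LockedToyLazyModule, `_module_locks`, `_lock_for`, the `answer` attribute) is hypothetical, and it is not claimed to be how the eventual CPython fix works.

```python
import threading
import time
import types

_locks_guard = threading.Lock()
_module_locks = {}  # id(module) -> Lock; hypothetical side table, never pruned


def _lock_for(module):
    """Return the lock for this module, creating it on first use."""
    with _locks_guard:
        return _module_locks.setdefault(id(module), threading.Lock())


class LockedToyLazyModule(types.ModuleType):
    """Toy lazy module that loads under a per-module lock (illustration only)."""

    def __getattribute__(self, name):
        with _lock_for(self):
            # Re-check under the lock: another thread may have finished loading.
            if type(self) is LockedToyLazyModule:
                time.sleep(0.2)  # simulate a slow module body
                # Bypass our own __getattribute__ to avoid recursion.
                object.__getattribute__(self, "__dict__")["answer"] = 42
                # Swap the class only AFTER the contents exist.
                self.__class__ = types.ModuleType
        return getattr(self, name)


def demo_locked(n_threads=5):
    """Access the toy module from several threads and collect the results."""
    mod = LockedToyLazyModule("toy")
    results = []

    def access():
        results.append(mod.answer)

    threads = [threading.Thread(target=access) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Note that this sketch ignores re-entrant access from the same thread while the module body runs (for example, a module that touches its own attributes during import); with a plain Lock that case would deadlock, so a real fix needs to handle it.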
Thanks for investigating, @effigies! Adding time.sleep(0.2) into check makes failure more reliable for me.
Can we simply lock up exec_module?
import importlib.util
import threading
import sys
import time

spec = importlib.util.find_spec('http')
module = importlib.util.module_from_spec(spec)
http = sys.modules['http'] = module

lock = threading.Lock()

def lock_func(f):
    # Wrap f so that every call runs while holding the global lock.
    def locked(*args, **kwargs):
        with lock:
            return f(*args, **kwargs)
    return locked

loader = importlib.util.LazyLoader(spec.loader)
loader.exec_module = lock_func(loader.exec_module)
loader.exec_module(module)

def check():
    time.sleep(0.2)
    return http.HTTPStatus.ACCEPTED == 202

for _ in range(5):
    threading.Thread(target=check).start()
I don't think so, because exec_module happens at lazy load time. The race condition happens at the actual attribute lookup.
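The two phases can be observed directly. The sketch below follows the same setup as the reproducer in this thread, but uses the stdlib colorsys module and a helper named lazy_phases, both of which are arbitrary choices for illustration. It replaces the sys.modules entry for that module, as the reproducer does for http.

```python
import importlib.util
import sys


def lazy_phases(name="colorsys", attr="rgb_to_hsv"):
    """Return the module's class name before and after the first attribute access."""
    spec = importlib.util.find_spec(name)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader = importlib.util.LazyLoader(spec.loader)
    loader.exec_module(module)      # lazy setup only; the module body has not run
    before = type(module).__name__  # type() does not go through __getattribute__
    getattr(module, attr)           # first attribute access runs the real loader
    after = type(module).__name__
    return before, after


if __name__ == "__main__":
    print(lazy_phases())
```

A lock around LazyLoader.exec_module therefore only covers the first phase; the window in which other threads can observe a half-loaded module opens in the second.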
Yes, but look at the above example, which wraps exec_module in a lock.
In [1]: import importlib
   ...: import threading
   ...: import sys
   ...: import time
   ...:
   ...: spec = importlib.util.find_spec('http')
   ...: module = importlib.util.module_from_spec(spec)
   ...: http = sys.modules['http'] = module
   ...:
   ...: lock = threading.Lock()
   ...:
   ...: def lock_func(f):
   ...:     def locked(*args, **kwargs):
   ...:         with lock:
   ...:             return f(*args, **kwargs)
   ...:     return locked
   ...:
   ...: loader = importlib.util.LazyLoader(spec.loader)
   ...: loader.exec_module = lock_func(loader.exec_module)
   ...: loader.exec_module(module)
   ...:
   ...:
   ...: def check():
   ...:     time.sleep(0.2)
   ...:     return http.HTTPStatus.ACCEPTED == 202
   ...:
   ...: for _ in range(5):
   ...:     threading.Thread(target=check).start()
   ...:
Exception in thread Thread-2 (check):
Traceback (most recent call last):
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-1-1b0c7fd1f576>", line 25, in check
Exception in thread Thread-5 (check):
Traceback (most recent call last):
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-1-1b0c7fd1f576>", line 25, in check
AttributeError: module 'http' has no attribute 'HTTPStatus'
AttributeError: module 'http' has no attribute 'HTTPStatus'
Hrm, yes, looks like I have to use lock_func(spec.loader.exec_module), but not sure how that alters behavior.
The loader isn't where the race is. I might be wrong, but I don't think that will protect the actual critical operations.
Right, so we'll have to ensure "attribute access + lazy loading" becomes an atomic operation.
I went ahead and opened https://github.com/python/cpython/issues/114763. I think I have a fix.
@eindenbom If you're prepared to build your whole dependency tree for 3.13-dev, you could test out https://github.com/python/cpython/pull/114781. I suspect it can be rebased on the 3.12 branch as well, if you'd prefer.
Thanks to @effigies, this issue has been addressed by https://github.com/python/cpython/pull/114781.
@eindenbom This should soon be backported to the 3.11 and 3.12 source trees.
Should be out in 3.11.9 (https://github.com/python/cpython/pull/115871 - ETA April 1) and 3.12.3 (https://github.com/python/cpython/pull/115870 - ETA April 9).
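For anyone who needs to branch on this at runtime, a version check along these lines might be enough. The helper name has_lazy_module_fix is hypothetical, and the thresholds are simply the release targets quoted above (plus 3.13, where the fix was developed); they are assumptions, not verified against the actual release notes.

```python
import sys


def has_lazy_module_fix(version=None):
    """Guess whether an interpreter version includes the _LazyModule race fix.

    Thresholds come from the release targets quoted in this thread and are an
    assumption; they are not checked against actual release notes.
    """
    info = sys.version_info if version is None else version
    major, minor, micro = tuple(info)[:3]
    if (major, minor) >= (3, 13):
        return True
    if (major, minor) == (3, 12):
        return micro >= 3
    if (major, minor) == (3, 11):
        return micro >= 9
    # Older branches are assumed not to receive the backport.
    return False


if __name__ == "__main__":
    print(has_lazy_module_fix())
```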
@effigies Does this affect / make unnecessary https://github.com/scientific-python/lazy_loader/pull/90?
No, my read is that these are two independent races.
Modules loaded with lazy_loader are not thread-safe: when accessed concurrently from several threads, all but the first thread get a partially loaded module that is missing most of its functionality.
For example, librosa uses lazy_loader to load resampy and gets the following error:
To Reproduce
Run the following snippet:
Expected behavior
No errors reported.
Actual output
Software versions