scientific-python / lazy-loader

Populate library namespace without incurring immediate import costs
BSD 3-Clause "New" or "Revised" License
116 stars 19 forks source link

Broken thread safety #88

Closed eindenbom closed 4 months ago

eindenbom commented 5 months ago

lazy_loader loaded modules are not thread-safe: when concurrently accessed from several threads all but the first thread get partially loaded module missing most of functionality.

For example librosa uses lazy_loader to load resampy and gets the following error:

AttributeError: module 'resampy' has no attribute 'resample'

To Reproduce Run the following snippet:

import librosa
import threading
import numpy as np

def resample():
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")

for _ in range(5):
    threading.Thread(target=resample).start()

Expected behavior No errors reported.

Actual output

Traceback (most recent call last):
Traceback (most recent call last):
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
Traceback (most recent call last):
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
Traceback (most recent call last):
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
          File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
self._target(*self._args, **self._kwargs)self._target(*self._args, **self._kwargs)

  File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
  File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
    self._target(*self._args, **self._kwargs)
  File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")
  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")
  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")
  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")
  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
AttributeError: module 'resampy' has no attribute 'resample'
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
AttributeError: module 'resampy' has no attribute 'resample'AttributeError
: module 'resampy' has no attribute 'resample'
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
AttributeError: module 'resampy' has no attribute 'resample'
python-BaseException
python-BaseException
python-BaseException
python-BaseException
Exception in thread Exception in thread Thread-8:
Exception in thread Traceback (most recent call last):
Thread-5:
Traceback (most recent call last):
  File "C:\Build\envs\dec5\lib\threading.py", line 932, in _bootstrap_inner
  File "C:\Build\envs\dec5\lib\threading.py", line 932, in _bootstrap_inner
Thread-9:
Traceback (most recent call last):
  File "C:\Build\envs\dec5\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
    self.run()
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
        self._target(*self._args, **self._kwargs)
self.run()
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
  File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
    self._target(*self._args, **self._kwargs)
  File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")
  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
    self._target(*self._args, **self._kwargs)
  File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")
  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
AttributeError: AttributeError: module 'resampy' has no attribute 'resample'
module 'resampy' has no attribute 'resample'    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")

  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
python-BaseException
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
AttributeError: module 'resampy' has no attribute 'resample'
Exception in thread Thread-6:
Traceback (most recent call last):
  File "C:\Build\envs\dec5\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Build\envs\dec5\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\source\internal_ai_deepaudio\training\break_lazy_load.py", line 7, in resample
    librosa.resample(np.zeros(1000), orig_sr=24_000, target_sr=16_000, res_type="kaiser_best")
  File "C:\Build\envs\dec5\lib\site-packages\librosa\core\audio.py", line 677, in resample
    y_hat = resampy.resample(y, orig_sr, target_sr, filter=res_type, axis=axis)
AttributeError: module 'resampy' has no attribute 'resample'

Software versions

Windows-10-10.0.22621-SP0
Python 3.8.0 (default, Nov  6 2019, 16:00:02) [MSC v.1916 64 bit (AMD64)]
NumPy 1.23.5
SciPy 1.10.1
librosa 0.10.1
INSTALLED VERSIONS
------------------
python: 3.8.0 (default, Nov  6 2019, 16:00:02) [MSC v.1916 64 bit (AMD64)]

librosa: 0.10.1

audioread: 3.0.1
numpy: 1.23.5
scipy: 1.10.1
sklearn: 1.3.2
joblib: 1.3.2
decorator: 5.1.1
numba: 0.58.1
soundfile: 0.12.1
pooch: v1.8.0
soxr: 0.3.7
typing_extensions: installed, no version number available
lazy_loader: installed, no version number available
msgpack: 1.0.7

numpydoc: None
sphinx: None
sphinx_rtd_theme: None
matplotlib: None
sphinx_multiversion: None
sphinx_gallery: None
mir_eval: None
ipython: None
sphinxcontrib.rsvgconverter: None
pytest: None
pytest_mpl: None
pytest_cov: None
samplerate: None
resampy: 0.4.2
presets: None
packaging: 23.2
stefanv commented 5 months ago

I had to repeat it a few times to trigger this behavior, but I see it.

stefanv commented 5 months ago

Standalone reproducer for a different race condition:

import threading
import time
import lazy_loader as lazy

def repeat_lazy():
    time.sleep(0.5)
    np = lazy.load('numpy')

for _ in range(50):
    threading.Thread(target=repeat_lazy).start()
stefanv commented 5 months ago

@eindenbom Can you give #90 a try?

eindenbom commented 5 months ago

I do not think #90 fixes the same race condition as reported in this issue. You are fixing a possible race (I am not sure that there is any) in lazy_loader.load(), while in librosa it is executed at librosa load time and therefore protected by module load lock.

The race I have reported is in importlib.util._LazyModule.__getattr__(), line 228: self.__class__ = types.ModuleType. The __class__ meta attribute is replaced BEFORE actual module is loaded. This creates a race condition if module attribute is requested concurrently from another thread.

stefanv commented 5 months ago

The tests confirm that the other race condition exists, but you are right that #90 does not address this issue.

effigies commented 5 months ago

I do not think #90 fixes the same race condition as reported in this issue. You are fixing a possible race (I am not sure that there is any) in lazy_loader.load(), while in librosa it is executed at librosa load time and therefore protected by module load lock.

The race I have reported is in importlib.util._LazyModule.__getattr__(), line 228: self.__class__ = types.ModuleType. The __class__ meta attribute is replaced BEFORE actual module is loaded. This creates a race condition if module attribute is requested concurrently from another thread.

Agreed, this is a CPython bug:

import importlib
import threading
import sys

spec = importlib.util.find_spec('http')
module = importlib.util.module_from_spec(spec)
http = sys.modules['http'] = module

loader = importlib.util.LazyLoader(spec.loader)                                                     
loader.exec_module(module)

def check(): 
    return http.HTTPStatus.ACCEPTED == 202                                                          

for _ in range(5):  
    threading.Thread(target=check).start()

This looks painful to fix, since attempting to attach a lock to the _LazyModule class or instance would result in a recursive call to __getattribute__. Avoiding that is why self.__class__ is replaced before doing anything else. A global lock or a dict[ModuleType, Lock] table are the only things I can think of.

stefanv commented 5 months ago

Thanks for investigating, @effigies! Adding time.sleep(0.2) into check makes failure more reliable for me.

stefanv commented 5 months ago

Can we simply lock up exec_module?

import importlib
import threading
import sys
import time

spec = importlib.util.find_spec('http')
module = importlib.util.module_from_spec(spec)
http = sys.modules['http'] = module

lock = threading.Lock()

def lock_func(f):
    def locked(*args, **kwargs):
        with lock:
            return f(*args, **kwargs)
    return locked

loader = importlib.util.LazyLoader(spec.loader)
loader.exec_module = lock_func(loader.exec_module)
loader.exec_module(module)

def check():
    time.sleep(0.2)
    return http.HTTPStatus.ACCEPTED == 202

for _ in range(5):
    threading.Thread(target=check).start()
effigies commented 5 months ago

I don't think so, because exec_module happens at lazy load time. The race condition happens at the actual attribute lookup.

stefanv commented 5 months ago

Yes, but look at the above example, which wraps exec_module in a lock.

effigies commented 5 months ago
In [1]: import importlib
   ...: import threading
   ...: import sys
   ...: import time
   ...: 
   ...: spec = importlib.util.find_spec('http')
   ...: module = importlib.util.module_from_spec(spec)
   ...: http = sys.modules['http'] = module
   ...: 
   ...: lock = threading.Lock()
   ...: 
   ...: def lock_func(f):
   ...:     def locked(*args, **kwargs):
   ...:         with lock:
   ...:             return f(*args, **kwargs)
   ...:     return locked
   ...: 
   ...: loader = importlib.util.LazyLoader(spec.loader)
   ...: loader.exec_module = lock_func(loader.exec_module)
   ...: loader.exec_module(module)
   ...: 
   ...: 
   ...: def check():
   ...:     time.sleep(0.2)
   ...:     return http.HTTPStatus.ACCEPTED == 202
   ...: 
   ...: for _ in range(5):
   ...:     threading.Thread(target=check).start()
   ...: 

Exception in thread Thread-2 (check):
Traceback (most recent call last):
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-1-1b0c7fd1f576>", line 25, in check
Exception in thread Thread-5 (check):
Traceback (most recent call last):
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/chris/mambaforge/envs/default/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-1-1b0c7fd1f576>", line 25, in check
AttributeError: module 'http' has no attribute 'HTTPStatus'
AttributeError: module 'http' has no attribute 'HTTPStatus'
stefanv commented 5 months ago

Hrm, yes, looks like I have to use lock_func(spec.loader.exec_module), but not sure how that alters behavior.

effigies commented 5 months ago

The loader isn't where the race is. I might be wrong, but I don't think that will protect the actual critical operations.

stefanv commented 5 months ago

Right, so we'll have to ensure "attribute access + lazy loading" becomes an atomic operation.

effigies commented 5 months ago

I went ahead and opened https://github.com/python/cpython/issues/114763. I think I have a fix.

effigies commented 5 months ago

@eindenbom If you're prepared to build your whole dependency tree for 3.13-dev, you could test out https://github.com/python/cpython/issues/114781. I suspect it can be rebased on the 3.12 branch as well, if you'd prefer.

stefanv commented 4 months ago

Thanks to @effigies, this issue has been addressed by https://github.com/python/cpython/pull/114781.

@eindenbom This should soon be backported to the 3.11 and 3.12 source trees.

effigies commented 4 months ago

Should be out in 3.11.9 (https://github.com/python/cpython/pull/115871 - ETA April 1) and 3.12.3 (https://github.com/python/cpython/pull/115870 - ETA April 9).

stefanv commented 4 months ago

@effigies Does this affect / make unnecessary https://github.com/scientific-python/lazy_loader/pull/90?

effigies commented 4 months ago

No, my read is that these are two independent races.