tonybaloney / pytest-freethreaded

Found concurrency errors in scikit-image #12

Open toshihikoyanase opened 1 month ago

toshihikoyanase commented 1 month ago

I hit a ConcurrencyError while testing the main branch of scikit-image with pytest-freethreaded. I'm not sure yet whether it is a bug or a problem with my environment; I'll investigate later.

$ PYTHON_GIL=0 pytest -vvvv --log-level=DEBUG -x --pyargs skimage
platform darwin -- Python 3.13.0rc2, pytest-8.3.3, pluggy-1.5.0
...
plugins: localserver-0.9.0, cov-5.0.0, freethreaded-0.1.0, doctestplus-1.2.1
collected 8771 items / 7 skipped

...
skimage/_shared/tests/test_coord.py::test_max_batch_size FAILED                                                                             [  0%]
...

_______________________________________________________________ test_max_batch_size _______________________________________________________________

    def test_max_batch_size():
        """Small batches are slow, large batches -> large allocations -> also slow.

        https://github.com/scikit-image/scikit-image/pull/6035#discussion_r751518691
        """
        coords = np.random.randint(low=0, high=1848, size=(40000, 2))
        tstart = time.time()
        ensure_spacing(coords, spacing=100, min_split_size=50, max_split_size=2000)
        dur1 = time.time() - tstart

        tstart = time.time()
        ensure_spacing(coords, spacing=100, min_split_size=50, max_split_size=20000)
        dur2 = time.time() - tstart

        # Originally checked dur1 < dur2 to assert that the default batch size was
        # faster than a much larger batch size. However, on rare occasion a CI test
        # case would fail with dur1 ~5% larger than dur2. To be more robust to
        # variable load or differences across architectures, we relax this here.
>       assert dur1 < 1.33 * dur2
E       AssertionError

skimage/_shared/tests/test_coord.py:79: AssertionError

The above exception was the direct cause of the following exception:

item = <Function test_max_batch_size>

    @pytest.hookimpl()
    def pytest_runtest_call(item: pytest.Item):
        # Try item.runtest()
        config_threads = item.config.option.threads
        config_iterations = item.config.option.iterations
        freethreaded_mark = item.get_closest_marker(name="freethreaded")

        if freethreaded_mark:
            threads = freethreaded_mark.kwargs.get("threads", config_threads)
            iterations = freethreaded_mark.kwargs.get("iterations", config_iterations)
        else:
            iterations = config_iterations
            threads = config_threads

        logger.debug("Running test %s", item.name)
        executor = ThreadPoolExecutor(max_workers=threads)
        barrier = threading.Barrier(threads)
        last_round = iterations % threads
        last_barrier = threading.Barrier(last_round) if last_round else None
        results = list(
            executor.map(
                get_one_result,
                repeat(item, iterations),
                chain(
                    repeat(barrier, iterations - last_round),
                    repeat(last_barrier, last_round),
                ),
            )
        )
        exceptions = [r for r in results if isinstance(r, Exception)]
        if not exceptions:
            return results[0]
        if len(exceptions) == len(results):
            raise results[0]
>       raise ConcurrencyError(
            iterations=iterations, failures=len(exceptions), threads=threads
        ) from exceptions[0]
E       pytest_freethreaded.plugin.ConcurrencyError: 11 failures in 200 iterations across 10 threads

../venv-skimage/lib/python3.13t/site-packages/pytest_freethreaded/plugin.py:103: ConcurrencyError
---------------------------------------------------------------- Captured log call ----------------------------------------------------------------
DEBUG    pytest_freethreaded.plugin:plugin.py:83 Running test test_max_batch_size
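
For reference, the scheduling in the `pytest_runtest_call` hook shown in the traceback can be reproduced in isolation. The sketch below copies only the barrier arithmetic from that hook; `barrier_plan` is a hypothetical helper name, and it assumes that `get_one_result` (not shown in the traceback) waits on the barrier it receives before running the test item.

```python
import threading
from itertools import chain, repeat


def barrier_plan(iterations: int, threads: int) -> list[int]:
    """Return, per iteration, the party count of the barrier it would wait on.

    Mirrors the arithmetic in the hook from the traceback; hypothetical helper.
    """
    last_round = iterations % threads
    barrier = threading.Barrier(threads)
    # A smaller barrier is only created when the iterations do not divide evenly.
    last_barrier = threading.Barrier(last_round) if last_round else None
    plan = chain(
        repeat(barrier, iterations - last_round),
        repeat(last_barrier, last_round),
    )
    return [b.parties for b in plan]


# Values from the failure message: 200 iterations across 10 threads.
plan = barrier_plan(iterations=200, threads=10)
print(len(plan), set(plan))  # 200 {10}
```

If that assumption holds, every iteration in this run shares one 10-party barrier, so the timing-sensitive body of `test_max_batch_size` would start in groups of 10 at the same moment. That may be relevant to the `dur1 < 1.33 * dur2` assertion, though this has not been verified here.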

Setup

I created the virtual environment outside the scikit-image repository to avoid build errors related to numpy-inc-dir.

python3t -m venv venv-skimage
. venv-skimage/bin/activate

Then I cloned the main branch of scikit-image and followed the official installation guide:

git clone git@github.com:scikit-image/scikit-image.git

# Install the dependencies
python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple numpy
python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple scipy
python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple cython
python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple pillow

python -m pip install -r requirements/build.txt
python -m pip install -r requirements.txt
spin install -v

toshihikoyanase commented 1 month ago

Without PYTHON_GIL=0, test collection failed with a different error:

______________________________________________ ERROR collecting skimage/_shared/tests/test_coord.py _______________________________________________
skimage/_shared/tests/test_coord.py:5: in <module>
    from scipy.spatial.distance import pdist, minkowski
../venv-skimage/lib/python3.13t/site-packages/scipy/spatial/__init__.py:110: in <module>
    from ._kdtree import *
../venv-skimage/lib/python3.13t/site-packages/scipy/spatial/_kdtree.py:4: in <module>
    from ._ckdtree import cKDTree, cKDTreeNode
scipy/spatial/_ckdtree.pyx:11: in init scipy.spatial._ckdtree
    ???
../venv-skimage/lib/python3.13t/site-packages/scipy/sparse/__init__.py:307: in <module>
    from . import csgraph
../venv-skimage/lib/python3.13t/site-packages/scipy/sparse/csgraph/__init__.py:187: in <module>
    from ._laplacian import laplacian
../venv-skimage/lib/python3.13t/site-packages/scipy/sparse/csgraph/_laplacian.py:7: in <module>
    from scipy.sparse.linalg import LinearOperator
../venv-skimage/lib/python3.13t/site-packages/scipy/sparse/linalg/__init__.py:129: in <module>
    from ._isolve import *
../venv-skimage/lib/python3.13t/site-packages/scipy/sparse/linalg/_isolve/__init__.py:4: in <module>
    from .iterative import *
../venv-skimage/lib/python3.13t/site-packages/scipy/sparse/linalg/_isolve/iterative.py:5: in <module>
    from scipy.linalg import get_lapack_funcs
../venv-skimage/lib/python3.13t/site-packages/scipy/linalg/__init__.py:203: in <module>
    from ._misc import *
../venv-skimage/lib/python3.13t/site-packages/scipy/linalg/_misc.py:3: in <module>
    from .blas import get_blas_funcs
../venv-skimage/lib/python3.13t/site-packages/scipy/linalg/blas.py:213: in <module>
    from scipy.linalg import _fblas
E   RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'scipy.linalg._fblas', which has not declared that it can run safely without the GIL. To override this behavior and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0.
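
The warning above says the interpreter re-enabled the GIL while importing `scipy.linalg._fblas`. A minimal sketch for checking the interpreter state from inside the same virtual environment follows; `gil_status` is a hypothetical helper name, while `sys._is_gil_enabled()` (available from Python 3.13) and the `Py_GIL_DISABLED` config variable are the standard CPython ways to query this.

```python
import sys
import sysconfig


def gil_status() -> dict:
    """Report whether this is a free-threaded build and whether the GIL is active."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in Python 3.13; on older versions the
    # GIL is always enabled, so fall back to True.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


print(gil_status())
```

Calling it once before and once after `import scipy.linalg` should show whether that import is what switches the GIL back on in a given run.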

corona10 commented 1 month ago

Out of curiosity, which SciPy version are you using?
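
One way to answer this for the whole environment at once is to read the installed distribution metadata. This is a sketch rather than part of the original report: `installed_versions` is a hypothetical helper, and the distribution names in the default tuple are assumed to match what pip installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(
    names=("numpy", "scipy", "scikit-image", "pytest-freethreaded"),
) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment.
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver}")
```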

corona10 commented 1 month ago

Ah, if it doesn't work with the wheels from https://anaconda.org/scientific-python-nightly-wheels/scikit-image/, it's worth reporting.