Closed: shenker closed this issue 2 years ago
Operating System: Linux 3.10.0-1160.62.1.el7.x86_64 (this is on HMS O2 with conda-forge python installed via mamba)
ohhhh, you're running on centos. Yeah, support for centos has been very hard, and I suspect that's the actual issue here. I was able to get it to build and ship for CentOS on conda forge, so you could use their docker setup if you want, but I've never gotten it to compile correctly locally.
I can try to take a crack at it on O2... but in general, haven't been developing there.
Gotcha.
I can believe that the undefined symbols (problem 2) is probably entirely a CentOS issue. I was hoping that installing all the relevant libraries (gcc, libstdcxx-ng, etc.) into a virtualenv from conda-forge would offer a way around the issue, if I can find the right versions of everything. This does appear to have been possible, at least a couple years ago. Some googling turned up:
It would be nice to have it build on O2, since that's where all my big ND2 files are (and where I can run dask distributed). I'm assuming the way to go about doing this is: 1) play around and see if we can find versions of libstdcxx, libtiff, etc. that make things work 2) if that fails, just use the conda-forge docker setup via singularity on O2 (I've never used singularity before)
It's less obvious to me why CentOS would cause the SDK .so files to be not installed at all (problem 1). I'm very confused by this and don't know quite how to attack it.
If you (or anyone) wants to dig into this, I think digging through the logs for the successful centos builds using the conda-forge CI would be a good start. The last one is here: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=474173&view=results
Note the build & host requirements in the recipe ... all the env vars during build here. After building, I think there is still an important (re)localization step, and I think it starts around here.
Note that most of the difficulty here comes from the fact that we're using precompiled SDK shared library files provided by LIM ... so library linking and dependencies are more complicated as a result.
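For anyone digging into the linking side, here is a minimal sketch of how a single precompiled shared library could be probed for loader errors from Python. `try_load` is a hypothetical helper (not part of nd2), and the path in the usage note is only an assumed location. As far as I know, ctypes asks the loader to resolve symbols eagerly on Linux, so an undefined symbol should show up in the returned message.

```python
import ctypes


def try_load(path):
    """Try to dlopen a shared library.

    Returns None if the library loads, otherwise the loader's error text
    (for example a missing dependency or an undefined symbol).
    """
    try:
        ctypes.CDLL(str(path))
    except OSError as exc:
        return str(exc)
    return None
```

Calling `try_load("path/to/liblimfile.so")` with the real location of the library, once with and once without `LD_LIBRARY_PATH=$CONDA_PREFIX/lib` set, would show whether the undefined symbols depend on which libstdc++ gets picked up.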
Finally got around to playing around with this more. I can run pytest locally on O2 with nd2 from conda-forge with all tests passing, and I can build nd2 on O2 and get all tests to pass except those in tests/test_dask_dispatch.py and tests/test_segfaults.py. (See MWEs below.) There the memmaps don't seem to be closed when they should be, and I cannot fathom why it works for the conda-forge package but doesn't when I build myself on O2. Unless you recognize the BufferErrors I'm seeing, I think I'm going to give up (being able to run most of the test suite is good enough).
mamba env create -n nd2test
mamba activate nd2test
mamba install -c conda-forge cython black flake8 flake8-docstrings imagecodecs aicsimageio ipython isort mypy pre-commit psutil pydocstyle pytest pytest-cov wurlitzer xarray resource_backed_dask_array python=3.10.4 requests
git clone git@github.com:tlambert03/nd2.git
cd nd2
python scripts/download_samples.py
make
pip install --no-deps -e .
LD_LIBRARY_PATH=$CONDA_PREFIX/lib pytest
This gives 512 errors, all of them either BufferError or AssertionError. I rerun with pytest tests/test_segfaults.py to show the error for a single test (it's the same as all the other failed tests):
============================= test session starts ==============================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/jqs1/projects/nd2, configfile: setup.cfg
plugins: cov-3.0.0
collected 525 items / 524 deselected / 1 selected
tests/test_segfaults.py FE [100%]
==================================== ERRORS ====================================
________________________ ERROR at teardown of test_seg _________________________
@pytest.fixture(autouse=True)
def no_files_left_open():
files_before = {p for p in psutil.Process().open_files() if p.path.endswith("nd2")}
yield
files_after = {p for p in psutil.Process().open_files() if p.path.endswith("nd2")}
> assert files_before == files_after == set()
E AssertionError: assert set() == {popenfile(pa...flags=557056)}
E Extra items in the right set:
E popenfile(path='/home/jqs1/projects/nd2/tests/data/jonas_header_test2.nd2', fd=16, position=0, mode='r', flags=557056)
E Use -v to get more diff
tests/conftest.py:46: AssertionError
=================================== FAILURES ===================================
___________________________________ test_seg ___________________________________
def test_seg():
with nd2.ND2File(str(DATA / "jonas_header_test2.nd2")) as f:
img = f.to_xarray(delayed=True, squeeze=False, position=0)
> a = img.compute()
tests/test_segfaults.py:13:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../mambaforge/envs/nd2/lib/python3.10/site-packages/xarray/core/dataarray.py:947: in compute
return new.load(**kwargs)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/xarray/core/dataarray.py:921: in load
ds = self._to_temp_dataset().load(**kwargs)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/xarray/core/dataset.py:861: in load
evaluated_data = da.compute(*lazy_data.values(), **kwargs)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/base.py:600: in compute
results = schedule(dsk, keys, **kwargs)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/threaded.py:81: in get
results = get_async(
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/local.py:508: in get_async
raise_exception(exc, tb)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/local.py:316: in reraise
raise exc
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/local.py:221: in execute_task
result = _execute_task(task, data)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/core.py:119: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/core.py:119: in <genexpr>
return func(*(_execute_task(a, cache) for a in args))
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/core.py:119: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/optimization.py:990: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/core.py:149: in get
result = _execute_task(task, cache)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/core.py:119: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/utils.py:41: in apply
return func(*args, **kwargs)
../../mambaforge/envs/nd2/lib/python3.10/site-packages/dask/array/core.py:514: in _pass_extra_kwargs
return func(*args[len(keys) :], **kwargs)
src/nd2/nd2file.py:351: in _dask_block
self.close()
src/nd2/nd2file.py:82: in close
self._rdr.close()
src/nd2/_sdk/latest.pyx:57: in nd2._sdk.latest.ND2Reader.close
cpdef close(self):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> self._mmap.close()
E BufferError: cannot close exported pointers exist
src/nd2/_sdk/latest.pyx:60: BufferError
=========================== short test summary info ============================
FAILED tests/test_segfaults.py::test_seg - BufferError: cannot close exported...
ERROR tests/test_segfaults.py::test_seg - AssertionError: assert set() == {po...
================== 1 failed, 524 deselected, 1 error in 3.45s =================
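The BufferError at the bottom of that traceback can be reproduced with the standard library alone. The sketch below is not nd2's code: it uses an anonymous `mmap` and a plain `memoryview` as stand-ins (both are assumptions for illustration) to show that closing a map fails while something still holds a buffer exported from it, and succeeds once that buffer is released.

```python
import mmap


def close_with_live_view():
    """Close an mmap while a memoryview of it is still alive.

    Returns (error_message, closed_afterwards).
    """
    buf = mmap.mmap(-1, 16)   # anonymous 16-byte map, no file needed
    view = memoryview(buf)    # exports a pointer into the map
    try:
        buf.close()           # refused while the export exists
    except BufferError as exc:
        message = str(exc)
    else:
        message = None
    view.release()            # drop the exported pointer
    buf.close()               # now the close goes through
    return message, buf.closed
```

This also explains the AssertionError in the teardown: if the close is refused, the file handle stays open, which is what the no_files_left_open fixture checks for.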
To determine whether this is a compile-time or run-time problem, I want to verify that the conda-forge version passes all tests when running on CentOS (it does):
mamba env create -n nd2test
mamba activate nd2test
mamba install -c conda-forge python=3.10.4 nd2 aicsimageio xarray resource_backed_dask_array pytest requests
git clone git@github.com:tlambert03/nd2.git
cd nd2
python scripts/download_samples.py
pytest
gives
============================= test session starts ==============================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /home/jqs1/projects/nd2, configfile: setup.cfg
collected 525 items
tests/test_aicsimage.py .......... [ 1%]
tests/test_dask_dispatch.py ........ [ 3%]
tests/test_reader.py ................................................... [ 13%]
........................................................................ [ 26%]
...................................................sss.................. [ 40%]
........................................................................ [ 54%]
...................................ssssss............................... [ 68%]
............................................xxxx.x....x...xx............ [ 81%]
....................x............. [ 88%]
tests/test_readme.py . [ 88%]
tests/test_rescue.py . [ 88%]
tests/test_sdk.py ...................................................... [ 98%]
..... [ 99%]
tests/test_segfaults.py . [100%]
============= 507 passed, 9 skipped, 9 xfailed in 63.28s (0:01:03) =============
E BufferError: cannot close exported pointers exist
this is what I just fixed in https://github.com/tlambert03/nd2/pull/55
are you by chance using a main that is more than 6 days old?
Oh, I'm an idiot. All tests pass. Thanks Talley!
Nice! That was easy 😂
Description
I think I am running into two separate issues: 1) When building locally, either with pip install or pip install -e, the shared library is not installed correctly (related to https://github.com/tlambert03/nd2/issues/24). 2) Even when I run pytest from the root of the repo, to work around the previous problem, liblimfile.so complains about undefined symbols.
I spent a couple hours playing around and am rapidly running against the limits of my understanding of modern Python packaging, so any assistance or hints you could provide @tlambert03 would be much appreciated!
What I Did
(installing gcc in the virtualenv is necessary because the default gcc on HMS O2 is ancient)
If I then run pytest from the root of the repo, I get the error:
If I then pip install aicsimageio and re-run pytest:
From the root of the repo, if I run python -c "from nd2._sdk import latest", I get:
If I cd outside the repo, the same command gives:
I was thinking this had something to do with the editable pip install, so I tried installing via pip install .[dev] (no -e). This produces the same results as above. Here's a clue: find /home/jqs1/mambaforge/envs/nd2test/lib/python3.10/site-packages/nd2/ -name "*.so" gives
Namely, the Nikon SDK shared libraries (liblimfile.so and libnd2readsdk-shared.so) aren't even being installed. I noticed that the paths in MANIFEST.in are wrong. I believe
should be
Unfortunately, this change does not appear to fix the problem; the Nikon shared libraries are still not installed.
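The same check as the find command can be scripted, which makes it easier to repeat after each packaging change. This is only a sketch: `find_shared_libs` and `missing_libs` are hypothetical helpers, and the expected library names are the two mentioned above.

```python
from pathlib import Path


def find_shared_libs(package_dir):
    """Return the sorted file names of every *.so below package_dir."""
    return sorted(p.name for p in Path(package_dir).rglob("*.so"))


def missing_libs(package_dir, expected):
    """Return the expected library names that were not found, sorted."""
    found = set(find_shared_libs(package_dir))
    return sorted(set(expected) - found)
```

For example, `missing_libs(site_packages / "nd2", ["liblimfile.so", "libnd2readsdk-shared.so"])` would return an empty list once both SDK libraries are actually being installed, where `site_packages` is a `Path` to the environment's site-packages directory.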