zarr-developers / numcodecs

A Python package providing buffer compression and transformation codecs for use in data storage and communication applications.
http://numcodecs.readthedocs.io
MIT License

Cannot import LZ4 #506

Open TheChymera opened 7 months ago

TheChymera commented 7 months ago

For context, though probably not relevant, I am trying to debug this issue with a package which uses numcodecs.

[deco]~ ❱ python
Python 3.11.7 (main, Dec 22 2023, 21:49:07) [GCC 13.2.1 20231216] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numcodecs
>>> from numcodecs.lz4 import LZ4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /usr/lib/python3.11/site-packages/numcodecs/lz4.cpython-311-x86_64-linux-gnu.so: undefined symbol: LZ4_compressBound
>>> numcodecs.__version__
'0.12.1'

I installed numcodecs via the Gentoo Linux package manager (full build and test log here). The lz4 test is indeed skipped:

SKIPPED [1] ../work/numcodecs-0.12.1-python3_11/install/usr/lib/python3.11/site-packages/numcodecs/tests/test_lz4.py:11: numcodecs.lz4 not available

But the build says the module is being built:

Compiling numcodecs/lz4.pyx because it changed.
[1/1] Cythonizing numcodecs/lz4.pyx

Any idea what might be going on here?

TheChymera commented 7 months ago

I had forgotten to include the link to the full build and test log. It's now here, and I have also corrected the link in the original post.

TheChymera commented 7 months ago

OK, so apparently the import fails if the package is built without LTO (-flto) in CFLAGS and with SSE2/AVX2 support:

This build (CFLAGS are "-march=native -O2 -pipe"), results in:

[deco]~ ❱ python -c "from numcodecs import lz4"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: /usr/lib/python3.11/site-packages/numcodecs/lz4.cpython-311-x86_64-linux-gnu.so: undefined symbol: LZ4_compressBound

This build (CFLAGS are "-march=native -O2 -pipe -flto=auto") works fine.

This build (CFLAGS are "-march=native -O2 -pipe" but it also has DISABLE_NUMCODECS_AVX2=1 and DISABLE_NUMCODECS_SSE2=1) also works fine.

I have no idea what this could be about. Do any of you know more?
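One way to narrow this down is to look at how the built extension module refers to the missing symbol. The sketch below is only an illustration: it assumes GNU `nm` is on the PATH, and `classify_symbol` and `inspect_extension` are hypothetical helper names, not part of numcodecs.

```python
import subprocess


def classify_symbol(nm_output: str, symbol: str) -> str:
    """Classify `symbol` in `nm -D` output as 'undefined', 'defined' or 'absent'."""
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # The symbol name is the last field; the type letter comes just before it.
        name, kind = fields[-1], fields[-2]
        # Drop a symbol-version suffix such as "@GLIBC_2.14" before comparing.
        if name.split("@", 1)[0] != symbol:
            continue
        # "U" marks a symbol the module expects some other library to provide.
        return "undefined" if kind == "U" else "defined"
    return "absent"


def inspect_extension(so_path: str, symbol: str = "LZ4_compressBound") -> str:
    """Run `nm -D` on a shared object (assumes GNU binutils) and classify `symbol`."""
    result = subprocess.run(
        ["nm", "-D", so_path], capture_output=True, text=True, check=True
    )
    return classify_symbol(result.stdout, symbol)
```

Calling `inspect_extension` on the `lz4.cpython-311-x86_64-linux-gnu.so` path from the traceback would show whether the symbol is compiled into the module or left undefined. An undefined entry is not an error in itself: it only breaks the import when none of the libraries the module is linked against (as listed by `ldd`) provides the symbol.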

joshmoore commented 7 months ago

cc: @jakirkham

leaver2000 commented 7 months ago

I was working through a similar issue earlier. In my experience, that error occurs when lz4.h is not found when the .pyx is compiled.

For me, adding the following to the extension resolved the issue:

libraries=["lz4"]

TheChymera commented 7 months ago

@leaver2000 what do you mean by “adding” — is this some sort of patch that could be applied to the source code to make it less fragile to compiler options?

leaver2000 commented 7 months ago

@TheChymera The error you described is identical to the one I ran into when the lz4_sources could not be found during setup:

lz4.cpython-311-x86_64-linux-gnu.so: undefined symbol: LZ4_compressBound

https://github.com/zarr-developers/numcodecs/blob/main/setup.py#L188

    extensions = [
        Extension('numcodecs.lz4',
                  sources=sources + lz4_sources,
                  include_dirs=include_dirs,
                  define_macros=define_macros,
                  extra_compile_args=extra_compile_args,
                  libraries=["lz4"],  # link against the system liblz4; adding this resolved my issue
                  ),
    ]

Maybe just chuck an assert into setup.py to ensure your compiler flags are not messing up the install.

assert lz4_sources != []
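A slightly more informative variant of that check could raise an error with a message instead of a bare assert. This is a minimal sketch, not the project's actual setup.py: `require_sources` is a hypothetical helper, and the glob pattern in the usage comment is an assumption about where the bundled LZ4 sources live.

```python
from glob import glob


def require_sources(pattern: str, what: str) -> list:
    """Return the files matching `pattern`, failing loudly if there are none."""
    sources = sorted(glob(pattern))
    if not sources:
        # Stop the build here rather than produce an extension module that
        # later fails to import with an undefined-symbol error.
        raise RuntimeError(
            f"no {what} sources matched {pattern!r}; "
            "check that the bundled sources are present in the build tree"
        )
    return sources


# Hypothetical usage inside setup.py (the path is an assumption):
# lz4_sources = require_sources("c-blosc/internal-complibs/lz4*/*.c", "LZ4")
```

Unlike `assert`, the explicit exception is not stripped when Python runs with `-O`, and the message tells the packager which pattern came up empty.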