Closed james-s-willis closed 3 years ago
How does this interact with h5py and the libhdf5 versions/binaries? There is this long-standing issue that h5py and bitshuffle need to be linked to the same version of HDF5, meaning both usually need to be built from source instead of using wheels. Is that fixed now?
That was just fixed in #81, which is why we can now do binary wheels.
Nice! (I'm clearly not paying close enough attention).
Automating it so that it uploads upon tagging seems pretty convenient!
Yeah, thanks to @t20100 and @james-s-willis for doing all of the work on that one, it's going to be a wonderful new world of speedy installs and fewer confused students!
Indeed, that was a huge pain in the ass.
@james-s-willis it's actually the mismatch I would like to see tested, e.g. headers used for building are 1.10.7, installed version used at run time is 1.8.11.
Basically something that checks that the wheels really are independent of the version of HDF5 installed.
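For illustration, a minimal sketch of the kind of guard such a test could start with: it only confirms that the build-time and run-time HDF5 versions really differ, so that a passing test suite means something. The helper names are hypothetical and not part of bitshuffle; the run-time string could, for example, be read from h5py.version.hdf5_version, and the build-time string would have to be supplied by the CI configuration.

```python
def parse_version(text):
    """Turn '1.10.7' or '1.10.0-patch1' into a tuple of ints such as (1, 10, 7)."""
    numeric = text.split("-")[0]  # drop suffixes like '-patch1'
    return tuple(int(part) for part in numeric.split("."))


def is_mismatched(build_version, runtime_version):
    """Return True when the build-time and run-time HDF5 versions differ."""
    return parse_version(build_version) != parse_version(runtime_version)


# Hypothetical usage inside a test: skip or fail if the versions match,
# because then the run says nothing about version independence.
# assert is_mismatched("1.10.7", "1.8.11")
```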
@jrs65 Oh I see what you mean now. I'll see if I can set that test up.
@jrs65, so I build the wheels with HDF5 1.10.7. I then install HDF5 1.8.15 prior to the unit tests. With CPython 3.6 all tests pass, but with CPython 3.7 I get this error in test_h5filter.py, which complains about an incorrect datatype (ValueError: Unable to create dataset (not a datatype)):
2021-07-05T20:49:43.2284868Z =================================== FAILURES ===================================
2021-07-05T20:49:43.2285394Z ____________________________ TestFilter.test_filter ____________________________
2021-07-05T20:49:43.2285732Z
2021-07-05T20:49:43.2286258Z self = <test_h5filter.TestFilter testMethod=test_filter>
2021-07-05T20:49:43.2286696Z
2021-07-05T20:49:43.2287042Z def test_filter(self):
2021-07-05T20:49:43.2287446Z shape = (32 * 1024 + 783,)
2021-07-05T20:49:43.2287797Z chunks = (4 * 1024 + 23,)
2021-07-05T20:49:43.2288181Z dtype = np.int64
2021-07-05T20:49:43.2288590Z data = np.arange(shape[0])
2021-07-05T20:49:43.2289059Z fname = "tmp_test_filters.h5"
2021-07-05T20:49:43.2289503Z f = h5py.File(fname, "w")
2021-07-05T20:49:43.2289931Z h5.create_dataset(
2021-07-05T20:49:43.2290296Z f,
2021-07-05T20:49:43.2290611Z b"range",
2021-07-05T20:49:43.2290962Z shape,
2021-07-05T20:49:43.2291292Z dtype,
2021-07-05T20:49:43.2291637Z chunks,
2021-07-05T20:49:43.2292029Z filter_pipeline=(32008, 32000),
2021-07-05T20:49:43.2292610Z filter_flags=(h5z.FLAG_MANDATORY, h5z.FLAG_MANDATORY),
2021-07-05T20:49:43.2293143Z > filter_opts=None,
2021-07-05T20:49:43.2293502Z )
2021-07-05T20:49:43.2293705Z
2021-07-05T20:49:43.2294279Z /project/tests/test_h5filter.py:33:
2021-07-05T20:49:43.2294747Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-05T20:49:43.2295307Z bitshuffle/h5.pyx:186: in bitshuffle.h5.create_dataset
2021-07-05T20:49:43.2295832Z ???
2021-07-05T20:49:43.2296319Z h5py/_objects.pyx:54: in h5py._objects.with_phil.wrapper
2021-07-05T20:49:43.2296818Z ???
2021-07-05T20:49:43.2297302Z h5py/_objects.pyx:55: in h5py._objects.with_phil.wrapper
2021-07-05T20:49:43.2297801Z ???
2021-07-05T20:49:43.2298150Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-05T20:49:43.2298390Z
2021-07-05T20:49:43.2298673Z > ???
2021-07-05T20:49:43.2299140Z E ValueError: Unable to create dataset (not a datatype)
2021-07-05T20:49:43.2299621Z
2021-07-05T20:49:43.2299983Z h5py/h5d.pyx:87: ValueError
2021-07-05T20:49:43.2300513Z _______________________ TestFilter.test_with_block_size ________________________
2021-07-05T20:49:43.2300879Z
2021-07-05T20:49:43.2301426Z self = <test_h5filter.TestFilter testMethod=test_with_block_size>
2021-07-05T20:49:43.2301888Z
2021-07-05T20:49:43.2302247Z def test_with_block_size(self):
2021-07-05T20:49:43.2302661Z shape = (128 * 1024 + 783,)
2021-07-05T20:49:43.2303016Z chunks = (4 * 1024 + 23,)
2021-07-05T20:49:43.2303404Z dtype = np.int64
2021-07-05T20:49:43.2303813Z data = np.arange(shape[0])
2021-07-05T20:49:43.2304272Z fname = "tmp_test_filters.h5"
2021-07-05T20:49:43.2304715Z f = h5py.File(fname, "w")
2021-07-05T20:49:43.2305141Z h5.create_dataset(
2021-07-05T20:49:43.2305496Z f,
2021-07-05T20:49:43.2305824Z b"range",
2021-07-05T20:49:43.2306161Z shape,
2021-07-05T20:49:43.2306510Z dtype,
2021-07-05T20:49:43.2307140Z chunks,
2021-07-05T20:49:43.2307558Z filter_pipeline=(32008, 32000),
2021-07-05T20:49:43.2308142Z filter_flags=(h5z.FLAG_MANDATORY, h5z.FLAG_MANDATORY),
2021-07-05T20:49:43.2308660Z > filter_opts=((680,), ()),
2021-07-05T20:49:43.2309017Z )
2021-07-05T20:49:43.2309206Z
2021-07-05T20:49:43.2309642Z /project/tests/test_h5filter.py:59:
2021-07-05T20:49:43.2310114Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-05T20:49:43.2310676Z bitshuffle/h5.pyx:186: in bitshuffle.h5.create_dataset
2021-07-05T20:49:43.2311195Z ???
2021-07-05T20:49:43.2311684Z h5py/_objects.pyx:54: in h5py._objects.with_phil.wrapper
2021-07-05T20:49:43.2312182Z ???
2021-07-05T20:49:43.2312667Z h5py/_objects.pyx:55: in h5py._objects.with_phil.wrapper
2021-07-05T20:49:43.2313163Z ???
2021-07-05T20:49:43.2313493Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-05T20:49:43.2313752Z
2021-07-05T20:49:43.2314034Z > ???
2021-07-05T20:49:43.2314499Z E ValueError: Unable to create dataset (not a datatype)
2021-07-05T20:49:43.2314890Z
2021-07-05T20:49:43.2315254Z h5py/h5d.pyx:87: ValueError
2021-07-05T20:49:43.2315809Z _______________________ TestFilter.test_with_compression _______________________
2021-07-05T20:49:43.2316190Z
2021-07-05T20:49:43.2316756Z self = <test_h5filter.TestFilter testMethod=test_with_compression>
2021-07-05T20:49:43.2317223Z
2021-07-05T20:49:43.2317623Z def test_with_compression(self):
2021-07-05T20:49:43.2318046Z shape = (128 * 1024 + 783,)
2021-07-05T20:49:43.2318415Z chunks = (4 * 1024 + 23,)
2021-07-05T20:49:43.2318794Z dtype = np.int64
2021-07-05T20:49:43.2319202Z data = np.arange(shape[0])
2021-07-05T20:49:43.2319671Z fname = "tmp_test_filters.h5"
2021-07-05T20:49:43.2320117Z f = h5py.File(fname, "w")
2021-07-05T20:49:43.2320547Z h5.create_dataset(
2021-07-05T20:49:43.2320907Z f,
2021-07-05T20:49:43.2321238Z b"range",
2021-07-05T20:49:43.2321572Z shape,
2021-07-05T20:49:43.2321920Z dtype,
2021-07-05T20:49:43.2322251Z chunks,
2021-07-05T20:49:43.2322744Z filter_pipeline=(32008,),
2021-07-05T20:49:43.2323246Z filter_flags=(h5z.FLAG_MANDATORY,),
2021-07-05T20:49:43.2323770Z > filter_opts=((0, h5.H5_COMPRESS_LZ4),),
2021-07-05T20:49:43.2324179Z )
2021-07-05T20:49:43.2324370Z
2021-07-05T20:49:43.2324811Z /project/tests/test_h5filter.py:86:
2021-07-05T20:49:43.2325272Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-05T20:49:43.2325845Z bitshuffle/h5.pyx:186: in bitshuffle.h5.create_dataset
2021-07-05T20:49:43.2326365Z ???
2021-07-05T20:49:43.2326852Z h5py/_objects.pyx:54: in h5py._objects.with_phil.wrapper
2021-07-05T20:49:43.2327346Z ???
2021-07-05T20:49:43.2327828Z h5py/_objects.pyx:55: in h5py._objects.with_phil.wrapper
2021-07-05T20:49:43.2328407Z ???
2021-07-05T20:49:43.2328737Z _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2021-07-05T20:49:43.2328995Z
2021-07-05T20:49:43.2329277Z > ???
2021-07-05T20:49:43.2329745Z E ValueError: Unable to create dataset (not a datatype)
2021-07-05T20:49:43.2330129Z
2021-07-05T20:49:44.4940352Z ##[error]Command ['sh', '-c', 'CI_BUILD_WHEEL=1 pytest /project/tests'] failed with code 1.
2021-07-05T20:49:44.4954398Z h5py/h5d.pyx:87: ValueError
2021-07-05T20:49:44.4954832Z
2021-07-05T20:49:44.4955223Z =========================== short test summary info ============================
2021-07-05T20:49:44.4956730Z FAILED ../project/tests/test_h5filter.py::TestFilter::test_filter - ValueErro...
2021-07-05T20:49:44.4957889Z FAILED ../project/tests/test_h5filter.py::TestFilter::test_with_block_size - ...
2021-07-05T20:49:44.4958717Z FAILED ../project/tests/test_h5filter.py::TestFilter::test_with_compression
2021-07-05T20:49:44.4959444Z =================== 3 failed, 57 passed, 1 skipped in 1.62s ====================
I followed this through to H5DCreate inside h5py but I can't work out the problem. The only difference I can see is in the h5py versions used: CPython 3.7 + h5py 3.3.0 vs CPython 3.6 + h5py 3.1.0. I pinned the version used with CPython 3.7 to h5py 3.1.0 but I get the same error. Maybe @kiyo-masui has seen this error before?
@jrs65, I've managed to fix the CI workflow file so that HDF5 1.10.7 is always installed before each bitshuffle wheel build and HDF5 1.8.11 is installed before each test suite is run. This config still passes all tests. I have also tried installing bitshuffle from the wheels generated on my lab machine running Ubuntu 18.04, Python 3.7, HDF5 1.10.0-patch1 and it passes the unit tests apart from test_h5plugin.py. Are you able to download the wheels from here: https://github.com/kiyo-masui/bitshuffle/actions/runs/1008960943 and try them on your machine, Richard?
@jrs65, Shiny has tested the wheels and they work for him. The last thing that needs to be done is to add a secrets.pypi_password to the repo so that the wheels can be uploaded to PyPI. I can't do that, so I was hoping @kiyo-masui could? There is a guide here: https://docs.github.com/en/actions/reference/encrypted-secrets
Then bitshuffle would need to be tagged to upload the wheels to PyPI after this PR is merged.
The contents of secrets.pypi_password is just my PyPI password? No username or anything else?
It won't let me add a secret named secrets.pypi_password because the "." isn't allowed in the name.
The secret's name should be pypi_password and it should contain an API token from the project's settings page on pypi.org (user: __token__).
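As a rough sketch of how that token ends up being used: twine reads its credentials from the TWINE_USERNAME and TWINE_PASSWORD environment variables, and token uploads use the literal username __token__. The snippet below assumes the workflow exposes the secret to the job as an environment variable called PYPI_PASSWORD; that variable name and the helper are illustrative only and not taken from the actual workflow file.

```python
import os


def twine_env(environ=None):
    """Build the environment twine needs for a token-based upload to PyPI."""
    environ = dict(os.environ if environ is None else environ)
    # PYPI_PASSWORD is an assumed name for how the workflow exposes the secret.
    token = environ.get("PYPI_PASSWORD")
    if not token:
        raise RuntimeError("PYPI_PASSWORD is not set")
    environ["TWINE_USERNAME"] = "__token__"  # fixed username for API tokens
    environ["TWINE_PASSWORD"] = token        # the API token itself
    return environ
```

The returned mapping could then be passed as the env argument of subprocess.run when invoking twine upload.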
Done.
Thanks @kiyo-masui! I'll merge this in now.
These changes automate building bitshuffle wheels for Linux x86.
How do you want the wheels uploaded to PyPI, @kiyo-masui? There are various options (https://cibuildwheel.readthedocs.io/en/stable/deliver-to-pypi/); you can even automate it and only upload on tagged versions of bitshuffle.
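To make the "only upload on tags" option concrete, here is a small sketch of the check involved. GitHub Actions sets the GITHUB_REF environment variable, and for tag pushes it starts with refs/tags/. The function name is hypothetical, and in practice this condition would more likely live in the workflow file than in a script.

```python
import os


def is_tag_build(ref=None):
    """Return True when the current CI run was triggered by a tag push."""
    if ref is None:
        # GITHUB_REF is provided by GitHub Actions; empty when run elsewhere.
        ref = os.environ.get("GITHUB_REF", "")
    return ref.startswith("refs/tags/")
```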