zarr-developers / zarr-python

An implementation of chunked, compressed, N-dimensional arrays for Python.
https://zarr.readthedocs.io
MIT License

Tests fail with zlib-ng #1678

Open QuLogic opened 7 months ago

QuLogic commented 7 months ago

Zarr version

2.16.1

Numcodecs version

0.12.1

Python Version

3.12

Operating System

Fedora Rawhide/40

Installation

from source

Description

Fedora 40 is transitioning to zlib-ng, an ABI-compatible replacement for zlib that is parallel and optimized for current processors. Unfortunately, its compressed output may differ byte-for-byte from zlib's (though, as I understand it, with a similar compression ratio). This causes several tests to fail, because they compare against the exact hex digest of the stored result.

I'm not sure whether I should just start skipping these tests or try to update them in some way.
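To illustrate why an exact digest is fragile here, the following is a minimal sketch using only Python's standard library. It does not use zlib-ng itself; two different compression levels stand in for "two compressors that emit different but equally valid streams", which is an assumption made purely for demonstration. The helper name `sha1_hex` and the sample `payload` are hypothetical.

```python
import hashlib
import zlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 hex digest of a bytes object."""
    return hashlib.sha1(data).hexdigest()


# Sample data; any compressible payload would do.
payload = bytes(range(256)) * 64

# Two valid zlib streams for the same input. Different levels are used
# here only as a stand-in for different zlib implementations.
stream_a = zlib.compress(payload, level=1)
stream_b = zlib.compress(payload, level=9)

# Digests of the compressed bytes depend on the exact stream produced.
compressed_digests_match = sha1_hex(stream_a) == sha1_hex(stream_b)

# Digests of the decompressed bytes depend only on the original data.
roundtrip_digests_match = (
    sha1_hex(zlib.decompress(stream_a)) == sha1_hex(zlib.decompress(stream_b))
)

print("compressed digests match:", compressed_digests_match)
print("round-trip digests match:", roundtrip_digests_match)
```

A test that hashes the stored (compressed) bytes is sensitive to the first comparison, while a test that hashes the decoded data is only sensitive to the second.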

Steps to reproduce

On a Fedora Rawhide or Fedora 40 container, install dependencies, then install zarr and run tests:

$ podman run --rm -it fedora:rawhide
# dnf install -y python3-devel 'python3dist(bsddb3)' 'python3dist(fsspec)' 'python3dist(h5py)' 'python3dist(lmdb)' 'python3dist(msgpack)' 'python3dist(pytest)' 'python3dist(asciitree)' 'python3dist(numcodecs)' 'python3dist(fasteners)'
# pip install zarr
# pytest --pyargs zarr

Additional output

Failing test output

```pytb
___________________________ TestArray.test_hexdigest ___________________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
__________________ TestArrayWithDirectoryStore.test_hexdigest __________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
_______________ TestArrayWithNestedDirectoryStore.test_hexdigest _______________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['d174aa384e6...7759ca725551'] == ['d174aa384e6...7759ca725551']
E         At index 3 diff: '719a88b34e362ff65df30e8f8810c1146ab72bc1' != '42d9c96e60ed22346c4671bc5bec32a2078ce25c'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
_____________________ TestArrayWithN5Store.test_hexdigest ______________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['8811a77d54c...d30c80dfc372'] == ['8811a77d54c...d30c80dfc372']
E         At index 3 diff: '568f9f837e4b682a3819cb122988e2eebeb6572b' != 'ea7d9e80211679291141840b111775b088e51480'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
____________________ TestArrayWithN5FSStore.test_hexdigest _____________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['8811a77d54c...d30c80dfc372'] == ['8811a77d54c...d30c80dfc372']
E         At index 3 diff: '568f9f837e4b682a3819cb122988e2eebeb6572b' != 'ea7d9e80211679291141840b111775b088e51480'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
_____________________ TestArrayWithDBMStore.test_hexdigest _____________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
________________ TestArrayWithDBMStoreBerkeleyDB.test_hexdigest ________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
____________________ TestArrayWithLMDBStore.test_hexdigest _____________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
________________ TestArrayWithLMDBStoreNoBuffers.test_hexdigest ________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
___________________ TestArrayWithSQLiteStore.test_hexdigest ____________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
_____________________ TestArrayWithFilters.test_hexdigest ______________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['b80367c5599...d33bce52501e'] == ['b80367c5599...d33bce52501e']
E         At index 3 diff: 'c649ad229bc5720258b934ea958570c2f354c2eb' != '1e053b6ad7dc58de7b1f5dad7fb45851f6b7b3ee'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
__________________ TestArrayWithCustomMapping.test_hexdigest ___________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
________________ TestArrayWithCustomMapping.test_nbytes_stored _________________

self =

    def test_nbytes_stored(self):
        z = self.create_array(shape=1000, chunks=100)
        assert 245 == z.nbytes_stored
        z[:] = 42
>       assert 515 == z.nbytes_stored
E       assert 515 == 485
E        +  where 485 = .nbytes_stored

zarr/tests/test_core.py:2315: AssertionError
_______________________ TestArrayNoCache.test_hexdigest ________________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
____________________ TestArrayWithStoreCache.test_hexdigest ____________________

self =

    def test_hexdigest(self):
        found = []

        # Check basic 1-D array
        z = self.create_array(shape=(1050,), chunks=100, dtype="
        assert self.expected() == found
E       AssertionError: assert ['063b02ff8d9...05027a07d209'] == ['063b02ff8d9...05027a07d209']
E         At index 3 diff: '14470724dca6c1837edddedc490571b6a7f270bc' != 'f3f04f0e30844739d34ef8a9eee6c949a47840b8'
E         Use -v to get more diff

zarr/tests/test_core.py:667: AssertionError
```

QuLogic commented 7 months ago

Oh, actually, it appears you don't even need to build from source, as wheels use system zlib, so a shorter reproducer:

$ podman run --rm -it fedora:rawhide
# dnf install -y python3-pip
# pip install zarr msgpack pytest
# pytest --pyargs zarr
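
Before running the tests, it can help to confirm which zlib the interpreter is actually linked against. The sketch below relies on two assumptions that are not confirmed in this thread: that the codec under test goes through Python's standard `zlib` module, and that a zlib-ng compatibility build reports a version string containing `zlib-ng`. The helper `looks_like_zlib_ng` is a hypothetical name.

```python
import zlib


def looks_like_zlib_ng(version: str) -> bool:
    """Heuristic: assume zlib-ng compat builds embed 'zlib-ng' in the version."""
    return "zlib-ng" in version.lower()


# Version of the zlib headers Python was compiled against.
print("compiled against:", zlib.ZLIB_VERSION)

# Version of the zlib library actually loaded at runtime.
print("running with:    ", zlib.ZLIB_RUNTIME_VERSION)

print("zlib-ng suspected:", looks_like_zlib_ng(zlib.ZLIB_RUNTIME_VERSION))
```

If the heuristic does not match on a given system, the printed runtime version string is still the useful piece of information to attach to a bug report.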

AdamWill commented 3 months ago

Sent #1971 and #1972 with one approach to fix this (just accept the existing values and the ones we get from zlib-ng on Fedora; I don't know how long-term stable these will be).
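
For readers who want a picture of what "accept both sets of values" could look like, here is a minimal sketch. It is not taken from #1971 or #1972 and may differ from what those PRs do; the names `ACCEPTED_INDEX_3` and `digest_is_accepted` are hypothetical. The two digests are the index-3 values from the `TestArray.test_hexdigest` failure quoted above.

```python
from typing import AbstractSet

# Digests for the same array, one per known zlib implementation.
ACCEPTED_INDEX_3 = frozenset(
    {
        "14470724dca6c1837edddedc490571b6a7f270bc",  # value the test expected
        "f3f04f0e30844739d34ef8a9eee6c949a47840b8",  # value found with zlib-ng
    }
)


def digest_is_accepted(found: str, accepted: AbstractSet[str]) -> bool:
    """Return True if the computed digest is one of the known-good variants."""
    return found in accepted


# Either implementation's digest passes; anything else still fails.
print(digest_is_accepted("f3f04f0e30844739d34ef8a9eee6c949a47840b8", ACCEPTED_INDEX_3))
print(digest_is_accepted("0" * 40, ACCEPTED_INDEX_3))
```

The trade-off is the one noted in the comment: the check stays strict against unexpected output, but the accepted set has to be extended whenever a zlib implementation changes its output.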