zarr-developers / numcodecs

A Python package providing buffer compression and transformation codecs for use in data storage and communication applications.
http://numcodecs.readthedocs.io
MIT License

zarr 2.18.3 test suite fails on numcodecs 0.14 (test_array_with_delta_filter) #653

Open bnavigator opened 5 days ago

bnavigator commented 5 days ago

The latest released zarr is still 2.18.3, but numcodecs is already at 0.14. They do not work together:

[   97s] _________________________ test_array_with_delta_filter _________________________
[   97s] [gw0] linux -- Python 3.12.7 /usr/bin/python3.12
[   97s]
[   97s]     def test_array_with_delta_filter():
[   97s]         # setup
[   97s]         astype = "u1"
[   97s]         dtype = "i8"
[   97s]         filters = [Delta(astype=astype, dtype=dtype)]
[   97s]         data = np.arange(100, dtype=dtype)
[   97s]
[   97s]         for compressor in compressors:
[   97s] >           a = array(data, chunks=10, compressor=compressor, filters=filters)
[   97s]
[   97s] zarr/tests/test_filters.py:40:
[   97s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[   97s] zarr/creation.py:444: in array
[   97s]     z[...] = data
[   97s] zarr/core.py:1449: in __setitem__
[   97s]     self.set_basic_selection(pure_selection, value, fields=fields)
[   97s] zarr/core.py:1545: in set_basic_selection
[   97s]     return self._set_basic_selection_nd(selection, value, fields=fields)
[   97s] zarr/core.py:1935: in _set_basic_selection_nd
[   97s]     self._set_selection(indexer, value, fields=fields)
[   97s] zarr/core.py:1988: in _set_selection
[   97s]     self._chunk_setitem(chunk_coords, chunk_selection, chunk_value, fields=fields)
[   97s] zarr/core.py:2261: in _chunk_setitem
[   97s]     self._chunk_setitem_nosync(chunk_coords, chunk_selection, value, fields=fields)
[   97s] zarr/core.py:2271: in _chunk_setitem_nosync
[   97s]     self.chunk_store[ckey] = self._encode_chunk(cdata)
[   97s] zarr/core.py:2387: in _encode_chunk
[   97s]     chunk = f.encode(chunk)
[   97s] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[   97s]
[   97s] self = Delta(dtype='<i8', astype='|u1')
[   97s] buf = array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
[   97s]
[   97s]     def encode(self, buf):
[   97s]         # normalise input
[   97s]         arr = ensure_ndarray(buf).view(self.dtype)
[   97s]
[   97s]         # flatten to simplify implementation
[   97s]         arr = arr.reshape(-1, order='A')
[   97s]
[   97s]         # setup encoded output
[   97s]         enc = np.empty_like(arr, dtype=self.astype)
[   97s]
[   97s]         # set first element
[   97s]         enc[0] = arr[0]
[   97s]
[   97s]         # compute differences
[   97s]         # using np.subtract for in-place operations
[   97s]         if arr.dtype == bool:
[   97s]             np.not_equal(arr[1:], arr[:-1], out=enc[1:])
[   97s]         else:
[   97s] >           np.subtract(arr[1:], arr[:-1], out=enc[1:])
[   97s] E           numpy._core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'subtract' output from dtype('int64') to dtype('uint8') with casting rule 'same_kind'
[   97s]
[   97s] /usr/lib64/python3.12/site-packages/numcodecs/delta.py:70: UFuncTypeError
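The same error can be reproduced with NumPy alone, outside zarr and numcodecs. The following is a minimal sketch that mirrors the lines of `Delta.encode` shown in the traceback, not the numcodecs source itself; the variable `error` is only there for illustration. It relies on the fact that ufuncs apply the `same_kind` casting rule to `out=` by default, and that int64 to uint8 is not a `same_kind` cast.

```python
import numpy as np

# Same dtypes as in the failing test: int64 input, uint8 encoded output.
arr = np.arange(10, dtype="i8")
enc = np.empty_like(arr, dtype="u1")
enc[0] = arr[0]

# int64 -> uint8 is not allowed under the ufunc default casting rule.
print(np.can_cast(np.dtype("i8"), np.dtype("u1"), casting="same_kind"))

error = None
try:
    # The subtraction runs in int64; writing the result into a uint8
    # `out` array is rejected under casting='same_kind'.
    np.subtract(arr[1:], arr[:-1], out=enc[1:])
except TypeError as exc:  # UFuncTypeError is a subclass of TypeError
    error = exc
    print(type(exc).__name__, exc)
```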

jakirkham commented 4 days ago

@ehgus could you please take a look?

ehgus commented 3 days ago

I will take a close look at the bug this weekend. Can you also tell me which NumPy version you used? Since the change passed the numcodecs test suite, we will probably need to add more tests to catch such bugs.

ehgus commented 3 days ago

@bnavigator Could you also check whether it is fixed by replacing np.subtract(arr[1:], arr[0:-1], out=enc[1:]) with enc[1:] = np.diff(arr)? np.diff was used in the previous version.
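For reference, a small sketch of why the `np.diff` form avoids the error: plain slice assignment casts the int64 differences to uint8 without applying the `same_kind` check that ufuncs use for `out=`. The second variant, passing `casting="unsafe"` to `np.subtract`, is an assumption added here for comparison and was not proposed in this thread; the names `enc_diff` and `enc_sub` are illustrative.

```python
import numpy as np

arr = np.arange(10, dtype="i8")

# Variant 1: the np.diff form. The differences are computed in int64,
# then slice assignment casts them to uint8.
enc_diff = np.empty_like(arr, dtype="u1")
enc_diff[0] = arr[0]
enc_diff[1:] = np.diff(arr)

# Variant 2 (assumption, not from the thread): keep writing into `out`
# but explicitly relax the ufunc casting rule.
enc_sub = np.empty_like(arr, dtype="u1")
enc_sub[0] = arr[0]
np.subtract(arr[1:], arr[:-1], out=enc_sub[1:], casting="unsafe")

# Both variants produce the same encoded chunk for this input.
print(np.array_equal(enc_diff, enc_sub))
```

Both forms silently wrap differences that do not fit in the `astype` dtype, so neither adds a safety check beyond what `np.diff` already provided.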

bnavigator commented 3 days ago

> @bnavigator Could you also check whether it is fixed by replacing np.subtract(arr[1:], arr[0:-1], out=enc[1:]) with enc[1:] = np.diff(arr)?

Yes, that's what we do right now:

https://build.opensuse.org/projects/devel:languages:python:numeric/packages/python-numcodecs/files/numcodecs-revert-subtract-pr584.patch?expand=1

NumPy is at 2.1.3. Here is the full install and test log, succeeding with the patch applied: numcodecs_test_log.txt