zarr-developers / zarr-python

An implementation of chunked, compressed, N-dimensional arrays for Python.
https://zarr.readthedocs.io
MIT License
1.45k stars, 274 forks

Cast fill value to array's dtype #2020

Closed d-v-b closed 2 months ago

d-v-b commented 2 months ago

In v3 right now, we don't apply any parsing or validation to the fill_value attribute of an array. This PR fixes that.

In this PR, the fill_value is cast to an instance of the array's dtype, using normal numpy dtype casting semantics. I also added JSON serialization for complex dtypes, but this is not tested yet. I will add those tests as part of a later refactor of the metadata tests.
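To make the casting and serialization steps concrete, here is a minimal sketch, not the code in this PR. The helper names `cast_fill_value` and `fill_value_to_json` are hypothetical. The sketch assumes that complex fill values are represented in JSON as a `[real, imag]` pair and that non-finite floats are written as the strings "NaN", "Infinity" and "-Infinity"; that encoding is an assumption here.

```python
import json
import math

import numpy as np


def cast_fill_value(fill_value, dtype):
    """Hypothetical helper: cast fill_value to a scalar of the given dtype."""
    dtype = np.dtype(dtype)
    # Assumption: complex values may arrive as a [real, imag] pair from JSON.
    if dtype.kind == "c" and isinstance(fill_value, (list, tuple)):
        real, imag = fill_value
        fill_value = complex(float(real), float(imag))
    # Assumption: floats may arrive as strings such as "NaN" or "-Infinity".
    elif dtype.kind == "f" and isinstance(fill_value, str):
        fill_value = float(fill_value)
    # dtype.type is the numpy scalar class, e.g. np.float32 for "float32".
    return dtype.type(fill_value)


def _float_to_json(x):
    """Encode a float as a JSON number, or as a string if it is not finite."""
    x = float(x)
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    return x


def fill_value_to_json(value):
    """Hypothetical helper: turn a numpy scalar into a JSON-compatible object."""
    if isinstance(value, np.complexfloating):
        return [_float_to_json(value.real), _float_to_json(value.imag)]
    if isinstance(value, np.floating):
        return _float_to_json(value)
    if isinstance(value, np.bool_):
        return bool(value)
    if isinstance(value, np.integer):
        return int(value)
    raise TypeError(f"unsupported fill value type: {type(value)!r}")


# Usage: cast a complex fill value given as a pair, then serialize it.
fill = cast_fill_value([1, "NaN"], "complex64")
encoded = json.dumps(fill_value_to_json(fill))
```

The cast goes through the dtype's scalar type, so the stored fill value always has the same type as the array's elements.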

For v2 arrays, I am importing the v2 fill value parsing function and wrapping it in a very light wrapper. This preserves all the v2 behavior.

I also added a fill_value attribute to the Array class. Happy to remove this if it's not supposed to be there.
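As a rough illustration of what such an attribute could look like, the sketch below exposes `fill_value` as a read-only property that delegates to the array's metadata. `ToyMetadata` and `ToyArray` are toy stand-ins invented for this example, not the actual zarr-python classes, and the delegation design is an assumption.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class ToyMetadata:
    """Toy stand-in for array metadata (hypothetical, not zarr's class)."""

    dtype: np.dtype
    fill_value: object

    def __post_init__(self):
        # Normalize the dtype, then cast the fill value to a scalar of it.
        dtype = np.dtype(self.dtype)
        object.__setattr__(self, "dtype", dtype)
        object.__setattr__(self, "fill_value", dtype.type(self.fill_value))


class ToyArray:
    """Toy stand-in for an array that wraps its metadata."""

    def __init__(self, metadata: ToyMetadata):
        self.metadata = metadata

    @property
    def fill_value(self):
        # Read-only view of the already-cast value held by the metadata.
        return self.metadata.fill_value


arr = ToyArray(ToyMetadata(dtype="uint8", fill_value=0))
```

Keeping the value on the metadata object and only reading it through the array means there is a single place where casting happens.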

I put the metadata tests in a new directory structure: tests/v3/metadata/test_v2.py for v2 stuff and tests/v3/metadata/test_v3.py for v3 stuff. This keeps the v2 and v3 logic separated, and removes the need to name functions test_foo_v3 and the like. We should ultimately use an analogous layout in the library code.

Longer term we should support the raw bits datatypes defined in the zarr v3 spec, but I don't think we need that now.

TODO: