Open d-v-b opened 5 months ago
It was intentional that zarr_format not include a precise version number of the spec. Instead, it is intended that the spec defines some broad compatibility guarantees, and zarr_format only needs to be updated when we need to step outside of those guarantees.
The rationale is:
If we don't include a precise version number, then when creating an array we don't have to worry about picking a version number, and when reading an array, we can still just validate the metadata according to the actual features in use.
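A minimal sketch of the reader-side logic described above, not taken from the spec or from any Zarr implementation. The helper `validate_metadata`, the constant `SUPPORTED_ZARR_FORMAT`, and the `SUPPORTED_CODECS` set are hypothetical names, and the metadata layout is simplified to a `zarr_format` integer plus a list of named codecs. It only illustrates checking the integer for broad compatibility and then validating the features the metadata actually uses:

```python
import json

# Hypothetical: the one format number this reader understands.
SUPPORTED_ZARR_FORMAT = 3
# Hypothetical: the codec names this particular reader implements.
SUPPORTED_CODECS = {"bytes", "gzip", "transpose"}


def validate_metadata(raw: str) -> dict:
    """Accept metadata when the format number matches and every
    feature in use (here: every codec name) is one we support."""
    meta = json.loads(raw)

    fmt = meta.get("zarr_format")
    # zarr_format is an integer; bool is an int subclass in Python, so exclude it.
    if not isinstance(fmt, int) or isinstance(fmt, bool):
        raise ValueError(f"zarr_format must be an integer, got {fmt!r}")
    if fmt != SUPPORTED_ZARR_FORMAT:
        raise ValueError(f"unsupported zarr_format: {fmt}")

    # No minor version to compare: inspect the features actually declared.
    unknown = [
        codec["name"]
        for codec in meta.get("codecs", [])
        if codec["name"] not in SUPPORTED_CODECS
    ]
    if unknown:
        raise ValueError(f"unsupported codecs: {unknown}")
    return meta


if __name__ == "__main__":
    doc = '{"zarr_format": 3, "codecs": [{"name": "bytes"}]}'
    print(validate_metadata(doc)["zarr_format"])
```

Under these assumptions, a writer never has to choose a minor version, and a reader rejects an array only when it meets a feature it cannot handle.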
thanks @jbms, that's helpful. I think it would be good to write this logic into the spec. I will ping you if I submit a PR to that effect.
if I got it right, this relates to
as a formalization of those "features" used/present in any given Zarr of a "major" zarr_format version. Is my understanding correct?
I think this discussion and #262 concern different levels of abstraction.
The properties of a particular version of zarr are formalized by the relevant zarr specification. See the specification for zarr version 2, or the specification for zarr version 3. I raised this issue to discuss a particular detail about how the zarr v3 specification defines the metadata that declares which version of zarr it is.
By contrast, ZEP 4 is at a higher level of abstraction: it concerns formalizing specifications of conventions that contain zarr data. To quote that ZEP, "A Zarr implementation itself should not even be aware of the existence of the convention."
The zarr_format metadata is an integer, but the spec document uses a string identifier that can represent major and minor versions. So, unlike the spec document, the zarr_format metadata cannot ever represent a minor version. Is this a problem? It seems like skew between the spec version and zarr_format is a recipe for trouble, but I don't see how to fix this without some disruption.
cc @WardF, as this relates to some of our conversations from the community meeting the other day, and I think the netcdf perspective would be useful here.