Open TomNicholas opened 3 days ago
Is the key here a positional indexer into the array, or a chunk indexer, or something else? It's possible that a combination of Array.metadata.encode_chunk_key and store.exists will do what you want:
In [20]: arr = zarr.create(path="a", shape=(3, 4, 5), chunks=(2, 2, 2))
In [21]: arr.metadata.encode_chunk_key((0, 0, 0))
Out[21]: 'c/0/0/0'
In [22]: arr.metadata.encode_chunk_key((1, 2, 3))
Out[22]: 'c/1/2/3'
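Note that (1, 2, 3) lies outside the chunk grid of this array (shape (3, 4, 5) with chunks (2, 2, 2) gives a 2 x 2 x 3 grid), so encode_chunk_key appears to format whatever coordinates it is given without bounds checking. A minimal pure-Python sketch of enumerating only the valid keys, assuming the 'c/i/j/k' layout shown in the session above. The helper names chunk_grid_shape and iter_chunk_keys are hypothetical and are not part of zarr:

```python
import itertools
import math


def chunk_grid_shape(shape, chunks):
    """Number of chunks along each axis (the trailing chunk may be partial)."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


def iter_chunk_keys(shape, chunks, separator="/"):
    """Yield (chunk_coords, key) for every position in the chunk grid.

    Assumption: keys follow the 'c/0/0/0' pattern from the session above.
    """
    grid = chunk_grid_shape(shape, chunks)
    for coords in itertools.product(*(range(n) for n in grid)):
        yield coords, separator.join(("c", *map(str, coords)))


# Same array geometry as the session above.
keys = dict(iter_chunk_keys((3, 4, 5), (2, 2, 2)))
print(len(keys), keys[(0, 0, 0)], keys[(1, 1, 2)])
```

In real code the key string would come from arr.metadata.encode_chunk_key rather than from a hand-rolled formatter; the sketch only illustrates which coordinates are worth asking about.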
It's a chunk indexer. We have a store, and want to calculate the byte offsets and ranges for every chunk in the store. Assuming no sharding, the offset is always 0, and we get the length of each chunk using the new .getsize (because with compression they could have different lengths), but it would also be nice to know which chunks don't actually exist in the store, so we don't bother trying to get their sizes or writing them into the generated chunk manifest.
It does sound like store.exists is what we need though, thanks!
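For illustration, a rough sketch of the manifest-building step described above: skip keys that are missing from the store, and record offset 0 plus the stored length for the rest. Everything here is an assumption for the sake of the example: a plain dict of bytes stands in for the store (so a membership check replaces store.exists and len() replaces .getsize), and build_manifest and the path/offset/length entry layout are hypothetical names, not the actual VirtualiZarr or zarr API:

```python
def build_manifest(store, root, chunk_keys):
    """Map each chunk key that exists in `store` to its byte range.

    store:      dict-like {key: bytes}, a stand-in for a real zarr store
    root:       location prefix used to build the path of each chunk
    chunk_keys: iterable of keys such as 'c/0/0/0'
    """
    manifest = {}
    for key in chunk_keys:
        if key not in store:
            # Missing chunk: the array falls back to fill_value here,
            # so there is nothing to reference.
            continue
        manifest[key] = {
            "path": f"{root}/{key}",
            "offset": 0,  # no sharding assumed, so each chunk starts at 0
            "length": len(store[key]),  # compressed sizes can differ
        }
    return manifest


# Two of the three requested chunks were actually written.
fake_store = {"c/0/0/0": b"\x00" * 17, "c/0/0/1": b"\x00" * 23}
manifest = build_manifest(fake_store, "a", ["c/0/0/0", "c/0/0/1", "c/0/0/2"])
print(manifest)
```

With a real store the existence and size checks would go through store.exists and .getsize instead, which may need to be awaited depending on the store implementation.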
see also Array._iter_chunk_keys
Is there a way to ask zarr if a key is backed by a chunk (as opposed to defaulting to the fill_value)?
The motivation is trying to create virtual references for an existing zarr store, but not knowing which chunks of the chunk grid actually exist - see https://github.com/zarr-developers/VirtualiZarr/pull/271#discussion_r1844486393
cc @norlandrhagen