zarr-developers / zarr-python

An implementation of chunked, compressed, N-dimensional arrays for Python.
https://zarr.readthedocs.io
MIT License
1.53k stars 286 forks

Added Array.info_complete #2514

Open TomAugspurger opened 3 days ago

TomAugspurger commented 3 days ago

Now that Store.getsize is a thing, we can add info_complete, which includes the number of chunks written and the total number of bytes stored for them.

The current implementation unfortunately calls list_prefix twice on the same prefix: once to count the initialized chunks and once to get the bytes stored under that prefix. These two operations don't compose well, and I haven't thought of a nice way to eliminate the duplicate listing yet. We could do this naively with a single list_prefix, counting the keys as we call getsize on each one. But we also have Store.getsize_prefix for stores that have fast paths for getting the total number of bytes under a prefix. I don't really want a Store.getsize_and_count_prefix, but maybe some kind of Store.statistics_prefix? Probably not worth worrying about today.
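
For illustration, the naive single-pass approach described above could look roughly like the sketch below. This is not the code in this PR: `ToyStore` and `count_and_size_prefix` are hypothetical names, and the toy store only assumes that `list_prefix` is an async iterator over keys and that `getsize` returns the size of one key in bytes.

```python
import asyncio
from collections.abc import AsyncIterator


class ToyStore:
    """Hypothetical in-memory stand-in for a store; not zarr's Store class."""

    def __init__(self, data: dict[str, bytes]) -> None:
        self._data = data

    async def list_prefix(self, prefix: str) -> AsyncIterator[str]:
        # Yield every key that starts with the given prefix.
        for key in self._data:
            if key.startswith(prefix):
                yield key

    async def getsize(self, key: str) -> int:
        # Size in bytes of the value stored under a single key.
        return len(self._data[key])


async def count_and_size_prefix(store: ToyStore, prefix: str) -> tuple[int, int]:
    """Return (number of keys, total bytes) under prefix with one listing."""
    count = 0
    total = 0
    async for key in store.list_prefix(prefix):
        count += 1
        total += await store.getsize(key)
    return count, total


# Assumed layout for the example: chunk keys live under "arr/c/".
store = ToyStore(
    {
        "arr/c/0/0": b"abcd",
        "arr/c/0/1": b"ef",
        "arr/zarr.json": b"{}",
        "other/c/0": b"xyz",
    }
)
result = asyncio.run(count_and_size_prefix(store, "arr/c/"))
print(result)
```

The trade-off is the one noted above: this lists the prefix only once, but it issues one getsize call per key, so it gives up any fast path a store might have for summing bytes under a prefix.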