zarr-developers / zarr-specs

Zarr core protocol for storage and retrieval of N-dimensional typed arrays
https://zarr-specs.readthedocs.io/
Creative Commons Attribution 4.0 International

Data Versioning #283

Open MosGeo opened 8 months ago

MosGeo commented 8 months ago

These are early thoughts, but has anybody considered including some kind of data versioning spec in Zarr? The idea is that multiple versions of the same data could be stored, with Zarr keeping extra copies of only the chunks that changed. This would save a lot of space when the data changes only slightly between versions.

From a practical perspective, Zarr would return the latest version of a chunk unless a version is specified. In that case, it would return the most recent copy of the chunk written at or before the specified version.
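To make the lookup rule concrete, here is a rough sketch of how it might behave. Everything in it is hypothetical: `VersionedChunkStore`, `commit`, and `get` are illustrative names, not part of Zarr or any existing spec, and a plain in-memory dict stands in for real storage.

```python
from bisect import bisect_right


class VersionedChunkStore:
    """Hypothetical in-memory store; not part of Zarr or any Zarr spec."""

    def __init__(self):
        self._versions = {}  # chunk key -> ascending list of versions that wrote it
        self._data = {}      # (chunk key, version) -> chunk bytes
        self.latest = 0      # most recent committed version (0 means empty)

    def commit(self, changed_chunks):
        """Store only the changed chunks under a new version number."""
        self.latest += 1
        for key, value in changed_chunks.items():
            self._versions.setdefault(key, []).append(self.latest)
            self._data[(key, self.latest)] = value
        return self.latest

    def get(self, key, version=None):
        """Return the newest copy of a chunk written at or before `version`."""
        if version is None:
            version = self.latest
        written = self._versions.get(key, [])
        idx = bisect_right(written, version)  # count of writes <= version
        if idx == 0:
            raise KeyError(f"chunk {key!r} does not exist at version {version}")
        return self._data[(key, written[idx - 1])]


store = VersionedChunkStore()
v1 = store.commit({"0.0": b"a1", "0.1": b"b1"})  # initial write of two chunks
v2 = store.commit({"0.1": b"b2"})                # only chunk 0.1 changes
```

Here `store.get("0.1")` resolves to the copy written in `v2`, `store.get("0.1", version=v1)` resolves to the original, and chunk `0.0` is stored only once even though it is readable at both versions.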

jhamman commented 8 months ago

Hi @MosGeo - there is a relevant discussion in https://github.com/zarr-developers/zarr-specs/issues/154.

TL;DR: there are a few efforts in this space, including our product Arraylake (versioning docs) and TensorStore/OCDBT (docs).