Open jhamman opened 2 weeks ago
In my experience, the root of the zip is one of the trickiest parts for data creators (and I assume implementers) to get right, e.g.,
cc: @DennisHeimbigner
How useful is a ZipStore in practice? Are there a lot of use cases for it? Given how limited it is (no rename/deletion, etc.), I am wondering if it's worth having a spec for it.
I have support equivalent to ZipStore in NCZarr in the netcdf-c library. I agree that it does not appear to be very useful, but the basic idea behind it is reasonable: a single file containing a complete Zarr file tree, with the component files compressed to save space. Personally, I think that using a single-file file system (SFFS) with added compression makes more sense. There are several implementations available, and it is easy enough to write your own.
> In my experience, the root of the zip is one of the trickiest parts for data creators (and I assume implementers) to get right...
@joshmoore - do you have suggestions for the spec document that would make this clearer?
@zoj613 and @DennisHeimbigner - let's try to avoid making this about alternatives to the ZIP store concept. There are practical reasons to add this (Zarr-Python has long supported a ZIP store interface).
Remember, Zarr can support many storage backends. If there are alternatives to experiment with, let's do that in a separate issue.
@DennisHeimbigner - I would like to get your feedback on the spec as written. Is it aligned with your netcdf-c implementation?
> @joshmoore - do you have suggestions for the spec document that would make this clearer?
Some thoughts that I have revolving in my head:
for the format, the most important item I know of is "don't include the top-level directory" (though I have run into some complaints about that from various repositories, since the behavior differs between implementations, e.g. on Windows)
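To illustrate the "don't include the top-level directory" point, here is a minimal sketch using only Python's standard `zipfile` module. The function name `zip_store_from_directory` and the `example.zarr` layout are hypothetical, and the use of `ZIP_STORED` (no ZIP-level compression) is an assumption made here on the grounds that chunks are often already compressed; it is not taken from the draft spec.

```python
import zipfile
from pathlib import Path


def zip_store_from_directory(store_dir, zip_path):
    """Pack a directory store so archive names are relative to the store root.

    Hypothetical helper: the top-level directory name (e.g. "example.zarr")
    is deliberately NOT used as a prefix for the entries.
    """
    store_dir = Path(store_dir)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for path in sorted(store_dir.rglob("*")):
            if path.is_file():
                # arcname is computed relative to the store root, so the
                # root metadata file ends up as "zarr.json", not
                # "example.zarr/zarr.json".
                zf.write(path, arcname=path.relative_to(store_dir).as_posix())
    return zip_path
```

Zipping the parent directory instead (for example, right-clicking the folder in a file manager) typically produces entries prefixed with the folder name, which is the behavior difference mentioned above.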
I think I have always used either Linux zip or Cygwin zip to create Zarr zip files. What native Windows program could I use to create a pure Windows zip file? As for the top-level directory, I think it is better to always include it. I say this so that my rule holds, namely: 1. unzipping a zip store creates a directory tree usable by the Zarr directory tree storage manager.
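The two layouts under discussion (entries at the archive root versus entries under a single top-level directory) can be told apart programmatically. The following is a minimal sketch, assuming a v3 store whose root holds a `zarr.json` file; `has_wrapping_directory` is a hypothetical helper and not part of any Zarr library.

```python
import zipfile


def has_wrapping_directory(zip_path, root_marker="zarr.json"):
    """Return True if every entry sits under one top-level directory.

    Assumption: a store zipped "at the root" has `root_marker` as a
    top-level entry; a store zipped with its directory does not.
    """
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    if root_marker in names:
        # Root metadata is directly at the archive root: no wrapper.
        return False
    # Collect the first path component of every entry.
    top_level = {name.split("/", 1)[0] for name in names}
    return len(top_level) == 1 and any("/" in name for name in names)
```

An implementation that wants to accept both layouts could use a check like this to decide which prefix to strip before resolving keys, though whether readers should be that lenient is exactly the question the spec needs to settle.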
This is a working draft of the v3 ZIP file store specification.
xref: