glencoesoftware / bioformats2raw

Bio-Formats image file format to raw format converter
GNU General Public License v2.0

ZipStore as an output option #164

Open prete opened 2 years ago

prete commented 2 years ago

Type of request

Feature request

Request

Is it possible to add a flag like --zip-store-output to have the output be a zip file (ZipStore) instead of a folder?

Rationale

Having thousands of tiny files can strain HPC storage systems. Bundling the output structure into a ZipStore could prevent that.

Having a single zip file would reduce the file count and could also improve copying/moving times between filesystems. I've checked, and it seems that jzarr supports ZipStore.

A current workaround is to run zip -0 -r some.zarr.zip some.zarr/ but that happens after the conversion, so it doesn't prevent the large file count, and it also doubles the disk space required.
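
For reference, a minimal Python sketch of that post-conversion workaround, using only the standard library. The function name zip_store_directory is hypothetical. Unlike the zip command above, which keeps the some.zarr/ prefix on every entry, this sketch writes entry names relative to the Zarr root. That is an assumption about what a ZipStore reader expects (keys such as .zgroup at the top level of the archive) and should be checked against the reader in use. It has the same drawbacks as the command: the small files still exist first, and the data is on disk twice until the folder is removed.

```python
import os
import zipfile


def zip_store_directory(src_dir, zip_path):
    """Bundle every file under src_dir into an uncompressed zip archive.

    Entry names are relative to src_dir (assumption: the reader expects
    Zarr keys at the top level of the archive). Returns the number of
    files written.
    """
    count = 0
    # ZIP_STORED means no compression, the equivalent of `zip -0`.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                rel_path = os.path.relpath(full_path, src_dir)
                # Zip entry names always use forward slashes.
                zf.write(full_path, arcname=rel_path.replace(os.sep, "/"))
                count += 1
    return count
```

Calling zip_store_directory("some.zarr", "some.zarr.zip") would produce the archive next to the folder. The output path should sit outside the source folder so the archive does not try to include itself.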

joshmoore commented 2 years ago

Similar to https://github.com/ome/ngff/issues/135, the NGFF spec is currently overly opinionated about which Zarr store gets used. If the bf2raw2ometiff stack wants to move forward here, there are two options: either have a flag like --zip-store-output produce a non-NGFF intermediary, or work on removing the restrictions in v0.5. @prete, knowing that you want to pipe the output to other NGFF-compatible tools, I'd suggest the latter.