Closed: LDeakin closed this 3 weeks ago
Really cool!
Agreed! I'll compare the outputs (and hopefully the speeds) against a couple of other implementations, hopefully starting later this week.
```
Reencode 4496763-v2.zarr/0 to 4496763-v3-reencode
read: ~152.78ms @ 3.60GB/s
write: ~648.93ms @ 0.77GB/s
total: 801.71ms
size: 550.78MB to 502.71MB (838.86MB uncompressed)
```
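A quick sanity check on the figures above. It appears (my assumption, not stated in the output) that read throughput is computed over the compressed input size and write throughput over the compressed output size, rather than the uncompressed size:

```python
# Sanity-check the reported throughputs against the reported sizes/times.
# Assumption: read uses the compressed input size (550.78 MB) and write
# uses the compressed output size (502.71 MB), not the 838.86 MB uncompressed.
compressed_in_mb = 550.78   # input array size
compressed_out_mb = 502.71  # output array size
read_s = 0.15278            # ~152.78 ms
write_s = 0.64893           # ~648.93 ms

read_gb_s = compressed_in_mb / 1000 / read_s
write_gb_s = compressed_out_mb / 1000 / write_s
print(f"read: {read_gb_s:.2f} GB/s, write: {write_gb_s:.2f} GB/s")
# close to the reported ~3.60 GB/s read and ~0.77 GB/s write
```

Using the uncompressed size for the read would instead give ~5.5 GB/s, which is why the compressed-size interpretation seems right.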
:+1: (Trying to get an apples-to-apples comparison with, say, tensorstore now)
Closing this. For anyone interested in trying this out, see the `--output-script` option.
I've developed a few CLI tools in `zarrs_tools` (crates.io / GitHub) that may be useful to some for this challenge. These tools have not been extensively tested with real-world Zarr V2 data, since we don't use Zarr V2 in my lab.
- `zarrs_reencode`: Convert a V3-compatible subset of Zarr V2 arrays to Zarr V3. Many parameters are available for the output array encoding.
- `zarrs_info`: Output the V3-equivalent metadata of a V2 array if it is compatible with only a metadata change. It works for groups too. It takes care of things like the `typesize` in `blosc` codec metadata.
- `zarrs_ome`: Generate multiscale arrays for visualisation.
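For context on the `typesize` point: Zarr V2 Blosc metadata omits the type size because it can be inferred from the array's `dtype`, while the V3 `blosc` codec carries it explicitly in its configuration. An abbreviated, illustrative sketch of the two forms (the field values here are hypothetical, not taken from the arrays above):

```json
{
  "zarr_format": 2,
  "dtype": "<u2",
  "compressor": { "id": "blosc", "cname": "zstd", "clevel": 5, "shuffle": 1 }
}
```

```json
{
  "zarr_format": 3,
  "data_type": "uint16",
  "codecs": [
    { "name": "bytes", "configuration": { "endian": "little" } },
    {
      "name": "blosc",
      "configuration": { "cname": "zstd", "clevel": 5, "shuffle": "shuffle", "typesize": 2 }
    }
  ]
}
```

So a metadata-only V2-to-V3 conversion has to derive `typesize` from the data type, which is the sort of detail these tools handle.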
Since last year, my lab has used `zarrs_ome` to create sharded Zarr V3 multiscale images ($\gtrapprox 3000^3$) for visualisation in neuroglancer.
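For a sense of why sharding and multiscale pyramids matter at that scale, a rough size estimate for a single $3000^3$ volume (the 16-bit dtype is my assumption for illustration, not stated above):

```python
# Rough uncompressed size of a 3000^3 volume, assuming 16-bit voxels
# (hypothetical dtype; the actual data type is not stated in the comment).
voxels = 3000 ** 3           # 27 billion voxels
size_gb = voxels * 2 / 1e9   # 2 bytes per uint16 voxel
print(size_gb)               # -> 54.0 (GB, uncompressed, per scale level)
```

Tens of gigabytes per full-resolution level is far too large to load whole, hence chunked/sharded storage and downsampled levels for interactive viewers like neuroglancer.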