Open jakirkham opened 6 years ago
cc @aschampion (in case it is of interest)
Sounds like a valuable addition. It looks like there are both zlib-like and gzip-like compress functions, so I guess we'd want two codecs, zopfli-zlib and zopfli-gzip. The zopfli Python package looks like it would be straightforward to wrap.
Thanks for bringing this to my attention. Looks like there's a rust implementation, so when I have some time I should be able to benchmark it for n5 as I did for brotli.
At least from the rust N5 side, having a non-symmetric compression (i.e., should compress with zopfli but can decompress with normal gzip) could be a bit of an implementation pain. I'm not sure if that's easier in the zarr world. Java N5 already supports arbitrary compression schemes, so may already be able to handle this situation.
Maybe it could be a config option on the existing codecs as opposed to new codecs?
Even if it isn't installed, one could just warn and fall back to zlib or gzip. At worst one simply gets less compressed data (it is still readable).
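A minimal sketch of the "config option with fallback" idea, not a proposal for the actual codec API. It assumes the `zopfli` package from PyPI and its `zopfli.zlib.compress` / `zopfli.gzip.compress` functions; the `compress`/`decompress` helpers and the `fmt` and `use_zopfli` parameters are hypothetical names used only for illustration.

```python
import gzip
import warnings
import zlib

try:
    # Assumption: the `zopfli` package from PyPI, which provides
    # zopfli.zlib.compress and zopfli.gzip.compress.
    import zopfli.gzip
    import zopfli.zlib
    HAVE_ZOPFLI = True
except ImportError:
    HAVE_ZOPFLI = False


def compress(data, fmt="zlib", use_zopfli=True):
    """Compress `data` as a zlib or gzip stream, preferring Zopfli.

    Hypothetical helper: `fmt` and `use_zopfli` stand in for whatever
    config option an existing codec might grow.
    """
    if fmt not in ("zlib", "gzip"):
        raise ValueError("fmt must be 'zlib' or 'gzip'")

    if use_zopfli and not HAVE_ZOPFLI:
        # Warn and fall back: the output is larger but still readable.
        warnings.warn("zopfli is not installed; falling back to " + fmt)
        use_zopfli = False

    if use_zopfli:
        module = zopfli.zlib if fmt == "zlib" else zopfli.gzip
        return module.compress(data)

    if fmt == "zlib":
        return zlib.compress(data, 9)
    return gzip.compress(data, compresslevel=9)


def decompress(data, fmt="zlib"):
    """Decompress with the standard library only; Zopfli is never needed here."""
    if fmt == "zlib":
        return zlib.decompress(data)
    return gzip.decompress(data)
```

Either way the reader side is unchanged: `decompress(compress(chunk, fmt="gzip"), fmt="gzip")` goes through plain `gzip.decompress` whether or not Zopfli produced the bytes, which is what would make an option on the existing codecs workable.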
The Zopfli compression algorithm from Google is a zlib-style compression algorithm that is able to make a small but notable improvement in the compression ratio that zlib might otherwise achieve. The catch is that compression ends up being quite a bit slower. However, once compressed, the data can be decompressed using standard zlib-style algorithms with practically unchanged decompression speed.
Generally this can be useful in cases where the data must be decompressible via zlib or gzip, compression generally happens once, and minimizing size is of paramount importance. For example, serving data used in webpages. So this could be a nice option for use cases where people are interacting with tiles of image data in the web browser (via n5-wasm), for instance.
Note: There are a few Python implementations. One we might use is zopfli, which is on PyPI and conda-forge.