Open JackKelly opened 2 years ago
I remember that using something like gzip made the files smaller, but then they took longer to load: http://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html I'm not sure what the right balance is here.
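For reference, a minimal sketch of how per-variable compression can be requested via `to_netcdf`'s `encoding` argument. `zlib` and `complevel` are the netCDF4 backend's encoding keys; the variable names here are hypothetical placeholders, not our actual NWP channel names:

```python
# Sketch: per-variable compression settings for xarray.Dataset.to_netcdf.
# "zlib" and "complevel" are netCDF4 encoding keys; the variable names
# ("t2m", "dswrf") are made-up placeholders.
encoding = {var: {"zlib": True, "complevel": 4} for var in ["t2m", "dswrf"]}

# ds.to_netcdf("batch.nc", encoding=encoding)  # needs a real Dataset `ds`
print(encoding["t2m"])
```

Higher `complevel` (1-9) trades write time for smaller files, which is exactly the balance in question.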
Also, from some general searching (not sure how useful this is): https://www.unidata.ucar.edu/blogs/developer/entry/netcdf_compression
yeah, I think the only way to tell is to do a bunch of experiments
tbh I wouldn't worry about better on-disk compression for v16. The compression we have now is fine, IMHO.
Detailed Description
For example, pbzip2 reduces our NWP batches to 20% of their original size. Hopefully we can achieve similar reductions using "proper" NetCDF compression algorithms. Smaller batches should be faster to load, and easier to upload to public cloud / Lancium / etc.
Related issues
#61
#280
#498
Also, if we do find better compression, then we should probably use that better compression for our intermediate zarrs, too.
Not urgent.