Open lindstro opened 1 month ago
Too bad you didn't find this, https://github.com/LLNL/H5Z-ZFP/issues/143
I guess I had already forgotten, as I did participate in that thread. :-) But #143 deals only with how to support compression programmatically; it says nothing about how to use the CLI tools, which I suspect is the more common use case. For instance, how often do you call `zlib` vs. compress a file using `gzip`?
It would be nice to have documentation on how to use H5Z-ZFP in the context of netCDF-4. I've spent the last couple of days struggling with making this work, and I believe some additional documentation could save a lot of people grief.
First, the netCDF filter documentation mentions that HDF5 filters can indeed be used, e.g., from command-line tools like `nccopy` with the `-F` switch (similar to but not the same as the `h5repack` `-f` switch). There are a few things the documentation does not mention, however:

It seems that you cannot `nccopy` a file that has already been compressed, say, using zlib, to another compressed format. `nccopy` will silently ignore such requests and not use the requested compression filter. You first have to use `-F none` to copy the file to a temporary intermediate uncompressed file. And `ncdump -h` will not tell you whether or not the file has been compressed. For that, you also need the `-s` switch, e.g., `ncdump -hs file.nc`.
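
The two-step workaround described above can be scripted. The following is a minimal sketch that only builds the command lines and does not run them. The helper name `recompress_commands` and the file names are hypothetical, and the filter spec is passed through unchanged.

```python
from typing import List


def recompress_commands(src: str, tmp: str, dst: str, filter_spec: str) -> List[List[str]]:
    """Build the nccopy/ncdump invocations for the two-step workaround.

    All names are illustrative; nothing is executed here.
    """
    return [
        # Step 1: copy to an uncompressed intermediate file.
        ["nccopy", "-F", "none", src, tmp],
        # Step 2: apply the requested filter to the uncompressed copy.
        ["nccopy", "-F", filter_spec, tmp, dst],
        # Step 3: inspect the header; -s is needed to see whether it is compressed.
        ["ncdump", "-hs", dst],
    ]


if __name__ == "__main__":
    for cmd in recompress_commands("in.nc", "tmp.nc", "out.nc", "<filter spec>"):
        print(" ".join(cmd))
```

Each inner list could be handed to `subprocess.run(cmd, check=True)` once the filter spec has been filled in.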

The netCDF filter parameters are similar to yet distinct from how they're fed to `h5repack`. As a concrete example, suppose we want to use H5Z-ZFP in fixed-accuracy mode with a tolerance of 1.0. To `h5repack`, this is specified as the filter ID, a 0, the number of `cd_values`, and then the `cd_values` themselves. With `nccopy`, you don't need the 0 following the filter ID, nor do you specify the number of `cd_values`. Rather, you have to tell `nccopy` the name of the variable (dataset) you want to apply the filter to. You can also specify `*` for varname to apply compression to all variables, though I believe H5Z-ZFP will fail on certain types, e.g., chars. After the filter ID, you specify only the actual `cd_values` given to `h5repack`.
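
As a sketch of how the two argument forms relate for the fixed-accuracy example, the snippet below splits the tolerance into two 32-bit words and formats both strings. Several details are assumptions not stated in this issue and should be checked against the output of `print_h5repack_farg`: that the H5Z-ZFP filter ID is 32013, that accuracy mode is encoded as 3, that the `cd_values` for this mode are the mode, an unused 0, and then the double as two little-endian 32-bit words, and that `h5repack` expects the `UD=` prefix. The variable name `varname` is a placeholder.

```python
import struct

ZFP_FILTER_ID = 32013      # assumed H5Z-ZFP filter ID
ZFP_MODE_ACCURACY = 3      # assumed mode code for fixed-accuracy mode


def accuracy_cd_values(tolerance: float) -> list:
    """Return the assumed cd_values for fixed-accuracy mode (little-endian layout)."""
    lo, hi = struct.unpack("<II", struct.pack("<d", tolerance))
    return [ZFP_MODE_ACCURACY, 0, lo, hi]


def h5repack_arg(cd_values: list) -> str:
    """Filter ID, a 0, the number of cd_values, then the cd_values."""
    fields = [ZFP_FILTER_ID, 0, len(cd_values)] + cd_values
    return "UD=" + ",".join(str(v) for v in fields)


def nccopy_arg(varname: str, cd_values: list) -> str:
    """Variable name, filter ID, then only the cd_values."""
    fields = [varname, ZFP_FILTER_ID] + cd_values
    return ",".join(str(v) for v in fields)


if __name__ == "__main__":
    cd = accuracy_cd_values(1.0)
    print("h5repack -f", h5repack_arg(cd))         # UD=32013,0,4,3,0,0,1072693248
    print("nccopy -F", nccopy_arg("varname", cd))  # varname,32013,3,0,0,1072693248
```

The only difference between the two strings is the leading part: `h5repack` gets the extra 0 and the count, while `nccopy` gets the variable name.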

One nice thing about `nccopy` is that it understands how to do type punning. The tolerance in the above example could also be specified as `1.0d`, which is interpreted as a double-precision number. This works fine on little-endian machines; my reading of the netCDF documentation is that this would not work correctly on a big-endian machine, but who has one of those these days?

Perhaps a short section "Using H5Z-ZFP Plugin with nccopy" can be added to the documentation? Maybe even a separate netCDF tool like `print_h5repack_farg` can be provided, or have `print_h5repack_farg` print both `h5repack` and `nccopy` arguments.
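
To make the byte-order caveat concrete, here is a small sketch showing how the same double splits into two 32-bit words under each byte order. It only illustrates why a hand-computed integer pair depends on byte order; it makes no claim about what `nccopy` itself does on big-endian hardware. The helper names are hypothetical.

```python
import struct


def double_as_words(value: float, byte_order: str) -> tuple:
    """Split a double into two 32-bit unsigned words.

    byte_order is a struct prefix: "<" for little-endian, ">" for big-endian.
    """
    return struct.unpack(byte_order + "II", struct.pack(byte_order + "d", value))


def words_as_double(words: tuple, byte_order: str) -> float:
    """Inverse of double_as_words: reassemble the double from its two words."""
    return struct.unpack(byte_order + "d", struct.pack(byte_order + "II", *words))[0]


if __name__ == "__main__":
    little = double_as_words(1.0, "<")
    big = double_as_words(1.0, ">")
    print("little-endian words:", little)  # (0, 1072693248)
    print("big-endian words:   ", big)     # (1072693248, 0)
    # The little-endian pair read with big-endian word order is no longer 1.0.
    print(words_as_double(little, ">") == 1.0)  # False
```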