Unidata / thredds

THREDDS Data Server v4.6
https://www.unidata.ucar.edu/software/tds/v4.6/index.html

H5iosp cannot read "shuffle=0 and fletcher32" file #460

Closed cwardgar closed 8 years ago

cwardgar commented 8 years ago

From IAO-533024:

Full Name: Ansley Manke
Email Address: ansley.b.manke@noaa.gov
Organization: NOAA Pacific Marine Environmental Laboratory
Package Version: 4.4
Operating System: Linux
Hardware:
Description of problem:

This is not new to netCDF 4.4.

I tried the settings in the example on this page,

https://www.unidata.ucar.edu/software/netcdf/netcdf-4/newdocs/netcdf-f77/NF_005fDEF_005fVAR_005fDEFLATE.html

the settings for chunking, compression and fletcher32 checksum are:

      retval = NF_DEF_VAR_CHUNKING(ncid, varid, NF_CHUNKED, chunks)

C     Turn on deflate compression, fletcher32 checksum.
      retval = NF_DEF_VAR_deflate(ncid, varid, 0, 1, 4)

      retval = NF_DEF_VAR_FLETCHER32(ncid, varid, NF_FLETCHER32)

In the call to NF_DEF_VAR_deflate, the third argument is SHUFFLE, set to 0. A Ferret user reported to me that with this combination of settings, the resulting file has data that can't be read using the Java netCDF libraries. The following error is returned:

java.lang.RuntimeException: Unknown filter type=0

I experimented with the test program under the netCDF C distribution examples/C/simple_nc4_wr.c. I find that if I add a call to nc_def_var_fletcher32(), then I see this behavior as well, only when fletcher32 is set, and shuffle = 0.

I don't have much of a way to test java netcdf, but the Panoply app uses java netcdf and returns the error.

A version of simple_nc4_wr.c called simple_xy_nc4_wr_with_fletcher32.c is attached; the file it writes can't be read with java netcdf, but ncdump has no trouble with it.

cwardgar commented 8 years ago

I have no idea if the bug is in netcdf-c or netcdf-java. A file written by netcdf-c with what appears to be a valid set of parameters cannot be read using netcdf-java. My takeaway is that, in general, I don't want to enable the fletcher32 checksum setting.

Is this just something that should be documented?

cwardgar commented 8 years ago

If I do not use the fletcher32 call, then everything is fine. It's only the combination of shuffle=0 and fletcher32 checksum set that results in the unreadable files. For deflated files:

shuffle=1, fletcher not set --> ok
shuffle=1, fletcher set --> ok
shuffle=0, fletcher not set --> ok
shuffle=0, fletcher set --> fails
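
For reference, the four cases above can be related to the filter identifiers a reader should find in the file. The sketch below is only an illustration: the HDF5 identifiers (1 = deflate, 2 = shuffle, 3 = fletcher32) are the registered values, but the class and method names are hypothetical, and the IDs are listed in numeric order rather than in the order netcdf-c writes them. None of the expected IDs is 0, which suggests the reader is misinterpreting the filter description in the failing case rather than meeting an unsupported filter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of netcdf-java.
class FilterMatrixSketch {
    // Filter IDs a reader should see for a deflated variable.
    // Listed by numeric ID, not in on-disk pipeline order.
    static List<Integer> expectedFilterIds(boolean shuffle, boolean fletcher32) {
        List<Integer> ids = new ArrayList<>();
        ids.add(1);                  // deflate is enabled in all four cases
        if (shuffle) ids.add(2);     // shuffle
        if (fletcher32) ids.add(3);  // fletcher32 checksum
        return ids;
    }

    // Outcome reported in this thread for netcdf-java before the fix.
    static String reported(boolean shuffle, boolean fletcher32) {
        return (!shuffle && fletcher32) ? "fails" : "ok";
    }

    public static void main(String[] args) {
        boolean[] flags = {true, false};
        for (boolean shuffle : flags) {
            for (boolean fletcher : flags) {
                System.out.println("shuffle=" + (shuffle ? 1 : 0)
                        + ", fletcher " + (fletcher ? "set" : "not set")
                        + ", filter IDs " + expectedFilterIds(shuffle, fletcher)
                        + " --> " + reported(shuffle, fletcher));
            }
        }
    }
}
```

The failing row corresponds to filter IDs [1, 3], that is, deflate followed by the checksum with no shuffle filter in between.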

cwardgar commented 8 years ago

User attached a C program that generates a dataset that demonstrates the failure: simple_xy_nc4_wr_with_fletcher32.c

cc simple_xy_nc4_wr_with_fletcher32.c -lnetcdf -o simple_xy_nc4.exe
./simple_xy_nc4.exe

The result: simple_xy_nc4.nc

ncdump simple_xy_nc4.nc

Prints the expected values, no problem. However, trying to dump the values in ToolsUI causes:

java.lang.RuntimeException: Unknown filter type=0
    at ucar.nc2.iosp.hdf5.H5tiledLayoutBB$DataChunk.getByteBuffer(H5tiledLayoutBB.java:200)
    at ucar.nc2.iosp.LayoutBBTiled.hasNext(LayoutBBTiled.java:128)
    at ucar.nc2.iosp.hdf5.H5tiledLayoutBB.hasNext(H5tiledLayoutBB.java:126)
    at ucar.nc2.iosp.IospHelper.readData(IospHelper.java:335)
    at ucar.nc2.iosp.IospHelper.readDataFill(IospHelper.java:291)
    at ucar.nc2.iosp.hdf5.H5iosp.readData(H5iosp.java:169)
    at ucar.nc2.iosp.hdf5.H5iosp.readData(H5iosp.java:144)
    at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2038)
    at ucar.nc2.Variable.reallyRead(Variable.java:837)
    at ucar.nc2.Variable._read(Variable.java:808)
    at ucar.nc2.Variable.read(Variable.java:686)
    at ucar.nc2.NCdumpW.print(NCdumpW.java:340)
    at ucar.nc2.NCdumpW.print(NCdumpW.java:242)
    at ucar.nc2.ui.ToolsUI$NCdumpPanel.run(ToolsUI.java:1736)
    at ucar.nc2.ui.ToolsUI$GetDataTask.run(ToolsUI.java:5408)
    at java.lang.Thread.run(Thread.java:745)
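
To show where a message like this can come from, here is a minimal sketch of a chunk decoder that undoes a filter pipeline by dispatching on the filter ID. It is not the netcdf-java source: the class and method names are hypothetical, the checksum is dropped rather than verified, and the 4 trailing zero bytes in main are a placeholder for a real Fletcher32 value. Only the filter IDs (1 = deflate, 2 = shuffle, 3 = fletcher32) and the exception text are taken from HDF5 and from the trace above.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch, not the H5tiledLayoutBB implementation.
class FilterPipelineSketch {
    static final int DEFLATE = 1, SHUFFLE = 2, FLETCHER32 = 3; // HDF5 filter IDs

    // Undo the filters of one chunk, starting with the filter applied last.
    static byte[] decodeChunk(byte[] raw, int[] filterIds, int elemSize) {
        byte[] data = raw;
        for (int i = filterIds.length - 1; i >= 0; i--) {
            switch (filterIds[i]) {
                case DEFLATE:    data = inflate(data); break;
                case SHUFFLE:    data = unshuffle(data, elemSize); break;
                // Simplification: strip the 4-byte checksum without verifying it.
                case FLETCHER32: data = Arrays.copyOf(data, data.length - 4); break;
                default: throw new RuntimeException("Unknown filter type=" + filterIds[i]);
            }
        }
        return data;
    }

    static byte[] inflate(byte[] in) {
        Inflater inflater = new Inflater();
        inflater.setInput(in);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new RuntimeException("truncated or invalid deflate stream");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new RuntimeException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Used only to build test input for the sketch.
    static byte[] deflate(byte[] in) {
        Deflater deflater = new Deflater();
        deflater.setInput(in);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Shuffled layout stores byte 0 of every element, then byte 1, and so on.
    static byte[] unshuffle(byte[] in, int elemSize) {
        int n = in.length / elemSize;
        byte[] out = Arrays.copyOf(in, in.length); // keeps any leftover tail bytes
        for (int b = 0; b < elemSize; b++) {
            for (int e = 0; e < n; e++) {
                out[e * elemSize + b] = in[b * n + e];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] values = new byte[64];
        for (int i = 0; i < values.length; i++) values[i] = (byte) i;
        byte[] deflated = deflate(values);
        byte[] chunk = Arrays.copyOf(deflated, deflated.length + 4); // placeholder checksum

        // shuffle=0 with fletcher32: the pipeline holds deflate, then the checksum.
        System.out.println(Arrays.equals(values, decodeChunk(chunk, new int[] {DEFLATE, FLETCHER32}, 4)));

        // If the reader misparses the first filter ID as 0, the dispatch fails.
        try {
            decodeChunk(chunk, new int[] {0, FLETCHER32}, 4);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the data itself is fine in both calls; the second call fails only because the filter ID handed to the dispatcher is wrong, which would be consistent with ncdump reading the same file without trouble.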

JohnLCaron commented 8 years ago

Commit 95cba30f953e8e93d6ae7080d7c5fe3e0158556c should fix this.

cwardgar commented 8 years ago

Yep, looks good. Fixed on 5.0 in #476 and on 4.6 in #477.