ksharonin / kerchunkC

_FillValue to FillValue Message Support #9

Closed ksharonin closed 6 months ago

ksharonin commented 7 months ago

NetCDF Documentation:

ksharonin commented 7 months ago

Note that h5coro reads out a Fill Value Message:

----------------
Fill Value Message [1]: 0x706
----------------
Fill Flags:                                                      2B
Fill Value Size:                                                 4
Fill Value:                                                      0xC1100000

ksharonin commented 7 months ago

NOTE: If the _FillValue attribute is present, NetCDF sets the HDF5 fill value message via the API function: https://github.com/Unidata/netcdf-c/blob/5b79304c7f5cd6fb46be39adf0a4a3534f04deba/libhdf5/nc4hdf.c#L852

NetCDF documentation on _FillValue attribute and practices: https://www.unidata.ucar.edu/software/netcdf/workshops/2011/bestpractices/MissingData.html

API documentation: https://docs.hdfgroup.org/archive/support/HDF5/doc/RM/RM_H5P.html#Property-SetFillValue

ksharonin commented 7 months ago

"The way I do it in the Python version of the code is to maintain a metadata cache of all the locations of the attributes as the code traverses the file; then on subsequent requests to read the attributes, those requests can be fulfilled very quickly. So to read a variable, the code has this notion of whether or not to "exit early" - meaning do I exit traversing the internal structure of the file as soon as I find the variable I am looking for, or do I continue traversing the file. So when needing to read all of the attributes, the "exit early" is set to false, and that entire level of the internal structure is read (and locations cached). I then build a list of all of the attributes at that level, and make a parallel request to read them all. But for your case, it looks like you may need to read them as you go and keep track of things like _FillValue. It is going to be tricky, and is definitely something we can walk through together and brainstorm ideas."
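A rough sketch of the "exit early" idea described in the quote, not the actual h5coro code: the class, its method names, and the toy table of entry names and offsets are all hypothetical stand-ins for one level of the file's internal structure. Every entry visited during a traversal has its location cached, so later lookups (for example for _FillValue) can skip the traversal.

```python
from typing import Dict, Optional

# Hypothetical stand-in for one level of the file: entry name -> byte offset.
LEVEL: Dict[str, int] = {
    "_FillValue": 0x100,
    "lat": 0x200,
    "units": 0x300,
    "lon": 0x400,
}


class LevelReader:
    """Traverses one level and caches the location of everything it visits."""

    def __init__(self, entries: Dict[str, int]) -> None:
        self.entries = entries
        self.cache: Dict[str, int] = {}  # name -> offset, filled while traversing
        self.visited = 0                 # number of entries actually walked

    def find(self, target: str, exit_early: bool = True) -> Optional[int]:
        # Serve from the cache when an earlier traversal already saw the target.
        if target in self.cache:
            return self.cache[target]
        for name, offset in self.entries.items():
            self.visited += 1
            self.cache[name] = offset    # remember every location passed on the way
            if exit_early and name == target:
                break                    # stop as soon as the target is found
        return self.cache.get(target)


reader = LevelReader(LEVEL)
lat_offset = reader.find("lat")                    # stops after two entries
lon_offset = reader.find("lon", exit_early=False)  # walks and caches the whole level
units_offset = reader.find("units")                # answered from the cache
```

With exit_early set to False the whole level ends up cached, which matches the "read all of the attributes" case in the quote; the parallel read of the cached locations is left out of this sketch.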

ksharonin commented 7 months ago

The value seems to be properly present; it just needs a casting correction (unless that is applied at a later stage via the dtype).
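For illustration, one possible form of the casting correction, assuming the variable is a 4-byte IEEE 754 float and that 0xC1100000 is the numeric value of those 4 bytes as printed in the readout above. The helper name is hypothetical. The point is to reinterpret the bits rather than convert the integer numerically.

```python
import struct


def fill_value_from_raw(raw: int, size: int = 4) -> float:
    """Reinterpret an unsigned integer's bits as an IEEE 754 float.

    Assumes size is 4 (float32) or 8 (float64); other sizes raise KeyError.
    """
    int_fmt, float_fmt = {4: ("<I", "<f"), 8: ("<Q", "<d")}[size]
    # Pack the integer into bytes, then read the same bytes back as a float.
    return struct.unpack(float_fmt, struct.pack(int_fmt, raw))[0]


raw = 0xC1100000
fill = fill_value_from_raw(raw)  # bit reinterpretation
wrong = float(raw)               # numeric conversion, not the intended value
print(fill)                      # -9.0
```

If the dataset's dtype is not float32, the same raw bytes would need to be interpreted according to that dtype instead.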