Unidata / netcdf-java

The Unidata netcdf-java library
https://docs.unidata.ucar.edu/netcdf-java/current/userguide/index.html
BSD 3-Clause "New" or "Revised" License
143 stars 68 forks source link

Unpacked values from scaled/offset dataset are all NaN. #1312

Closed rschmunk closed 4 months ago

rschmunk commented 4 months ago

Versions impacted by the bug

Any build of version 5.5.4 from the past month.

What went wrong?

A user wrote that a plot of his data was coming up blank when using a just-downloaded Panoply. A quick check revealed that Panoply was seeing all NaN values even though an ncdump showed good values. Some further testing demonstrated that this was also occurring with a copy of Panoply from a month ago, but that a copy from the very end of December 2023 was plotting as expected. In all three cases, the Panoply release was using an up-to-date build of the netcdf-Java 5.x-maint snapshot.

The dataset packs the data as shorts and specifies a scale value and an offset. Panoply acquires the dataset in enhanced mode and accepts whatever values netCDF-Java says are there.

If I modify Panoply to acquire the dataset in NOT enhanced mode, then netCDF-Java reports the unpacked short values and a plot is successfully made. If I build a copy of Panoply using a copy of netcdfAll that I built on December 28 and tell it to acquire the dataset in enhanced mode, the unpacked data values are correctly reported and a plot is successfully made.

Looking at the commit history on the 5.x-maint branch, it looks like there were some modifications to unpacking data made about January 12, which seems a likely point where the problem is occurring. Nothing I noticed in the changes then jumps out at me as the obvious cause, but I haven't dug very deep yet.

Not sure if it's at all meaningful, but the scale and offset values are reeeaally small.

A copy of the example dataset that revealed the issue can be downloaded from Dropbox. It's a CF-1.6 dataset with lon-lat data on a relatively small array covering the re4gion around Mexico City.

Relevant stack trace

No response

Relevant log messages

No response

If you have an example file that you can share, please attach it to this issue.

If so, may we include it in our test datasets to help ensure the bug does not return once fixed? Note: the test datasets are publicly accessible without restriction.

Yes

Code of Conduct

haileyajohnson commented 4 months ago

@rschmunk just to clarify, the place where Panoply is getting incorrect data is from a call to NetcdfDatasets.openDataset(filename) with enhanced=all?

rschmunk commented 4 months ago

Actually

NetcdfFile njfile = NetcdfFiles.open (filePathStr);
NetccdfDataset njdataset = NetcdfDatasets.enhance (njfile, ENHANCEMENT, null);

where ENHANCEMENT is the default enhancement plus allowing for an "incomplete" coordinate system.

If you're wondering why I acquire the dataset in two steps, it's actually because there's also a step where I acquire an unenhanced version in order to see the original CDL description.

rschmunk commented 4 months ago

@haileyajohnson, I tracked this down, and it has to do with the changes made to the ConvertMissing filter on January 12. I'm not sure which commit in particular was the coup de grace, as there were about 6 of them that day.

TL;DR - the unpacked data values from the sample dataset are all much smaller than Misc.defaultMaxRelativeDiffFloat.

Long version...

As noted above, the scale and offset values in the sample dataset are really small. Tiny even. The scale factor is of order 1E-12 and the offset of order 1E-7, and after unpacking, the output data results are of order 1E-8.

The dataset also specifies that the _FillValue is the same as the missing_value. Unpacked, the fill/missing value is ~ 0 (actually, also of order 1E-12).

The problem then is that ConvertMissing.isMissing receives an unpacked data value, checks if it is less than Misc.defaultMaxRelativeDiffFloat away from the unpacked fill value. Since that default difference is of order 1E-5, then of course the method is returning true. Every value in the array is being deemed invalid and the convert method sends back NaN for the entire array.

So it seems like a smaller difference value needs to be used in the test when dealing with such tiny scale factors and offsets.

haileyajohnson commented 4 months ago

Thanks for tracking that down! We did change the fuzzyequals comparison to use a relative diff instead of absolute diff, and that's what broke this case - I'm planning on reverting that change. But another issue I think is that we need to do the missing and fill value comparisons with unpacked data, so that the fuzyequals implementation doesn't matter. I should have a fix up shortly. Thanks for all your help!

rschmunk commented 4 months ago

@haileyajohnson, I thought about trying to hack up a test of modifying ConvertMissing to allow for specifying a more flexible relative difference in its fuzzy-equals test. But I took a second look at how many parameters there already are in the constructor and said, "Nah."

rschmunk commented 4 months ago

I downloaded a fresh snapshot and built it into Panoply. It plots the sample data as expected, so it looks like #1315 was what was needed.