Error constructing GRIB2 var names

Unidata / thredds

THREDDS Data Server v4.6

https://www.unidata.ucar.edu/software/tds/v4.6/index.html


Error constructing GRIB2 var names #854

Open rschmunk opened 7 years ago

rschmunk commented 7 years ago

I've heard from some NOAA people about trouble using Panoply to read some GRIB2 data, but IDV also has the problem. So assuming that the datasets are okay (and I'm assured that they are), then the problem would seem to lie in the NJ code for reading/constructing GRIB2 metadata.

The issue is that variable IDs and variable names in some datasets are reported with the wrong accumulation or averaging period, e.g., the var names are shown with "XXX_36_Hour_Accumulation_XXX" when they should be "XXX_6_Hour_Accumulation_XXX".

Sample datasets are at http://nomads.ncep.noaa.gov/pub/data/nccf/com/gefs_legacy/prod/gefs_legacy.20170606/00/pgrb2a/

An example that shows the var names as 36-hour is geavg.t00z.pgrb2af258

However, a dataset that shows them as the expected 6-hour is geavg.t00z.pgrb2af252

lesserwhirls commented 7 years ago

Greetings @msdsoftware,

The key difference between the two files is the time unit encoded into the GRIB record. In file geavg.t00z.pgrb2af252 (with anticipated behavior), octet 18 of the PDS for Total Precipitation indicates:

18: Indicator of unit of time range == 1 (table 4.4: Hour)

whereas in file geavg.t00z.pgrb2af258 (unanticipated behavior), octet 18 for the same grid indicates:

18: Indicator of unit of time range == 11 (table 4.4: 6 hours)

In both cases, an interval length of 6 is computed from the forecast time (octet 19) and the end of the time interval (octets 37-43). Once the time unit is applied, however, the two files diverge: 6 x 1 hour gives 6 hours, while 6 x 6 hours gives 36 hours.
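The arithmetic behind the two names can be sketched roughly as follows. This is only an illustration of the numbers in this thread, not the NJ code: the class and method names are hypothetical, and the unit lookup covers just the two table 4.4 codes mentioned above.

```java
// Hypothetical illustration; not the netCDF-Java implementation.
final class IntervalNameSketch {

    // Hours per unit for the two GRIB2 table 4.4 codes seen in this thread.
    static int hoursPerUnit(int unitCode) {
        switch (unitCode) {
            case 1:
                return 1;   // Hour
            case 11:
                return 6;   // 6 hours
            default:
                throw new IllegalArgumentException(
                        "unit code not covered by this sketch: " + unitCode);
        }
    }

    // Scales the computed interval length by the time unit and builds
    // the fragment of the variable name that carries the period.
    static String accumulationName(int intervalLength, int unitCode) {
        int hours = intervalLength * hoursPerUnit(unitCode);
        return hours + "_Hour_Accumulation";
    }

    public static void main(String[] args) {
        System.out.println(accumulationName(6, 1));   // f252: unit code 1
        System.out.println(accumulationName(6, 11));  // f258: unit code 11
    }
}
```

With an interval length of 6 in both files, unit code 1 yields "6_Hour_Accumulation" and unit code 11 yields "36_Hour_Accumulation", which matches what Panoply and IDV display.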

rschmunk commented 7 years ago

So do you think this constitutes a bug in the NJ code? Or if not necessarily a bug, then something that NJ might catch with a sanity test and try to "correct"?

I forwarded your comments to the NOAA people -- they said thanks much for the analysis. They are going to try to figure out where/why in the processing chain some files are generated with octet 18 value of 11. The theory has been advanced that someone fat-fingered an 11 rather than a 1 when writing some code.

rschmunk commented 7 years ago

I received some follow-up from NOAA about the datasets that cause the unexpected behavior:

It seems as though the unidata library in question may need to calculate that "I" value differently for forecast hours over 255. The information that the developer provided regarding the octets in the PDS section are below:

As far as I can tell, the GRIB1 PDS time specifications are correct:
Octet:  18 (time unit)  19 (P1)    20 (P2)    21 (time range type)
f252:   01 (1 hour)     f6 (246)   fc (252)   04 (average)
f258:   0b (6 hours)    2a (42)    2b (43)    04 (average)
                        42*6=252   43*6=258
In GRIB1, the time unit changes at this point because octet 19 is unable to contain a number larger than 255.
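For reference, the octets quoted above can be decoded as in the sketch below. The class and method names are made up for illustration, and only the two time unit codes that appear in these files are handled.

```java
// Hypothetical illustration of decoding GRIB1 PDS octets 18-20.
final class Grib1TimeRangeSketch {

    // Hours per unit for the two time unit codes quoted in this thread.
    static int hoursPerUnit(int unitOctet) {
        switch (unitOctet) {
            case 0x01:
                return 1;   // 1 hour
            case 0x0b:
                return 6;   // 6 hours
            default:
                throw new IllegalArgumentException(
                        "unit not covered by this sketch: " + unitOctet);
        }
    }

    // Returns {start, end} of the time range in hours after the reference time.
    static int[] rangeInHours(int unitOctet, int p1, int p2) {
        int scale = hoursPerUnit(unitOctet);
        return new int[] { p1 * scale, p2 * scale };
    }

    public static void main(String[] args) {
        int[] f252 = rangeInHours(0x01, 0xf6, 0xfc);
        int[] f258 = rangeInHours(0x0b, 0x2a, 0x2b);
        System.out.println("f252: " + f252[0] + " to " + f252[1] + " hours");
        System.out.println("f258: " + f258[0] + " to " + f258[1] + " hours");
    }
}
```

Both records decode to a 6-hour window (246 to 252 hours and 252 to 258 hours). Each of P1 and P2 is a single octet, so neither can hold a value above 255, which is why the coarser unit is needed to express hour 258.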

lesserwhirls commented 7 years ago

@msdsoftware - thanks for following up with NOAA! This is quite helpful to know in general.

The code that creates the interval name can be found here. It looks to me like the multiplication by getTimeUnitScale() isn't correct, as the time interval object already takes that time unit into account.

I've made a PR (Unidata/thredds#861) to see if removing this breaks anything in our test suite, but I think it should be ok.

lesserwhirls commented 7 years ago

Not that simple. The scaling by getTimeUnitScale() is needed for GRIB1, but the time interval isn't calculated correctly from the GRIB2 record. We have a "comment hint" here:

int startOffset = timeUnit.getOffset(refDate, start);   // LOOK wrong - not dealing with value

Here, "value" means the multiplier of the time unit (i.e., the 6 in "6 hours", in this case). Because this is done at the ucar.nc2.grib level, rather than at the ucar.nc2.grib.grib1 or ucar.nc2.grib.grib2 level, a fix here that uses the unit's value when creating the time interval may end up breaking the GRIB1 code. This looks like it might take quite a bit of time to untangle.
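To make the "not dealing with value" hint concrete, here is a rough sketch using java.time rather than the NJ classes. The names are hypothetical, and this is an assumption about what a fix would need to do, not the actual change: it contrasts an offset counted in plain hours with one counted in multiples of the full unit (6 hours).

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration; does not use or mirror the ucar.nc2.grib API.
final class UnitValueOffsetSketch {

    // Offset counted in the unit's field only (hours), ignoring its value.
    static long offsetIgnoringValue(Instant refDate, Instant t) {
        return Duration.between(refDate, t).toHours();
    }

    // Offset counted in multiples of the full unit, e.g. 6-hour steps.
    static long offsetInUnits(Instant refDate, Instant t, int unitValueHours) {
        return Duration.between(refDate, t).toHours() / unitValueHours;
    }

    // Interval length in hours after the later scaling by the unit's value.
    static long scaledLength(long startOffset, long endOffset, int unitValueHours) {
        return (endOffset - startOffset) * unitValueHours;
    }

    public static void main(String[] args) {
        Instant ref = Instant.parse("2017-06-06T00:00:00Z");
        Instant start = ref.plus(Duration.ofHours(252));
        Instant end = ref.plus(Duration.ofHours(258));
        int unitValue = 6; // the "6" in "6 hours"

        long wrong = scaledLength(offsetIgnoringValue(ref, start),
                                  offsetIgnoringValue(ref, end), unitValue);
        long right = scaledLength(offsetInUnits(ref, start, unitValue),
                                  offsetInUnits(ref, end, unitValue), unitValue);
        System.out.println("ignoring value: " + wrong + " hours");
        System.out.println("using value:    " + right + " hours");
    }
}
```

For the f258 record, ignoring the value gives offsets of 252 and 258, so the later scaling by 6 produces 36 hours. Counting in 6-hour units gives offsets of 42 and 43, and the same scaling produces the expected 6 hours.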