Unidata / netcdf-java

The Unidata netcdf-java library
https://docs.unidata.ucar.edu/netcdf-java/current/userguide/index.html
BSD 3-Clause "New" or "Revised" License
150 stars 71 forks source link

Thin grid grib2 file parsing #993

Open siggemannen opened 2 years ago

siggemannen commented 2 years ago

Hello. I have a problem parsing some grib2 files with "thin grids". I think these files are generated from cdo grib1=>grib2 but not sure. The problem is that this assertion is failing:

https://github.com/Unidata/netcdf-java/blob/a742a9e1be396530c528581a929dce6a8fbbff7c/grib/src/main/java/ucar/nc2/grib/grib2/Grib2Gds.java#L237-L241

In my files the octet value is 2. But it doesn't seem that this assertion is needed because if i remove it, the file is parsed properly anyway.

I can also parse the same file using C# Grib.API.

So the question is if this assertion is really needed? I'm not a grib expert though :)

I'm using on Windows 10, java 17 (but same problem seems to exist in all java versions) and the following version:

<dependency>
  <groupId>edu.ucar</groupId>
  <artifactId>grib</artifactId>
  <version>4.5.5</version>
</dependency>

Attaching zipped file containing the grib file i want to import ECMWF_CAMS_total_column_nitric_acid2019010100.zip

shahramn commented 2 years ago

All GRIB2 data that I know off ( with a reduced grid ) have this value set to 2

JohnLCaron commented 2 years ago
  1. Removing this test, the grid seems to parse ok.

  2. So remove the test so you can see the record in ToolsUI. Open in IOSP/Grib2/Grib2Collection. context menu on the middle table to "Show complete Grib Record".

  3. (3.40) Grid definition template 3.40 - Gaussian latitude/longitude 
    1:                                                                                 GDS length == 1096 
    5:                                                                                    Section == 3 
    6:                                             Source of Grid Definition (see code table 3.0) == 0 (table 3.0: Specified in Code table 3.1) 
    7:                                                                      Number of data points == 348528 
    11:                                             Number of octects for optional list of numbers == 2 
    12:                                                          Interpretation of list of numbers == 0 (table 3.11: There is no appended list) 

    ...

  4. table 3.11 is in IOSP/Grib/WMO codes. Context menu "Show Table" on 3.11

    Code Table Code table 3.11 - Interpretation of list of numbers at end of section 3 (3.11)
    0: There is no appended list
    1: Numbers define number of points corresponding to full coordinate circles (i.e. parallels), coordinate values on each circle are multiple of the circle mesh, and extreme coordinate values given in grid definition (i.e. extreme longitudes) may not be reached in all rows
    2: Numbers define number of points corresponding to coordinate lines delimited by extreme coordinate values given in grid definition (i.e. extreme longitudes) which are present in each row
    3: Numbers define the actual latitudes for each row in the grid. The list of numbers are integer values of the valid latitudes in microdegrees (scaled by 10-6) or in unit equal to the ratio of the basic angle and the subdivisions number for each row, in the same order as specified in the "scanning mode flag" (bit no. 2)
    255: Missing
  5. So even though "There is no appended list", if you skip the test and allow Grib2Gds.readNptsInLine() to continue, we can read numPts short values for the parallels, and things seem to work.

Conclusions:

  1. Likely miscoded, should be reported to ECMWF. They may have some conventions that they havent told us about. If well defined, we could hack in ECMWF specific code. OTOH, what does it mean for there to be "no appended list"? Probably it means "we know what to do in our code, hahahah".
  2. We think we only know how to parse octet12 == 1, which means "Numbers define number of points corresponding to full coordinate circles (i.e. parallels)", and thats what the code seems to be doing.
  3. If @shahramn has examples of octet12 == 2, we should get those and figure out what they mean and code for them if possible.
shahramn commented 2 years ago

My mistake. Apologies. I was talking about octet 11 i.e. "number Of Octets For Number Of Points". So for octet 12 i.e. "interpretation Of Number Of Points" (which is Code Table 3.11): its value should NOT be zero! All the ecCodes samples files for reduced grids have this set to 1. So this does look like an error during encoding.

I think it is best to report this to ECMWF ( https://confluence.ecmwf.int/site/support )

JohnLCaron commented 2 years ago

Hi @shahramn, thanks for reporting. That makes sense to me.

@siggemannen: You say "In my files the octet value is 2." Is it possible you meant octet 11?

Note that in our code octets are 1 based, to follow GRIB documentation conventions, because it was a Fortran world back then, I guess.

JohnLCaron commented 2 years ago

Seems possible its a cdo error. Perhaps check there also.

I imagine correctly translating GRIB1 to GRIB2 in a generic way is difficult.

Maybe we could accept octet12 == 0 to really mean 1, if we in fact there are numPts monotonic? values at the end of section 3. Right now we are just throwing an Exception anyway.

siggemannen commented 2 years ago

@JohnLCaron - When i debug this line: if (octet12 != 1)

The value in octet12 was 2. Or at least that's how i remember it. 😄 I will take a look and compare the grib1 file vs grib2 to see if it's so that CDO mangles these octets. Will get back to you!