NOAA-EMC / NCEPLIBS-g2c

This library contains C decoder/encoder routines for GRIB edition 2.

What is the readiness status of the 2.0 API? #439

Closed keltonhalbert closed 9 months ago

keltonhalbert commented 9 months ago

I'm interested in using g2c to handle GRIB2 IO for some new projects I'm working on at SPC, and came across this library. It would be great to use a library supported by NCEP/EMC, but I'd like to avoid having to write all the file handlers/IO in order to extract the GRIB2 messages for querying, especially if there's already positive effort existing on this front.

I see that there's been a lot of work with the g2c API, but it seems like this library isn't as up to date as its FORTRAN counterpart, and a v1.8.0 release doesn't appear to be forthcoming from the outside. How safe/stable would it be to go ahead and start working with the g2c 2.0 API from the develop branch?

edwardhartnett commented 9 months ago

This is indeed an active project! I had to switch to the Fortran libraries to fix some critical bugs, including problems with files > 2 GB.

The g2c API is how the g2c library will move forward, and the fortran library will become a wrapper around the C library.

The API functions may change in minor ways. The existing code is reasonably well-tested. I intend to add a CMake option which turns the g2c API off by default, but allows it to be built. Then there will be a g2c release.

For the g2c-2.0.0 release the new g2c API will be stable and well-tested. I hope this will happen sometime in 2024.

keltonhalbert commented 9 months ago

Thanks for the fast response @edwardhartnett!

I'm glad to hear the existing code is well tested and should only have minor API changes. That's something I can live with! I was poking around the unit tests and got the impression that it should be relatively safe to use, but I wanted to double check and make sure before making it a dependency. My use case will be fairly simple - read grib2 files, and compute a bunch of stuff... API changes there should hopefully not impact things severely.

Again, much appreciated on clarifying these things. Feel free to close this issue and mark it as resolved, unless you have any reason to keep it open.

P.S. - thanks for making this work with Spack, makes life a lot easier! :)

edwardhartnett commented 9 months ago

Let me know how this works for you. Spack work is done by @AlexanderRichert-NOAA and indeed makes life easier.

keltonhalbert commented 9 months ago

@edwardhartnett Well, I'm going to use this as an opportunity to piggyback. If this just ends up not being functional, I'll hold off and tackle it further down the road.

I'm just trying to read a simple RAP grib2 file, but g2c_inq_dim isn't returning a nonzero length, and the dimname array is empty. I'm effectively running the tst_mrms.c file on the linked grib2 file. If I download the gdaswave file from the data directory, g2c_inq_dim works fine.

However, g2c_degrib2 appears to parse the file fine enough. Here's an excerpt from the first 2 messages...

GRIB MESSAGE  1  starts at 1

  SECTION 0:  0 2 37801
  SECTION 1:  7 0 2 1 1 2023 10 4 0 0 0 0 1
  Contains  0  Local Sections  and  1  data fields.

  FIELD  1
  SECTION 0:  0 2
  SECTION 1:  7 0 2 1 1 2023 10 4 0 0 0 0 1
  SECTION 3:  0 67725 0 0 30
  GRID TEMPLATE 3. 30 :  6 0 0 0 0 0 0 301 225 16281000 233862000 56 25000000 265000000 20318000 20318000 0 64 25000000 25000000 -90000000 0
  NO Optional List Defining Number of Data Points.
  PRODUCT TEMPLATE 4. 0: ( PARAMETER = REFC     0 16 196 )  16 196 2 0 105 0 0 1 0 10 0 0 255 0 0
  FIELD: REFC     Entire Atmosphere valid  0 hour after 2023100400:00:00
  NO Optional Vertical Coordinate List.
  Num. of Data Points =  67725     NO BIT-MAP 
  DRS TEMPLATE 5. 40 :  -1054867456 -4 0 10 0 0 255
  Data Values:
  Num. of Data Points =  67725   Num. of Data Undefined = 0
( PARM= REFC ) :  MIN=             -10.00000000 AVE=              -4.93052864 MAX=              47.81250000

 GRIB MESSAGE  2  starts at 37802

  SECTION 0:  0 2 50031
  SECTION 1:  7 0 2 1 1 2023 10 4 0 0 0 0 1
  Contains  0  Local Sections  and  1  data fields.

  FIELD  1
  SECTION 0:  0 2
  SECTION 1:  7 0 2 1 1 2023 10 4 0 0 0 0 1
  SECTION 3:  0 67725 0 0 30
  GRID TEMPLATE 3. 30 :  6 0 0 0 0 0 0 301 225 16281000 233862000 56 25000000 265000000 20318000 20318000 0 64 25000000 25000000 -90000000 0
  NO Optional List Defining Number of Data Points.
  PRODUCT TEMPLATE 4. 0: ( PARAMETER = VIS      0 19 0 )  19 0 2 0 105 0 0 1 0 1 0 0 255 0 0
  FIELD: VIS      Surface valid  0 hour after 2023100400:00:00
  NO Optional Vertical Coordinate List.
  Num. of Data Points =  67725     NO BIT-MAP 
  DRS TEMPLATE 5. 40 :  0 0 -2 10 0 0 255
  Data Values:
  Num. of Data Points =  67725   Num. of Data Undefined = 0
( PARM= VIS ) :  MIN=               0.00000000 AVE=           26052.08398438 MAX=           90000.00000000

I will note, however, that the terminal output gets garbled, as if there's out-of-bounds memory access. The repeating string of 23 does not occur when reading the GDAS file.

2323232323232323inq_dim returned 0
Number of messages: 294
len0 0 dimname0 

And the output when working with the GDAS file...

inq_dim returned 0
Number of messages: 19
len0 151 dimname0 Latitude

I'm not 100% sure it's an illegal memory access issue, and I'm having an issue with my valgrind install saying it's seeing illegal instructions, so can't provide any concrete evidence as of yet. Any ideas on what could be going on?

keltonhalbert commented 9 months ago

The code I'm using, in case it ends up being necessary:

#include <stdio.h>
#include <stdlib.h>
#include <grib2.h>

//#define G2_FILE "/users/khalbert/Downloads/rap_252_20210415_0000_000.grb2"
//#define G2_FILE "/users/khalbert/Downloads/ruc2anl_130_20081031_0800_001.grb2"
//#define G2_FILE "/users/khalbert/Downloads/rap.t00z.awp252pgrbf00.grib2"
#define G2_FILE "/users/khalbert/Downloads/gdaswave.t00z.wcoast.0p16.f000.grib2"
#define LAT_LEN 1000 
#define LON_LEN 1000

int main(int argc, char **argv) {

    int g2cid;
    int num_msg;
    size_t len0;
    size_t len1;
    char dimname0[G2C_MAX_NAME + 1]; 
    char dimname1[G2C_MAX_NAME + 1]; 

    float lat[LAT_LEN];
    float lon[LON_LEN];

    g2c_set_log_level(10);
    if (g2c_open(G2_FILE, 0, &g2cid))
        return G2C_ERROR;

    if (g2c_inq(g2cid, &num_msg))
        return G2C_ERROR;

    int ret = g2c_inq_dim(g2cid, 0, 0, 0, &len0, dimname0, lat);
    if (ret)
        return G2C_ERROR;

    g2c_degrib2(g2cid, "./test_degrib2.txt");

    printf("inq_dim returned %d\n", ret);
    printf("Number of messages: %d\n", num_msg);
    printf("len0 %zu dimname0 %s\n", len0, dimname0); /* %zu is the conversion for size_t */
    //printf("lat[0] = %f lat[%ld] = %f\n", lat[0], len0 - 1, lat[len0 - 1]);

    g2c_close(g2cid);

    return 0;
}
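
One thing worth guarding against in a test program like the one above: when g2c_inq_dim leaves the length at 0, an expression such as lat[len0 - 1] (as in the commented-out printf) wraps around to a huge index because size_t is unsigned. A minimal sketch of a guard follows. The helper name print_dim_summary is hypothetical and is not part of g2c; it only uses the C standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of g2c): print a short summary of one
 * dimension. Returns 0 on success, 1 if the dimension is empty. */
int print_dim_summary(const char *name, size_t len, const float *val)
{
    assert(name != NULL && (len == 0 || val != NULL));

    if (len == 0) {
        /* len - 1 would wrap around to SIZE_MAX, so never index with it. */
        printf("dimension '%s' has length 0; nothing to print\n", name);
        return 1;
    }

    /* %zu is the portable printf conversion for size_t. */
    printf("%s: len %zu first %f last %f\n", name, len, val[0], val[len - 1]);
    return 0;
}
```

In the program above, this would be called as print_dim_summary(dimname0, len0, lat) after g2c_inq_dim returns.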

keltonhalbert commented 9 months ago

Aha! I got to the bottom of it.

It appears that the flag in determine_dims only works for grid_definition 0, a lat lon grid, while RAP data is in grid_definition 30, a Lambert Conformal grid.

I'm by no means anywhere near a grib2 expert, but if you have a high-level idea of how you want to handle grid definitions in determine_dims, I'd be happy to put in some effort here and make a pull request. If you'd rather leave it be for now though, that's fine too.
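
To make the idea concrete, here is a rough sketch of dispatching on the grid definition template number to recover the grid size. This is an assumption about one possible direction, not the library's determine_dims: the function name sketch_grid_dims and the macro names are hypothetical. The index positions follow the GRIB2 template layouts, where entries 7 and 8 hold Ni/Nj for template 3.0 and Nx/Ny for template 3.30, which agrees with the degrib2 output above.

```c
#include <assert.h>
#include <stddef.h>

/* Grid definition template numbers (GRIB2 Code Table 3.1). */
#define TMPL_LATLON 0             /* Template 3.0: latitude/longitude */
#define TMPL_LAMBERT_CONFORMAL 30 /* Template 3.30: Lambert conformal */

/* Hypothetical helper, NOT the library's determine_dims(): read the
 * number of points along each axis from the grid template values.
 * Returns 0 on success, 1 for a template this sketch does not handle. */
int sketch_grid_dims(int tmpl_num, const long long *tmpl, size_t *nx, size_t *ny)
{
    assert(tmpl != NULL && nx != NULL && ny != NULL);

    switch (tmpl_num) {
    case TMPL_LATLON:            /* entries 7 and 8 hold Ni and Nj */
    case TMPL_LAMBERT_CONFORMAL: /* entries 7 and 8 hold Nx and Ny */
        *nx = (size_t)tmpl[7];
        *ny = (size_t)tmpl[8];
        return 0;
    default:
        return 1;                /* other templates are left unhandled here */
    }
}
```

With the template 3.30 values printed for the RAP file, this gives 301 and 225, and 301 * 225 = 67725 matches the reported number of data points. The sketch only covers the lengths. On a Lambert conformal grid latitude and longitude vary along both axes, so filling a 1-D coordinate array the way the lat/lon case does would need a separate design decision.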

edwardhartnett commented 9 months ago

I would love help developing handling of other grids!

I am about to do a release of g2c. The new g2c_ functions will not be built by default until the 2.0.0 release. Until then, they are considered experimental and can be built with the BUILD_G2C=ON option to cmake.

edwardhartnett commented 9 months ago

I'm about to do a release. I will close this issue.