ufs-community / UFS_UTILS

Utilities for the NCEP models.

Update chgres_cube to process RRFS GRIB2 data #660

Open GeorgeGayno-NOAA opened 2 years ago

GeorgeGayno-NOAA commented 2 years ago

This data uses a GRIB2 GDT template number of "1", which is not recognized by chgres. Other code updates may also be required.
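One way to confirm which grid definition template a file uses, without going through chgres, is to read it straight out of the first GRIB2 message. The sketch below is only an illustration: it assumes the standard GRIB2 layout (a 16-octet Section 0, then sections that each start with a 4-octet length and a 1-octet section number, with the template number in octets 13-14 of Section 3). The function name `grid_template_number` is made up for this example and is not part of chgres_cube or the G2 library.

```python
import struct


def grid_template_number(message: bytes) -> int:
    """Return the grid definition template number of one GRIB2 message."""
    if message[:4] != b"GRIB" or message[7] != 2:
        raise ValueError("not a GRIB edition 2 message")
    pos = 16  # Section 0 (indicator section) is always 16 octets long
    while pos + 5 <= len(message):
        if message[pos:pos + 4] == b"7777":  # Section 8: end of message
            break
        length, number = struct.unpack(">IB", message[pos:pos + 5])
        if number == 3:
            # Octets 13-14 of Section 3 hold the template number
            return struct.unpack(">H", message[pos + 12:pos + 14])[0]
        if length < 5:
            raise ValueError("corrupt section length")
        pos += length
    raise ValueError("no grid definition section found")
```

For the RRFS data described above this should return 1, which is the WMO rotated latitude/longitude template.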

GeorgeGayno-NOAA commented 2 years ago

Test case on Hera: /scratch1/NCEPDEV/da/George.Gayno/noscrub/eric.rrfs

Work on hold until G2 library issue is addressed - https://github.com/NOAA-EMC/NCEPLIBS-g2/issues/36

kgerheiser commented 2 years ago

Thanks for the test case

GeorgeGayno-NOAA commented 2 years ago

I added a print of the number of records read by the inventory loop. See 7185ed5. The test case only reads 389 records out of more than 1000. It stops exactly at the 2GB mark.

GeorgeGayno-NOAA commented 1 year ago

@LarissaReames-NOAA the branch I worked from is a year old. (Fortunately, I did not delete it.) Merging from 'develop' might result in tons of conflicts. I don't know if it would be better to start with a new branch, then manually add the updates from my old branch.

https://github.com/GeorgeGayno-NOAA/UFS_UTILS/tree/feature/chgres_rrfs

Let me try a merge.

GeorgeGayno-NOAA commented 1 year ago

The merge was easier than I thought. I had only changed model_grid.F90. The RRFS data is rotated lat/lon, but uses the official WMO GDT. The official template requires unrotated corner points. Our gdswzd routine requires rotated corner points. So I added logic to do the rotation. I then stopped work because I hit the 2GB file limit.
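For readers unfamiliar with the corner-point rotation mentioned above, here is a rough sketch of the idea. It is not the logic added to model_grid.F90. It assumes a convention in which the grid's center point maps to rotated (0, 0), and the names `to_rotated` and `to_geographic` are hypothetical; the convention that gdswzd expects may differ in sign or longitude origin.

```python
import math


def to_rotated(lat, lon, center_lat, center_lon):
    """Geographic degrees -> rotated degrees; the center maps to (0, 0)."""
    phi = math.radians(lat)
    lam = math.radians(lon - center_lon)
    phi0 = math.radians(center_lat)
    # Unit vector, with longitude measured from the central meridian
    x = math.cos(phi) * math.cos(lam)
    y = math.cos(phi) * math.sin(lam)
    z = math.sin(phi)
    # Rotate about the y-axis so the center lands on the rotated equator
    xr = math.cos(phi0) * x + math.sin(phi0) * z
    zr = -math.sin(phi0) * x + math.cos(phi0) * z
    zr = max(-1.0, min(1.0, zr))  # guard asin against rounding
    return math.degrees(math.asin(zr)), math.degrees(math.atan2(y, xr))


def to_geographic(rlat, rlon, center_lat, center_lon):
    """Inverse of to_rotated: rotated degrees -> geographic degrees."""
    phi = math.radians(rlat)
    lam = math.radians(rlon)
    phi0 = math.radians(center_lat)
    xr = math.cos(phi) * math.cos(lam)
    y = math.cos(phi) * math.sin(lam)
    zr = math.sin(phi)
    # Apply the transpose (inverse) of the rotation used in to_rotated
    x = math.cos(phi0) * xr - math.sin(phi0) * zr
    z = math.sin(phi0) * xr + math.cos(phi0) * zr
    z = max(-1.0, min(1.0, z))
    return math.degrees(math.asin(z)), center_lon + math.degrees(math.atan2(y, x))
```

A corner point given in geographic coordinates would be passed through `to_rotated` before being handed to a routine that works in the rotated frame. The round trip through `to_geographic` is a cheap sanity check on the convention.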

LarissaReames-NOAA commented 1 year ago

Sounds good. I'll start from your branch and use this Issue # in my commits.

LarissaReames-NOAA commented 1 year ago

@GeorgeGayno-NOAA I'm running into an issue with G2 with the RRFS_A natlev and prslev files that I think may be related to the size of the file. It gives me Error 99, "Request not found.", when attempting to read surface pressure. I've checked with both grib_dump and ncl_filedump: surface pressure is indeed in the file and appears to have the same grib2 parameters provided to getgb2 near line 2636 in atm_input_data.F90. I also compared that entry from grib_dump with a HRRR grib2 file that we use for the regression tests, and all of the relevant entries are identical. I've also looked more closely at the output, and in both cases (nat and pres) it's only able to parse the upper ~15 vertical levels, which suggests to me that it's not generating a complete index of the file internally.

Update: I tried the f000 CONUS file that's available on AWS and that works fine. It's <1GB in total size.

GeorgeGayno-NOAA commented 1 year ago

What version of the G2 library are you using?

LarissaReames-NOAA commented 1 year ago

I'm working on Jet using the standard module file, so it loads 3.4.5

GeorgeGayno-NOAA commented 1 year ago

Ok. That explains it. The G2 library fix was just tagged yesterday: https://github.com/NOAA-EMC/NCEPLIBS-g2/releases/tag/v3.4.7

The libraries team said they would install it 'soon'. You can try to clone and compile it in your own space, then adjust the ufs_utils build module to point to it.

LarissaReames-NOAA commented 1 year ago

I installed 3.4.7 locally and double-checked that it was correctly linked during install, but the problem remains. Parsing the file quits at 13 hybrid levels. Do I have to do something special to use the new large file capability? That wasn't clear in the release notes.

Also, if it helps at all, I'm getting several messages in the chgres_cube output that look like: SAGT 0 0 5

From looking at the g2 code this supposedly indicates that it's finding an unknown grib section, but it doesn't seem to be a fatal error?

@edwardhartnett Could you provide any advice here?

GeorgeGayno-NOAA commented 1 year ago

@LarissaReames-NOAA do those odd messages begin at a certain record number and does that record correspond to the 2GB point?

LarissaReames-NOAA commented 1 year ago

The first place they appear is in model_grid when the grib2 file is first opened and the first record is checked for the grid template definition:

 - OPEN AND READ INPUT DATA GRIB2 FILE: /lfs4/NAGAPE/hpc-wof1/lreames/chgres_cube/reg_tests/input_data/rrfs.grib2/rrfs.t00z.natlev.f000.grib2
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5
 SAGT 0 0 5

The second time is in atm_input_data.F90 during the check for the product definition number:

 - READ ATMOS DATA FROM GRIB2 FILE: /lfs4/NAGAPE/hpc-wof1/lreames/chgres_cube/reg_tests/input_data/rrfs.grib2/rrfs.t00z.natlev.f000.grib2
 SAGT 0 0 5

So it doesn't actually show up in the loop over all entries to count hybrid levels. It looks like it's showing up when jpdtn=-1, aka when entries with any grid definition template number are searched. I'm not really sure if it has much bearing on the actual issue.

A total of 296 records are read in atm_input_data.F90. How can I know if that's the 2GB point? It's certainly close to the number of entries Kyle was encountering when his read was ending at 2GB.
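One rough way to locate the 2GB point independently of the G2 library is to walk the file message by message, using the total length stored in octets 9-16 of each GRIB2 indicator section. The sketch below assumes messages are packed back to back with no padding between them; the function name `first_message_past` is invented for this illustration. Note that it counts GRIB2 messages, which may not match the record count printed by chgres_cube if a message holds more than one field.

```python
import struct

TWO_GIB = 2**31  # byte offset at which a signed 32-bit integer overflows


def first_message_past(path, limit=TWO_GIB):
    """Return (message_number, start_offset) of the first GRIB2 message
    that ends beyond `limit` bytes, or None if no message does."""
    with open(path, "rb") as f:
        offset = 0
        number = 0
        while True:
            f.seek(offset)
            header = f.read(16)  # Section 0 is 16 octets
            if len(header) < 16:
                return None  # reached end of file
            if header[:4] != b"GRIB":
                raise ValueError(f"no GRIB message at offset {offset}")
            total = struct.unpack(">Q", header[8:16])[0]
            if total < 16:
                raise ValueError(f"bad message length at offset {offset}")
            number += 1
            if offset + total > limit:
                return number, offset
            offset += total
```

If the message number returned for the natlev file is close to the count at which the read stops, that would support the 2GB explanation.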

GeorgeGayno-NOAA commented 1 year ago

I did my own independent test using G2 v3.4.7 (https://github.com/NOAA-EMC/NCEPLIBS-g2/releases/tag/v3.4.7) and got the same chgres_cube error:

max, min U   -24.9200000762939        49.4799995422363
 max, min V   -43.0999984741211        58.0000000000000
 - CALL FieldScatter FOR INPUT U-WIND.
 - CALL FieldScatter FOR INPUT V-WIND.
 - READ SURFACE PRESSURE.
 - FATAL ERROR: READING SURFACE PRESSURE RECORD.
 - IOSTAT IS:           99

Will contact the library team.

edwardhartnett commented 1 year ago

Is this a new error with 3.4.7? Or is this something that never worked?

GeorgeGayno-NOAA commented 1 year ago

Some programs, such as chgres_cube, use routine getgb2 to read grib data.

https://github.com/NOAA-EMC/NCEPLIBS-g2/blob/develop/src/getgb2.F90#L47

Based on the value of the LUBI argument, that routine will either read an existing index file, create the index file, or force a regeneration of the index file.

LarissaReames-NOAA commented 1 year ago

In other words, it's never worked for files > 2GB. However, RRFS North American files are the first we've dealt with that are that large, so we've been able to get by without that capability until now.

edwardhartnett commented 1 year ago

OK, to resolve this we're going to have to introduce a new index format, which can handle > 2 GB files. So that will take a little work.
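As a toy illustration of why an index format can cap out at 2 GB: if byte offsets are stored as 4-byte signed integers, nothing past 2**31 - 1 can be represented. Whether the existing G2 index actually stores offsets this way is an assumption here, not something stated in this thread, and the helper names below are hypothetical.

```python
import struct

INT32_MAX = 2**31 - 1  # largest offset a signed 32-bit field can hold


def pack_offset_32(offset):
    """Pack a byte offset into 4 octets; fails for offsets past 2 GiB."""
    return struct.pack(">i", offset)  # raises struct.error when out of range


def pack_offset_64(offset):
    """Pack a byte offset into 8 octets, as a wider index field could."""
    return struct.pack(">q", offset)
```

An offset of 3 GiB packs fine with `pack_offset_64` but raises `struct.error` with `pack_offset_32`, which is the kind of limit a new index format would need to remove.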

JacobCarley-NOAA commented 1 year ago

Chiming in to say this is a needed capability for both RRFSv1 (we need it to make ICs to drive the on-demand FireWx nest) and for the 3DRTMA as well. I'm glad to see this is picking up steam again!