Unidata / netcdf-fortran

Official GitHub repository for netCDF-Fortran libraries, which depend on the netCDF C library. Install the netCDF C library first.

GFDL MOM6 model aborts with netcdf-fortran-4.6.0. Works with netcdf-fortran-4.5.3. (SUSPECT MOM6 ISSUE BUT OPENING AT REQUEST OF NETCDF SUPPORT) #395

Open GeorgeVandenberghe-NOAA opened 1 year ago

GeorgeVandenberghe-NOAA commented 1 year ago

The GFDL MOM6 ocean model fails with a field name error when using netcdf-fortran-4.6.0. It runs normally with netcdf-fortran-4.5.3. I strongly suspect this is not a NetCDF issue but am submitting this issue at the request of NetCDF support. Will try to replicate this when I get a chance on gaea C5 and report the error message which has been lost on C5. The rest of the software is netcdf-c-4.7.4 (also fails with 4.9.1) and hdf5/1.12.2.

GeorgeVandenberghe-NOAA commented 1 year ago

The error message from MOM6:

8416: FATAL from PE 0: NetCDF: Name contains illegal characters: netcdf_add_variable: file:MOM6_OUTPUT/ocean_geometry.ncvariable:lath

A second came from the ATM model after 24 hours in an ATM-only run:

0: FATAL from PE 0: NetCDF: Name contains illegal characters: netcdf_add_variable: file:RESTART/20200602.060000.fv_core.res.ncvariable:xaxis_1

I will report these to EMC.

WardF commented 1 year ago

Thank you!

edwardhartnett commented 1 year ago

What are the illegal names being attempted?

That is, what is the software trying to use for a variable name?

GeorgeVandenberghe-NOAA commented 1 year ago

From the issue I reported

I think these are valid names, and that a memory issue is overwriting some table that defines them.

George W Vandenberghe

Lynker Technologies at NOAA/NWS/NCEP/EMC

edwardhartnett commented 1 year ago

Is a '*' valid in a variable name in netCDF?

GeorgeVandenberghe-NOAA commented 1 year ago

I don't know. I suspect not. If not, it's definitely a model issue.

edwardhartnett commented 1 year ago

According to this answer, asterisks should be allowed in names: https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg10684.html

DennisHeimbigner commented 1 year ago

I believe that is correct.

edwardhartnett commented 1 year ago

Oh, duh, it's right there in front of us:

Name contains illegal characters:
   netcdf_add_variable: file:RESTART/20200602.060000.fv_core.res.ncvariable
   *:xaxis_1*

There's a "/" there. Is that part of the variable name? Because that is an illegal character for netCDF names, IIRC...

GeorgeVandenberghe-NOAA commented 1 year ago

I thought the file description was valid and the actual variable was

:xaxis_1*

but I haven't looked at MOM6 source (which EMC doesn't control in any way).

GeorgeVandenberghe-NOAA commented 1 year ago

I can start looking at this after I get the benchmark off my desk and out the door.

edwardhartnett commented 1 year ago

Maybe the variable is named "file:RESTART/20200602.060000.fv_core.res.ncvariable :xaxis_1"? Can you ask the programmers?

Surely at some point this must have worked, yet as far as I know there has been no change in illegal name characters since @DennisHeimbigner changed everything to allow Unicode, and that was like 15 years ago. Dennis, have there been changes to what's legal in a name since then?

GeorgeVandenberghe-NOAA commented 1 year ago

I have asked FV3 ATM and MOM6 people to look into it at GFDL.

Meanwhile, since I believe this is a model issue, UCAR and netCDF development shouldn't lose any sleep over it.

DennisHeimbigner commented 1 year ago

There have been no changes as far as I know. But it is still the case that the '/' character is not legal.
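
As a rough illustration of the rule mentioned above, the following Python sketch flags characters that would make a name illegal. It covers only a simplified subset of the naming rules (forward slash, ASCII control characters including NUL and DEL, and a trailing space), and the function name `illegal_name_chars` is hypothetical; it is not part of netCDF or FMS.

```python
def illegal_name_chars(name: str) -> list[tuple[int, str]]:
    """Return (index, description) pairs for characters that break the
    simplified naming rules assumed in this sketch."""
    problems = []
    for i, ch in enumerate(name):
        code = ord(ch)
        # '/' is rejected in names; ASCII control characters
        # (0x00-0x1F) and DEL (0x7F) are rejected as well.
        if ch == "/" or code < 0x20 or code == 0x7F:
            problems.append((i, repr(ch)))
    # A name ending in a space is also treated as illegal here.
    if name.endswith(" "):
        problems.append((len(name) - 1, "trailing space"))
    return problems


if __name__ == "__main__":
    for candidate in ["xaxis_1", "lath", "RESTART/x", "lath\x00"]:
        print(repr(candidate), illegal_name_chars(candidate))
```

Running a check like this on the name actually passed to the library, rather than on the name as it appears in a log, can reveal characters that are invisible in a printed error message.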

bensonr commented 1 year ago

GFDL models are running just fine with the cray-netcdf modules on the Gaea C5 partition. The versions appear to be C-4.9.0 and Fortran-4.5.3 (according to nc- and nf-config). I have asked an FMS engineer to run unit tests on an internal system at GFDL with C-4.9.0 and Fortran-4.6.0.

GeorgeVandenberghe-NOAA commented 1 year ago

I have only tested netcdf-fortran 4.5.3 and netcdf-fortran 4.6.0. I get solid failures with netcdf-fortran 4.6.0 using both netcdf-c/4.7.4 and netcdf-c/4.9.1 with hdf5/1.10.6, and with netcdf-c/4.7.4 and netcdf-fortran/4.6.0 using hdf5/1.14.0. netcdf-c/4.9.1 does not build with hdf5/1.14.0. The common denominator seems to be netcdf-fortran/4.6.0. I have not tried netcdf-c/4.9.2 yet.

uramirez8707 commented 1 year ago

Maybe the variable is named "file:RESTART/20200602.060000.fv_core.res.ncvariable :xaxis_1"? Can you ask the programmers?

The filename is "RESTART/20200602.060000.fv_core.res.nc" and the variable name is "xaxis_1"

Something like this should be the traceback:

https://github.com/NOAA-GFDL/GFDL_atmos_cubed_sphere/blob/cfae631cb9c07b40863441c157a39787626ba1fc/tools/fv_io.F90#L131

https://github.com/NOAA-GFDL/FMS/blob/7188e3a2e634376da74c3e4247bc9b487ef52700/fms2_io/fms_netcdf_domain_io.F90#L548

https://github.com/NOAA-GFDL/FMS/blob/7188e3a2e634376da74c3e4247bc9b487ef52700/fms2_io/netcdf_io.F90#L956-L964
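
To make the reading of the error message given in this comment concrete, here is a small Python sketch that splits the reported text into its file and variable parts. The layout `file:<path>variable:<name>` is an assumption inferred from the two messages quoted earlier in this thread, and `split_fms_message` is a hypothetical helper, not an FMS or netCDF function.

```python
import re

# Assumed layout, inferred from the messages quoted in this thread:
#   "... netcdf_add_variable: file:<path>variable:<name>"
_MESSAGE = re.compile(r"file:(?P<file>.+?)variable:(?P<var>.*)$")


def split_fms_message(message: str) -> tuple[str, str]:
    """Return (file_path, variable_name) from an error message that
    follows the assumed layout. The variable name is returned without
    stripping, so hidden characters remain visible via repr()."""
    match = _MESSAGE.search(message)
    if match is None:
        raise ValueError("message does not match the assumed layout")
    return match.group("file"), match.group("var")


if __name__ == "__main__":
    msg = ("NetCDF: Name contains illegal characters: netcdf_add_variable: "
           "file:RESTART/20200602.060000.fv_core.res.ncvariable:xaxis_1")
    path, var = split_fms_message(msg)
    print(repr(path), repr(var))
```

Under this reading, the "/" belongs to the file path and the variable name itself contains no slash.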

GeorgeVandenberghe-NOAA commented 1 year ago

Forwarding to GFDL. I will have to rebuild FMS to get that traceback (FMS/2022.04 is the version used). The failure is in the coupled UFS:MOM6 execution, but I saw an ATM failure also, so it isn't just MOM6. I am working other critical issues at NCEP.

bensonr commented 1 year ago

@GeorgeVandenberghe-NOAA - I'll ask again, what compiler (assuming Intel - ifx? ifort?) and version (19, OneAPI 2022, ??) are you using?

GeorgeVandenberghe-NOAA commented 1 year ago

My omission, sorry. Intel ifort, icc, icpc... whatever intel-classic loads on gaea C5. No oneapi!! That's a nopitty nope for many reasons! hdf5/1.14.0, netcdf/4.7.4, netcdf-fortran/4.6.0; also replicated with hdf5/1.10.6, netcdf/4.9.1, netcdf-fortran/4.6.0.

GeorgeVandenberghe-NOAA commented 1 year ago

The compilation module loaded is intel-classic/2022.0.2 on gaea C5

GeorgeVandenberghe-NOAA commented 1 year ago

Also, if I rebuild FMS with -traceback -g, where is that specified in the source tree's build files, i.e. CMakeLists.txt and whatnot?
