ufs-community / ufs-weather-model

UFS Weather Model

cpld_control_p8_gnu & cpld_debug_p8_gnu fail on hercules #2350

Open DeniseWorthen opened 2 days ago

DeniseWorthen commented 2 days ago

Description

Testing the develop branch on hercules, these tests now fail with the following:

150: WARNING from PE     0: Unused line in INPUT/MOM_input : ODA_INCUPD_NHOURS = 6
150:
105: pe=00105 FAIL at line=00045    NetCDF4_get_var.H                        <Unable to get variable: biomass from file: ExtData/QFED/2021/03/qfed2.emis_so2.006.20210322.nc4>
105: pe=00105 FAIL at line=00726    ServerThread.F90                         <status=1>
105: pe=00105 FAIL at line=00471    ServerThread.F90                         <status=1>
105: pe=00105 FAIL at line=01117    ServerThread.F90                         <status=1>
105: pe=00105 FAIL at line=00091    MessageVisitor.F90                       <status=1>
105: pe=00105 FAIL at line=00115    AbstractMessage.F90                      <status=1>
105: pe=00105 FAIL at line=00107    SimpleSocket.F90                         <status=1>
105: pe=00105 FAIL at line=00427    ClientThread.F90                         <status=1>
112: pe=00112 FAIL at line=00045    NetCDF4_get_var.H                        <Unable to get variable: biomass from file: ExtData/QFED/2021/03/qfed2.emis_nh3.006.20210323.nc4>
112: pe=00112 FAIL at line=00726    ServerThread.F90                         <status=1>
112: pe=00112 FAIL at line=00471    ServerThread.F90                         <status=1>

See /work2/noaa/stmp/dworthen/stmp/dworthen/FV3_RT/rt_651752/cpld_debug_p8_gnu for remainder of err message.
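Since the err file is long, a small script can help summarize which PEs report which files. The following is a minimal sketch, assuming the err file lines follow the same `pe=... FAIL at line=...` layout as the excerpt above; the function name `summarize_failures` and the embedded sample are illustrative only and are not part of the regression test system.

```python
import re
from collections import defaultdict

# Matches lines shaped like the excerpt above, e.g.
#   105: pe=00105 FAIL at line=00045    NetCDF4_get_var.H    <Unable to get variable: ...>
FAIL_RE = re.compile(
    r"pe=(?P<pe>\d+)\s+FAIL at line=(?P<line>\d+)\s+(?P<src>\S+)\s+<(?P<msg>.*)>"
)
VAR_RE = re.compile(
    r"Unable to get variable: (?P<var>\S+) from file: (?P<path>\S+)"
)


def summarize_failures(log_text):
    """Return {pe: [(variable, path), ...]} for 'Unable to get variable' lines.

    Hypothetical helper: the <status=1> traceback lines are skipped because
    they carry no file name.
    """
    failures = defaultdict(list)
    for raw in log_text.splitlines():
        fail = FAIL_RE.search(raw)
        if not fail:
            continue
        var = VAR_RE.search(fail.group("msg"))
        if var:
            failures[int(fail.group("pe"))].append(
                (var.group("var"), var.group("path"))
            )
    return dict(failures)


# Sample copied from the excerpt in this issue.
SAMPLE = """\
150: WARNING from PE     0: Unused line in INPUT/MOM_input : ODA_INCUPD_NHOURS = 6
105: pe=00105 FAIL at line=00045    NetCDF4_get_var.H                        <Unable to get variable: biomass from file: ExtData/QFED/2021/03/qfed2.emis_so2.006.20210322.nc4>
105: pe=00105 FAIL at line=00726    ServerThread.F90                         <status=1>
112: pe=00112 FAIL at line=00045    NetCDF4_get_var.H                        <Unable to get variable: biomass from file: ExtData/QFED/2021/03/qfed2.emis_nh3.006.20210323.nc4>
"""

if __name__ == "__main__":
    for pe, items in sorted(summarize_failures(SAMPLE).items()):
        for var, path in items:
            print(f"pe {pe}: variable '{var}' unreadable in {path}")
```

To run it against the real log, replace `SAMPLE` with the contents of the err file in the run directory.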

I ran the test manually a second time and it also failed.

To Reproduce:

Run the tests on hercules.

Additional context

The files named in the error messages (e.g., ExtData/QFED/2021/03/qfed2.emis_nh3.006.20210323.nc4) are present in the run directory.
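Being present does not by itself show that a file is intact. As a rough sanity check, one can look at the first bytes of each file: netCDF-4 files are HDF5 containers and normally start with the HDF5 signature, while classic netCDF files start with `CDF`. This is only a sketch; the helper name `sniff_netcdf` is hypothetical, it checks the signature at offset 0 only, and a passing result does not rule out a problem in the I/O library or MPI stack.

```python
import os

# HDF5 signature; netCDF-4 files are stored as HDF5.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"
# Classic netCDF variants: CDF-1, CDF-2 (64-bit offset), CDF-5.
CLASSIC_MAGIC = (b"CDF\x01", b"CDF\x02", b"CDF\x05")


def sniff_netcdf(path):
    """Classify a file by its leading bytes (hypothetical helper).

    Returns one of: 'missing', 'empty', 'netcdf4/hdf5',
    'netcdf-classic', 'unknown'.
    """
    # isfile() follows symlinks, so a dangling link counts as missing.
    if not os.path.isfile(path):
        return "missing"
    with open(path, "rb") as fh:
        head = fh.read(8)
    if not head:
        return "empty"
    if head == HDF5_MAGIC:
        return "netcdf4/hdf5"
    if head[:4] in CLASSIC_MAGIC:
        return "netcdf-classic"
    return "unknown"
```

For example, calling `sniff_netcdf("ExtData/QFED/2021/03/qfed2.emis_nh3.006.20210323.nc4")` from the run directory would flag a dangling link or a zero-length file, which a plain directory listing can hide.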

The Intel versions of these tests do not fail.

DeniseWorthen commented 2 days ago

Note, I originally saw the issue while testing a CMEPS PR, so I had removed all the standalone tests from rt.conf. I don't know whether this impacts any non-cpld configuration. I ran only the cpld_debug_p8_gnu test from the develop branch.

InnocentSouopgui-NOAA commented 2 days ago

cpld_debug_p8_intel is failing on S4. I am not sure if it is related to this issue, but it seems likely.

zach1221 commented 1 day ago

I had opened a duplicate here, not realizing this one existed. I think it may be related to an issue with the mvapich2 installation on Hercules, potentially caused by the maintenance last week on the platform. @jkbk2004