E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.
https://docs.e3sm.org/E3SM

FPE in `RtmMod.F90` with `ERS_D.f09_f09.IELM.pm-cpu_gnu.elm-lnd_rof_2way` #6174

Open ndkeen opened 6 months ago

ndkeen commented 6 months ago

With master as of Jan 23, running ERS_D.f09_f09.IELM.pm-cpu_gnu.elm-lnd_rof_2way, I see:

 22: Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
 22: 
 22: Backtrace for this error:
 22: #0  0x14c010c53dbf in ???
 22: #1  0x2121b5f in __rtmmod_MOD_rtmrun
 22:    at /global/cfs/cdirs/e3sm/ndk/repos/me37-jan23/components/mosart/src/riverroute/RtmMod.F90:2315
 22: #2  0x20d834b in __rof_comp_mct_MOD_rof_run_mct
 22:    at /global/cfs/cdirs/e3sm/ndk/repos/me37-jan23/components/mosart/src/cpl/rof_comp_mct.F90:472
 22: #3  0x48441a in __component_mod_MOD_component_run
 22:    at /global/cfs/cdirs/e3sm/ndk/repos/me37-jan23/driver-mct/main/component_mod.F90:734
 22: #4  0x467cd2 in __cime_comp_mod_MOD_cime_run
 22:    at /global/cfs/cdirs/e3sm/ndk/repos/me37-jan23/driver-mct/main/cime_comp_mod.F90:2932
 22: #5  0x481785 in cime_driver
 22:    at /global/cfs/cdirs/e3sm/ndk/repos/me37-jan23/driver-mct/main/cime_driver.F90:153
 22: #6  0x4817e8 in main
 22:    at /global/cfs/cdirs/e3sm/ndk/repos/me37-jan23/driver-mct/main/cime_driver.F90:23

The same error also occurs with the Intel compiler in a DEBUG build.

These tests pass (with `next` as of Mar 1):

SMS_D.f09_f09.IELM.pm-cpu_gnu.elm-lnd_rof_2way
SMS_D.f09_f09.IELM.pm-cpu_gnu
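
For readers unfamiliar with the test names, here is a minimal sketch that splits one into its dot-separated fields. It assumes the usual CIME layout of `TESTTYPE[_MODIFIERS].GRID.COMPSET[.MACHINE_COMPILER][.TESTMODS]`; the helper `parse_test_name` is hypothetical and not part of CIME. As far as I understand the conventions, `ERS` is an exact-restart test, `SMS` is a smoke test, and the `_D` modifier requests a debug build, so the failing and passing cases above differ only in the test type.

```python
def parse_test_name(name):
    """Split a CIME-style test name into its fields (illustrative helper only)."""
    parts = name.split(".")
    if len(parts) < 3:
        raise ValueError(f"expected at least TEST.GRID.COMPSET, got {name!r}")

    # First field: test type followed by optional modifiers, e.g. ERS_D.
    test_type, *modifiers = parts[0].split("_")

    info = {
        "test_type": test_type,
        "modifiers": modifiers,
        "grid": parts[1],
        "compset": parts[2],
        "machine": None,
        "compiler": None,
        "testmod": None,
    }

    # Optional fourth field: machine and compiler joined by the last underscore.
    if len(parts) > 3:
        machine, _, compiler = parts[3].rpartition("_")
        info["machine"] = machine
        info["compiler"] = compiler

    # Optional fifth field: the test modifier directory name, kept verbatim.
    if len(parts) > 4:
        info["testmod"] = parts[4]

    return info


failing = parse_test_name("ERS_D.f09_f09.IELM.pm-cpu_gnu.elm-lnd_rof_2way")
passing = parse_test_name("SMS_D.f09_f09.IELM.pm-cpu_gnu.elm-lnd_rof_2way")

# Collect the fields in which the failing and passing tests differ.
differing = [key for key in failing if failing[key] != passing[key]]
print(differing)
```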

ndkeen commented 3 months ago

With master of April 22nd, using ERS_D.f09_f09.IELM.pm-cpu_gnu.elm-lnd_rof_2way, I see:

 44: Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
 44: 
 44: Backtrace for this error:
 44: #0  0x147568e53dbf in ???
 44: #1  0x21ab7b3 in __rtmrestfile_MOD_rtmrestart
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/components/mosart/src/riverroute/RtmRestFile.F90:641
 44: #2  0x21af12c in __rtmrestfile_MOD_rtmrestfileread
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/components/mosart/src/riverroute/RtmRestFile.F90:137
 44: #3  0x2198a7d in __rtmmod_MOD_rtmini
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/components/mosart/src/riverroute/RtmMod.F90:1865
 44: #4  0x20f44e4 in __rof_comp_mct_MOD_rof_init_mct
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/components/mosart/src/cpl/rof_comp_mct.F90:111
 44: #5  0x48b6a5 in __component_mod_MOD_component_init_cc
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/driver-mct/main/component_mod.F90:248
 44: #6  0x4729a4 in __cime_comp_mod_MOD_cime_init
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/driver-mct/main/cime_comp_mod.F90:1500
 44: #7  0x484148 in cime_driver
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/driver-mct/main/cime_driver.F90:122
 44: #8  0x484284 in main
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/driver-mct/main/cime_driver.F90:23

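This is a slightly different error than above: the FPE is now raised in `RtmRestFile.F90` (line 641) while reading the restart file during initialization, called from `RtmMod.F90`, rather than in `RtmMod.F90` itself. It is still in the same MOSART river-routing code.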
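
To compare the two backtraces frame by frame, a rough sketch like the following can extract the symbol, file, and line number from each frame. It assumes the gfortran-style layout shown in this issue (a `#N 0x... in symbol` line followed by an `at path:line` line, each prefixed by the MPI rank); `parse_backtrace` is a hypothetical helper, and the sample text is copied from the first frames of the April trace.

```python
import re

# "#1  0x21ab7b3 in __rtmrestfile_MOD_rtmrestart"
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-f]+ in (\S+)")
# "   at /some/path/RtmRestFile.F90:641"
AT_RE = re.compile(r"\bat (\S+):(\d+)\s*$")


def parse_backtrace(text):
    """Return a list of frames with index, symbol, file basename, and line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "index": int(match.group(1)),
                "symbol": match.group(2),
                "file": None,   # filled in if an "at path:line" line follows
                "line": None,
            })
            continue
        match = AT_RE.search(line)
        if match and frames:
            # Keep only the file name so traces from different checkouts compare equal.
            frames[-1]["file"] = match.group(1).rsplit("/", 1)[-1]
            frames[-1]["line"] = int(match.group(2))
    return frames


sample = """\
 44: Backtrace for this error:
 44: #0  0x147568e53dbf in ???
 44: #1  0x21ab7b3 in __rtmrestfile_MOD_rtmrestart
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/components/mosart/src/riverroute/RtmRestFile.F90:641
 44: #2  0x21af12c in __rtmrestfile_MOD_rtmrestfileread
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/components/mosart/src/riverroute/RtmRestFile.F90:137
 44: #3  0x2198a7d in __rtmmod_MOD_rtmini
 44:    at /global/cfs/cdirs/e3sm/ndk/repos/me26-apr22/components/mosart/src/riverroute/RtmMod.F90:1865
"""

frames = parse_backtrace(sample)
for frame in frames:
    print(frame["index"], frame["symbol"], frame["file"], frame["line"])
```

The first frame with a known source location is the one to look at; here that is frame 1 in `RtmRestFile.F90`.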