E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.
https://docs.e3sm.org/E3SM

MPAS CICE failure on bebop #1711

Closed jayeshkrishna closed 6 years ago

jayeshkrishna commented 7 years ago

A debug run of SMS.ne30_oECv3_ICG.A_WCYCL1850S.bebop_intel (Intel 17.0.4) failed on bebop with the following error message:

forrtl: error (65): floating invalid

The stack trace is given below for reference:

forrtl: error (65): floating invalid
Image              PC                Routine            Line        Source
acme.exe           00000000086D0BE2  Unknown               Unknown  Unknown
libpthread-2.17.s  00002B7C8BD54370  Unknown               Unknown  Unknown
acme.exe           000000000459E5B3  ice_shortwave_mp_        1274  ice_shortwave.f90
acme.exe           00000000044D2FBF  ice_colpkg_mp_col        2863  ice_colpkg.f90
acme.exe           00000000043F007C  cice_column_mp_co        2391  mpas_cice_column.f90
acme.exe           0000000004432005  cice_column_mp_ci         546  mpas_cice_column.f90
acme.exe           00000000043AD7FF  cice_initialize_m         118  mpas_cice_initialize.f90
acme.exe           000000000455D51F  cice_core_mp_cice         255  mpas_cice_core.f90
acme.exe           00000000040D7429  ice_comp_mct_mp_i         522  ice_comp_mct.f90
acme.exe           000000000045B4DD  component_mod_mp_         227  component_mod.F90
acme.exe           0000000000426C14  cesm_comp_mod_mp_        1197  cesm_comp_mod.F90
acme.exe           00000000004523AC  MAIN__                     63  cesm_driver.F90
acme.exe           00000000004156DE  Unknown               Unknown  Unknown
libc-2.17.so       00002B7C8C284B35  __libc_start_main     Unknown  Unknown
acme.exe           00000000004155E9  Unknown               Unknown  Unknown

jayeshkrishna commented 7 years ago

The following patch fixed the issue for me:

diff --git a/src/core_cice/column/ice_shortwave.F90 b/src/core_cice/column/ice_shortwave.F
index fa08235..bef38ea 100644
--- a/src/core_cice/column/ice_shortwave.F90
+++ b/src/core_cice/column/ice_shortwave.F90
@@ -1271,7 +1271,9 @@

       ! compute aerosol mass path

-         aero_mp(:) = c0
+         if( n_aero > 0 ) then
+            aero_mp(:) = c0
+         endif
          if( tr_aero ) then
             ! assume 4 layers for each aerosol, a snow SSL, snow below SSL,
             ! sea ice SSL, and sea ice below SSL, in that order.
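
The guard in the patch can be shown in isolation with a minimal standalone sketch. It assumes `aero_mp` is a one-dimensional array of length `n_aero` and that `c0` is a double-precision zero; the declarations, the `dbl_kind` value, and the program name are illustrative assumptions, not copied from `ice_shortwave.F90`. The sketch only demonstrates the pattern of skipping the whole-array assignment when there are no aerosol tracers. It is not expected to reproduce the `floating invalid` trap, and the thread does not state why the unguarded assignment trapped in the Intel 17.0.4 debug build.

```fortran
program guard_zero_size_assignment
   implicit none

   ! Hypothetical stand-ins for names used in the patch (assumed, not
   ! taken from ice_shortwave.F90).
   integer, parameter :: dbl_kind = selected_real_kind(13)
   real (kind=dbl_kind), parameter :: c0 = 0.0_dbl_kind

   integer :: n_aero
   real (kind=dbl_kind), allocatable :: aero_mp(:)

   n_aero = 0                   ! configuration with no aerosol tracers
   allocate(aero_mp(n_aero))    ! zero-size array, legal in standard Fortran

   ! Same guard as the patch: only zero the array when it has elements.
   if (n_aero > 0) then
      aero_mp(:) = c0
   endif

   print '(a,i0)', 'size(aero_mp) = ', size(aero_mp)

   deallocate(aero_mp)
end program guard_zero_size_assignment
```

Setting `n_aero` to a positive value exercises the other branch, where the assignment runs as in the original code.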

akturner commented 7 years ago

@jayeshkrishna: Thanks for bringing this to our attention!

akturner commented 7 years ago

@njeffery: Can you investigate this issue?

njeffery commented 7 years ago

@akturner, @jayeshkrishna: Yes. I'm back from vacation and looking into the issue.

rljacob commented 7 years ago

@jayeshkrishna Which version of Intel? @njeffery you'll need a Blues/Anvil account to get access to bebop.

jayeshkrishna commented 7 years ago

Support for bebop is still in a branch, lcrc/machines/bebop-modules (it will be merged in a couple of days). Compiler: Intel 17.0.4

rljacob commented 7 years ago

Was bebop support added in PR #1662?

jayeshkrishna commented 7 years ago

Yes (my mistake), it was added in PR #1662.

(Got confused because we are still working on the branch to get some remaining issues straightened out.)

njeffery commented 7 years ago

@rljacob: I have an account now and can log in. How do I get space on group/acme?

rljacob commented 7 years ago

You should have permission to create a directory in /lcrc/group/acme.

rljacob commented 7 years ago

@njeffery what is your username on blues/bebop?

jonbob commented 6 years ago

Closed via PR #1823, which brought in the version of mpas-cice containing this fix.