E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.
https://docs.e3sm.org/E3SM

Error with coupled lowres on edison after adjusting PE layout. NetCDF: Numeric conversion not representable #2198

Closed: ndkeen closed this issue 5 years ago

ndkeen commented 6 years ago

While trying to use more nodes, I am hitting some issues. I want to record them here as I learn more. The case directory and the error output from rank 0 are below.

/global/cscratch1/sd/ndk/E3SM_simulations/edison.2018-master-mar1e3sm.DECKv1b_piControl.emod334c.1m.s24b.ne30_oECv3_ICG/

   0:  pionfwrite_mod::write_nfdarray_double         107   IAM:            0  start:
   0:                      1                     1  count:                  48602
   0:                      1  size :                     1  error:          -60
   0:   -872381472          11
   0:  pio_support::pio_die:: myrank=          -1 : ERROR:
   0:  pionfwrite_mod::write_nfdarray_double:         250 :
   0:  NetCDF: Numeric conversion not representable
   0: Image              PC                Routine            Line        Source
   0: e3sm.exe           00000000052F324D  Unknown               Unknown  Unknown
   0: e3sm.exe           00000000038CF521  pio_support_mp_pi         120  pio_support.F90
   0: e3sm.exe           00000000038CDA55  pio_utils_mp_chec          59  pio_utils.F90
   0: e3sm.exe           00000000039E1AD2  pionfwrite_mod_mp         250  pionfwrite_mod.F90.in
   0: e3sm.exe           00000000039AED7F  piodarray_mp_writ         645  piodarray.F90.in
   0: e3sm.exe           00000000039ACCB9  piodarray_mp_writ         223  piodarray.F90.in
   0: e3sm.exe           00000000039AC88C  piodarray_mp_writ         293  piodarray.F90.in
   0: e3sm.exe           00000000018A9CE5  cam_grid_support_        3107  cam_grid_support.F90
   0: e3sm.exe           0000000000520018  cam_history_mp_du        4401  cam_history.F90
   0: e3sm.exe           0000000000504A0F  cam_history_mp_ws        4716  cam_history.F90
   0: e3sm.exe           00000000004F3563  cam_comp_mp_cam_r         382  cam_comp.F90
   0: e3sm.exe           00000000004E05EE  atm_comp_mct_mp_a         509  atm_comp_mct.F90
   0: e3sm.exe           00000000004268B4  component_mod_mp_         728  component_mod.F90
   0: e3sm.exe           000000000040E9A2  cime_comp_mod_mp_        3370  cime_comp_mod.F90
   0: e3sm.exe           00000000004265CF  MAIN__                    103  cime_driver.F90
   0: e3sm.exe           000000000040AF1E  Unknown               Unknown  Unknown
   0: e3sm.exe           0000000005403799  Unknown               Unknown  Unknown
   0: e3sm.exe           000000000040AE09  Unknown               Unknown  Unknown
   0: Rank 0 [Tue Mar 27 11:29:52 2018] [c0-1c0s1n1] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
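
For readers unfamiliar with the message: error code -60 in the log above is netCDF's NC_ERANGE, whose text is "NetCDF: Numeric conversion not representable". It is typically returned when a value cannot be represented in the on-disk type of the variable being written. The sketch below is not taken from E3SM or PIO. It is a hypothetical, NumPy-only helper (`find_unrepresentable` is an invented name) that assumes the range check amounts to comparing each double's magnitude against the largest finite value of the target type, and it could be used to scan a dumped field for suspect values. Whether this is the actual cause of the failure in this issue is not established.

```python
import numpy as np

# netCDF-C error code reported in the log ("error: -60").
NC_ERANGE = -60


def find_unrepresentable(values, target_dtype=np.float32):
    """Return indices of float64 values whose magnitude exceeds the
    largest finite value of target_dtype (infinities included).

    Hypothetical helper for illustration only. NaN is not flagged,
    because a magnitude comparison against NaN is always false.
    """
    values = np.asarray(values, dtype=np.float64)
    limit = np.finfo(target_dtype).max
    return np.flatnonzero(np.abs(values) > limit)


if __name__ == "__main__":
    # Made-up sample data, not values from the failing run.
    field = [1.0, 1e39, -1e40, 3.4e38, float("inf")]
    print("suspect indices:", find_unrepresentable(field).tolist())
```

If a scan like this finds out-of-range values, it may point to uninitialized or corrupted data in the buffer being written rather than to a problem in the I/O layer itself, but that is only one possibility.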

ndkeen commented 5 years ago

This might still be an issue, in the sense that I could find a way to reproduce it on Cori. But since my only example of it is on Edison (which is retired), I will close the issue.