aymeric-spiga / dynamico-giant

Error XIOS #8

Open aymeric-spiga opened 5 years ago

aymeric-spiga commented 5 years ago

The model in makestart runs smoothly in debug mode; then, after about 13'40 of CPU time and 84 iterations, it crashes with the following message:

00: icosa_lmdz.exe: /scratch/cnt0027/lmd1167/aspigaplaneto/dynamico-giant/code/XIOS/extern/blitz/blitz/array-impl.h:1330: bool blitz::Array<P_numtype, N_rank>::assertInRange(int) const [with P_numtype = double, N_rank = 1]: Assertion `0' failed.
00: forrtl: error (76): Abort trap signal
00: Image              PC                Routine            Line        Source             
00: icosa_lmdz.exe     0000000002CE6881  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     0000000002CE49BB  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     0000000002C9D644  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     0000000002C9D456  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     0000000002C43907  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     0000000002C4A76E  Unknown               Unknown  Unknown
00: libpthread-2.17.s  00002B46245395E0  Unknown               Unknown  Unknown
00: libc-2.17.so       00002B4624AC21F7  gsignal               Unknown  Unknown
00: libc-2.17.so       00002B4624AC38E8  abort                 Unknown  Unknown
00: libc-2.17.so       00002B4624ABB266  Unknown               Unknown  Unknown
00: libc-2.17.so       00002B4624ABB312  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     00000000015641D6  _ZNK5blitz5ArrayI        1328  array-impl.h
00: icosa_lmdz.exe     0000000001564A88  _ZNK5blitz5ArrayI        1697  array-impl.h
00: icosa_lmdz.exe     0000000001C62B60  _ZN4xios6CField14         181  field.cpp
00: icosa_lmdz.exe     0000000001D194B4  _ZN4xios17CFileWr          36  file_writer_filter.cpp
00: icosa_lmdz.exe     00000000023BDD61  _ZN4xios9CInputPi          37  input_pin.cpp
00: icosa_lmdz.exe     000000000276652D  _ZN4xios10COutput          46  output_pin.cpp
00: icosa_lmdz.exe     0000000002765DF6  _ZN4xios10COutput          35  output_pin.cpp
00: icosa_lmdz.exe     00000000028E1312  _ZN4xios7CFilter1          16  filter.cpp
00: icosa_lmdz.exe     00000000023BDD61  _ZN4xios9CInputPi          37  input_pin.cpp
00: icosa_lmdz.exe     000000000276652D  _ZN4xios10COutput          46  output_pin.cpp
00: icosa_lmdz.exe     0000000002765DF6  _ZN4xios10COutput          35  output_pin.cpp
00: icosa_lmdz.exe     00000000027ABAC5  _ZN4xios23CSpatia          68  spatial_transform_filter.cpp
00: icosa_lmdz.exe     00000000023BDD61  _ZN4xios9CInputPi          37  input_pin.cpp
00: icosa_lmdz.exe     000000000276652D  _ZN4xios10COutput          46  output_pin.cpp
00: icosa_lmdz.exe     0000000002765DF6  _ZN4xios10COutput          35  output_pin.cpp
00: icosa_lmdz.exe     00000000027A61CD  _ZN4xios13CSource          66  source_filter.cpp
00: icosa_lmdz.exe     0000000001CA60FA  _ZN4xios6CField7s          23  field_impl.hpp
00: icosa_lmdz.exe     00000000021234EB  cxios_write_data_         434  icdata.cpp
00: icosa_lmdz.exe     00000000015B019B  idata_mp_xios_sen         466  idata.f90
00: icosa_lmdz.exe     0000000000699130  xios_mod_mp_xios_         458  xios_mod.f90
00: icosa_lmdz.exe     000000000068FD5C  xios_mod_mp_xios_         293  xios_mod.f90
00: icosa_lmdz.exe     0000000000482821  output_field_mod_          50  output_field.f90
00: icosa_lmdz.exe     0000000000747AB4  dissip_gcm_moddis         603  dissip_gcm.f90
00: icosa_lmdz.exe     0000000000744FBC  dissip_gcm_mod_mp         564  dissip_gcm.f90
00: icosa_lmdz.exe     00000000004E65AE  timeloop_gcm_mod_         309  timeloop_gcm.f90
00: icosa_lmdz.exe     00000000004249B1  icosa_init_mod_mp          66  icosa_init.f90
00: libiomp5.so        00002B46247EB413  __kmp_invoke_micr     Unknown  Unknown
00: libiomp5.so        00002B46247BB60D  __kmp_fork_call       Unknown  Unknown
00: libiomp5.so        00002B4624793EE8  __kmpc_fork_call      Unknown  Unknown
00: icosa_lmdz.exe     000000000042487D  icosa_init_mod_mp          62  icosa_init.f90
00: icosa_lmdz.exe     000000000041C0E0  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     000000000041C09E  Unknown               Unknown  Unknown
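
Most frames in a trace like this one are either unresolved (`Unknown`) or internal to XIOS and Blitz++, so it can help to filter for the frames that point at Fortran sources. The following is a minimal sketch, assuming the five-column layout shown above (rank prefix, image, PC, routine, line, source); the `resolved_frames` helper and the `.f90` filter are hypothetical conveniences, not part of DYNAMICO or XIOS.

```python
import re

# A few lines copied from the trace above, kept short for illustration.
TRACE = """\
00: icosa_lmdz.exe     0000000002CE6881  Unknown               Unknown  Unknown
00: icosa_lmdz.exe     0000000001C62B60  _ZN4xios6CField14         181  field.cpp
00: icosa_lmdz.exe     00000000015B019B  idata_mp_xios_sen         466  idata.f90
00: icosa_lmdz.exe     0000000000482821  output_field_mod_          50  output_field.f90
00: icosa_lmdz.exe     0000000000747AB4  dissip_gcm_moddis         603  dissip_gcm.f90
00: libiomp5.so        00002B46247BB60D  __kmp_fork_call       Unknown  Unknown
"""

# rank prefix, image, program counter, routine, numeric line, source file
FRAME = re.compile(r"^\d+:\s+(\S+)\s+([0-9A-Fa-f]+)\s+(\S+)\s+(\d+)\s+(\S+)\s*$")


def resolved_frames(trace, suffix=".f90"):
    """Return (routine, line, source) for frames whose source ends with suffix."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line)
        # Frames with 'Unknown' in the line column do not match the pattern.
        if match and match.group(5).endswith(suffix):
            frames.append((match.group(3), int(match.group(4)), match.group(5)))
    return frames


for routine, lineno, source in resolved_frames(TRACE):
    print(f"{source}:{lineno}  {routine}")
```

Read from the top, the first frames that belong to the model rather than to the XIOS interface are `output_field.f90` and `dissip_gcm.f90`.
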
aymeric-spiga commented 5 years ago

Here is a more complete copy of the beginning of the error output:

00: [Blitz++] Precondition failure: Module /scratch/cnt0027/lmd1167/aspigaplaneto/dynamico-giant/code/XIOS/extern/blitz/blitz/array-impl.h line 1330
00: Array index out of range: 0
00: Lower bounds: (0)
00: Length:       (0)
00: 
00: icosa_lmdz.exe: /scratch/cnt0027/lmd1167/aspigaplaneto/dynamico-giant/code/XIOS/extern/blitz/blitz/array-impl.h:1330: bool blitz::Array<P_numtype, N_rank>::assertInRange(int) const [with P_numtype = double, N_rank = 1]: Assertion `0' failed.
00: forrtl: error (76): Abort trap signal
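
The numbers in the precondition message are worth reading together: index 0 is rejected against a lower bound of 0 and a length of 0, which suggests the array being indexed is empty. The sketch below only illustrates that range logic in Python; it is not the Blitz++ implementation, and `index_in_range` is a hypothetical name.

```python
def index_in_range(index, lower_bound, length):
    """True if index lies in the half-open range [lower_bound, lower_bound + length)."""
    return lower_bound <= index < lower_bound + length


# Values reported in the message above: index 0, lower bound 0, length 0.
print(index_in_range(0, 0, 0))  # False: an empty array has no valid index

# The same index would be accepted as soon as the array holds one element.
print(index_in_range(0, 0, 1))  # True
```
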
aymeric-spiga commented 5 years ago

I still have the same problem, even after requesting more memory in the job submission.

ehouarn commented 5 years ago

Your issue arises when trying to output fields from the dissipation:

00: icosa_lmdz.exe 0000000000482821 output_field_mod_ 50 output_field.f90
00: icosa_lmdz.exe 0000000000747AB4 dissip_gcm_moddis 603 dissip_gcm.f90

Try without these outputs (i.e. set enable=".FALSE." for file Xdissip in icosa.xml)
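
For anyone who prefers to script that change, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. Several things are assumptions: the sample XML is a made-up stand-in for icosa.xml, the second file id `Xother` is hypothetical, and the attribute is taken to be spelled `enabled` (the comment above writes `enable`), so check the name against your own file before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the relevant part of icosa.xml.
SAMPLE = """<context id="icosagcm">
  <file_definition>
    <file id="Xdissip" name="Xdissip" enabled=".TRUE."/>
    <file id="Xother" name="Xother" enabled=".TRUE."/>
  </file_definition>
</context>"""


def disable_file(root, file_id, attribute="enabled"):
    """Set attribute to .FALSE. on every <file> whose id matches; return the count."""
    changed = 0
    for elem in root.iter("file"):
        if elem.get("id") == file_id:
            elem.set(attribute, ".FALSE.")
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
count = disable_file(root, "Xdissip")
print(count, root.find(".//file[@id='Xdissip']").get("enabled"))
```

On a real file, `ET.parse(path)` followed by `tree.write(path)` would apply the same change in place, though comments in the XML are not preserved by default.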

aymeric-spiga commented 5 years ago

Commit https://github.com/aymeric-spiga/dynamico-giant/commit/b7ac566643575910ebb6d440befdd2a34947e203 mitigates this issue but does not solve it