PrometheusPi opened this issue 5 years ago
I did not encounter this one. Looks a little strange, as our memory (pre)allocation strategy aims at preventing such things from happening. Is it possible to get a call stack or other information to help figure out when it happens: at least, is it during the initialization stage or somewhere after the main PIC loop has already started?
After a brief offline discussion with @sbastrakov, here are the details of the error message:
The job crashed during startup.
I recompiled with debug symbols: `-g` added to `CMAKE_CXX_FLAGS` via `ccmake`, so that `CMAKE_CXX_FLAGS` is now `-Dlinux -g`.
I got the following error output (looks like no improvement to me 😕 - did I do something wrong?):
This is the last verbose output before the crash:
You could try these two things to find the actual problem:
I remember that I had problems of this kind, too. It seemed as if I could utilize only half of a GPU's memory (or less?). Otherwise I ran into errors.
@psychocoderHPC The simulation crashed again.
@psychocoderHPC suggested the following changes:
The added output printed:
...
PIConGPUVerbose PHYSICS(1) | 26008 MiB free memory < 350 MiB required reserved memory (else path)
PIConGPUVerbose PHYSICS(1) | 26008 MiB free memory < 350 MiB required reserved memory (else path)
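For context, here is a minimal sketch of the kind of check behind such a message: the free-memory figure comes from `cudaMemGetInfo`, and it is compared against a reserved-memory budget before the heap for dynamic memory is sized. The variable names, the exact condition, and the 350 MiB value below are assumptions for illustration, not the actual PIConGPU code.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    // query how much device memory the CUDA driver reports as free
    std::size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);
    std::size_t const freeMiB = freeBytes / (1024u * 1024u);

    // assumed reserved-memory budget, matching the number in the log above
    std::size_t const reservedMiB = 350;

    if(freeMiB <= reservedMiB)
    {
        // not enough room left for the reserved part
        std::printf("%zu MiB free memory < %zu MiB required reserved memory\n", freeMiB, reservedMiB);
        return 1;
    }

    // otherwise the heap would be sized roughly as free minus reserved
    std::printf("heap size would be about %zu MiB\n", freeMiB - reservedMiB);
    return 0;
}
```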
Here is the stderr, filtered with a `grep what -B 3`:
```
Module libpng/1.6.34-GCCcore-7.3.0 unloaded.
Module libpng/1.6.34-GCCcore-7.3.0 loaded.
terminate called after throwing an instance of 'CUDA::error'
what(): /scratch/ws/...-LPWFA_till_2019-09/picongpu/thirdParty/mallocMC/src/include/mallocMC/reservePoolPolicies/SimpleCudaMalloc_impl.hpp(42): error: out of memory
--
[taurusml4:148353] [12] /usr/lib64/libc.so.6(__libc_start_main+0xc4)[0x200000d853f4]
[taurusml4:148353] *** End of error message ***
terminate called after throwing an instance of 'std::runtime_error'
what(): /scratch/ws/...-LPWFA_till_2019-09/picongpu/thirdParty/alpaka/include/alpaka/stream/StreamCudaRtAsync.hpp(90) 'cudaStreamCreateWithFlags( &m_CudaStream, 0x01)' returned error : 'cudaErrorMemoryAllocation': 'out of memory'!
--
[taurusml6:64795] [ 4] /sw/installed/GCCcore/7.3.0/lib64/libstdc++.so.6(_ZSt9terminatev+0x20)[0x200000b3a5e0]
[taurusml6:64795] [ 5] /sw/installed/GCCcore/7.3.0/lib64/libstdc++.so.6(__cxa_throw+0x80)[0x200000b3aa90]
terminate called after throwing an instance of 'CUDA::error'
what(): /scratch/ws/...-LPWFA_till_2019-09/picongpu/thirdParty/mallocMC/src/include/mallocMC/reservePoolPolicies/SimpleCudaMalloc_impl.hpp(42): error: out of memory
--
[taurusml6:64795] [15] /usr/lib64/libc.so.6(__libc_start_main+0xc4)[0x200000d853f4]
[taurusml6:64795] *** End of error message ***
terminate called after throwing an instance of 'CUDA::error'
what(): /scratch/ws/...-LPWFA_till_2019-09/picongpu/thirdParty/mallocMC/src/include/mallocMC/reservePoolPolicies/SimpleCudaMalloc_impl.hpp(42): error: out of memory
--
srun: error: taurusml4: task 6: Aborted
srun: Terminating job step 14399826.1
terminate called after throwing an instance of 'CUDA::error'
what(): /scratch/ws/...-LPWFA_till_2019-09/picongpu/thirdParty/mallocMC/src/include/mallocMC/reservePoolPolicies/SimpleCudaMalloc_impl.hpp(42): error: out of memory
```
Node | JOB ID | status |
---|---|---|
taurusml1 | 14403281 | defective |
taurusml2 | 14403298 | defective |
taurusml3 | 14403318 | defective |
taurusml4 | 14403329 | defective |
taurusml5 | 14403334 | defective |
taurusml6 | 14403344 | defective |
taurusml7 | 14403350 | defective |
taurusml8 | 14403356 | defective |
taurusml9 | 14403363 | working |
taurusml10 | 14403367 | working |
taurusml11 | 14403376 | working |
taurusml12 | 14403388 | defective |
taurusml13 | 14403392 | working |
taurusml14 | 14403399 | defective |
taurusml15 | 14403411 | defective |
taurusml16 | 14403421 | working |
taurusml17 | 14403429 | working |
taurusml18 | 14403437 | working |
taurusml19 | 14403446 | working |
taurusml20 | 14403454 | working |
taurusml21 | 14403470 | working |
taurusml22 | 14403474 | working |
taurusml23 | 14403480 | defective |
taurusml24 | 14403484 | defective |
taurusml25 | 14403489 | working |
taurusml26 | 14403496 | working |
taurusml27 | 14403503 | working |
taurusml28 | 14403509 | working |
taurusml29 | 14403518 | defective |
taurusml30 | 14403543 | working |
taurusml31 | 14403554 | working |
taurusml32 | 14403575 | defective |
We are currently doing more tests, but I think the problem is caused by memory fragmentation.
We are querying the amount of free memory with the CUDA function `cudaMemGetInfo`. It looks like the free amount reported by this call does not take half-filled pages into account.
Since we first allocate all memory that has a fixed size, we also end up with a lot of small allocations, and those allocations may get placed on different memory pages. When we then allocate the large memory chunk for the mallocMC heap, the driver is not able to find enough free memory pages.
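To illustrate the suspected scenario, here is a minimal stand-alone sketch (not PIConGPU or mallocMC code, and it may well succeed on a healthy node): many small allocations are made first, `cudaMemGetInfo` is queried, and then a single allocation of almost the entire reported free memory is attempted, which is the step that fails in the logs above. The buffer count, buffer size, and reserve value are made up for the example.

```cpp
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

int main()
{
    // many small fixed-size allocations first (fields, buffers, ...)
    std::vector<void*> small(1024, nullptr);
    for(auto& p : small)
        cudaMalloc(&p, 1u << 20); // 1 MiB each

    // query how much memory the driver reports as free
    std::size_t freeBytes = 0, totalBytes = 0;
    cudaMemGetInfo(&freeBytes, &totalBytes);
    std::printf("reported free: %zu MiB\n", freeBytes / (1024u * 1024u));

    // now try to grab almost all of it in one contiguous chunk,
    // similar to the single large heap allocation done for mallocMC
    std::size_t const reserve = 350u * 1024u * 1024u; // assumed reserve
    std::size_t const request = freeBytes > reserve ? freeBytes - reserve : freeBytes;
    void* heap = nullptr;
    cudaError_t const err = cudaMalloc(&heap, request);
    std::printf("large allocation of %zu MiB: %s\n", request / (1024u * 1024u), cudaGetErrorString(err));

    // cleanup
    cudaFree(heap);
    for(auto p : small)
        cudaFree(p);
    return 0;
}
```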
I found an issue; it is very old, but it also points to this problem.
TODO for tomorrow:
The tests on all nodes are now complete:
only 17 of 32 nodes are usable for our simulations. Thus, two standard L(P)WFA simulations using 9 nodes each will not run in parallel.
I reran the test over all nodes. So far it is far from finished, but it looks like taurusml30 now also
has to be considered defective. (This might be caused by the changes @psychocoderHPC and I added in the debugging process.)
Wouldn't you expect a change whenever you run? I would expect, if the error is related to the current memory layout, that the error is also related to previous GPU usage or at least the somewhat random memory allocation at initialization. Or is my understanding of the problem wrong?
Crashes do not necessarily depend on the previous usage. (Only if more pages became faulty.)
Currently it looks like the set of defective nodes is mostly reproducible - but I will have a detailed look at that today.
As far as I understand @psychocoderHPC, the cause of the error is either a bug on our side (in how we allocate memory) or strange behavior of nvcc. He can explain it in more detail.
This BUG is maybe related to https://github.com/ComputationalRadiationPhysics/alpaka/issues/850
The problem is that mallocMC calls its kernels without specifying any stream, i.e. on the default stream. Normally this would block all other streams, but since we create our streams with `cudaStreamCreateWithFlags` and disable this blocking behavior, we need to review PIConGPU and mallocMC for side effects.
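A minimal sketch of the CUDA behavior described above (the kernel and buffers are made up for illustration, not taken from mallocMC or PIConGPU): kernels launched without a stream go to the legacy default stream, which synchronizes with ordinary blocking streams but not with streams created as non-blocking, the `0x01` flag visible in the alpaka call in the log.

```cpp
#include <cuda_runtime.h>

__global__ void dummyKernel(int* data)
{
    if(data != nullptr)
        *data = 42;
}

int main()
{
    int* d = nullptr;
    cudaMalloc(&d, sizeof(int));

    // 0x01 == cudaStreamNonBlocking: this stream does not synchronize
    // with the legacy default stream
    cudaStream_t s;
    cudaStreamCreateWithFlags(&s, cudaStreamNonBlocking);

    dummyKernel<<<1, 1>>>(d);       // default stream, like a mallocMC-internal kernel
    dummyKernel<<<1, 1, 0, s>>>(d); // non-blocking stream, like PIConGPU's own work

    // With a stream created via plain cudaStreamCreate, the second launch would
    // implicitly wait for the first; with cudaStreamNonBlocking the two launches
    // may overlap, so any implicit ordering assumption becomes a side effect to review.
    cudaDeviceSynchronize();

    cudaStreamDestroy(s);
    cudaFree(d);
    return 0;
}
```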
The last code fix @psychocoderHPC proposed solved the issue. There should be a fix out soon.
@psychocoderHPC: @StillerPatrick now also has issues on the V100 nodes that we see as defective. He is certain it has to do with both memory and MPI.
EDIT:
Sorry - this was just a cluster hiccup - no job ran so far.
EDIT2: fixed
Node | status |
---|---|
taurusml1 | defective |
taurusml2 | defective |
taurusml3 | defective |
taurusml4 | defective |
taurusml5 | defective |
taurusml6 | defective |
taurusml7 | defective |
taurusml8 | defective |
taurusml9 | defective |
taurusml10 | defective |
taurusml11 | defective |
taurusml12 | defective |
taurusml13 | defective |
taurusml14 | works |
taurusml15 | works |
taurusml16 | defective |
taurusml17 | defective |
taurusml18 | defective |
taurusml19 | defective |
taurusml20 | defective |
taurusml21 | defective |
taurusml22 | defective |
taurusml23 | defective |
taurusml24 | defective |
taurusml25 | defective |
taurusml26 | defective |
taurusml27 | defective |
taurusml28 | defective |
taurusml29 | defective |
taurusml30 | defective |
taurusml31 | defective |
taurusml32 | defective |
It looks like there have been changes.
@psychocoderHPC: Was the fix mentioned in this issue ever pushed into mainline? It looks as if @PrometheusPi encounters the same error again, see #3433.
No, this never went into the mainline. The workaround/fix only fixes a broken system. We will write a reproducer and hopefully show that this is a driver issue and that the nodes should be restarted.
I recently started using the V100 nodes on the ml partition of Taurus again. With a setup very similar to one that already ran, I got the following mallocMC error:
@steindev @sbastrakov @psychocoderHPC Have you encountered such a false out-of-memory error on the ml partition before? If yes, how did you solve it?
UPDATE: Before this, I tested a simple default LWFA setup with 32 GPUs, and that worked fine.