Both allgather and alltoall (in benchmarks/ampi/alltoall/) crash during exit with the following backtrace.
On process 0:
Starting benchmark on 2 processors with 100 iterations
100 1024 0.022 msec, 753.627 Mbits/sec
[Partition 0][Node 0] End of program
[Thread 0x7ffff2215700 (LWP 32664) exited]
[Thread 0x7ffff5018700 (LWP 32658) exited]
[Inferior 1 (process 32653) exited normally]
On process 1:
0x00007ffff684b438 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007ffff684b438 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007ffff684d03a in __GI_abort () at abort.c:89
#2 0x00000000008f1f9e in dlmalloc_impl::mspace_free (
this=0xcd0680 <global_malloc_instance_storage>,
msp=0xcd0720 <main_arena+64>, mem=0x7ffff7ef0c80)
at /scratch/nitin/charm/src/conv-core/memory-gnu-internal.C:5726
#3 0x00000000008e8153 in mm_impl_free (mem=0x7ffff7ef0c80)
at /scratch/nitin/charm/src/conv-core/memory-gnu.C:874
#4 0x00000000008e981c in mm_free (mem=0x7ffff7ef0c80)
at /scratch/nitin/charm/src/conv-core/memory.C:734
#5 0x00000000008e9a75 in free (mem=0x7ffff7ef0c80)
at /scratch/nitin/charm/src/conv-core/memory.C:906
#6 0x00007ffff43e1179 in deregister_handler ()
from /scratch/nitin/openmpi-4.0.1/build/lib/openmpi/mca_pmix_pmix3x.so
#7 0x00007ffff1013322 in finalize ()
from /scratch/nitin/openmpi-4.0.1/build/lib/openmpi/mca_errmgr_default_app.so
#8 0x00007ffff65ab326 in orte_errmgr_base_close ()
from /scratch/nitin/openmpi-4.0.1/build/lib/libopen-rte.so.40
#9 0x00007ffff62a87d9 in mca_base_framework_close ()
from /scratch/nitin/openmpi-4.0.1/build/lib/libopen-pal.so.40
#10 0x00007ffff65ad93a in orte_ess_base_app_finalize ()
The failure is only seen in a 2-process run. It was observed on courage with an mpi-linux-x86_64-debug build (./build LIBS mpi-linux-x86_64 --suffix=debug --enable-error-checking -j16 -g -O0).