open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

ARM invalid address alignment on IBM benchmarks #11646

a-szegel opened this issue 1 year ago

a-szegel commented 1 year ago

Background information

AWS was looking at AWS MTT issues and noticed that on ARM architectures, the IBM benchmark ibm/onesided/c_reqops was failing due to:

[ip-172-31-24-134:12271] *** Process received signal ***
[ip-172-31-24-134:12271] Signal: Bus error (7)
[ip-172-31-24-134:12271] Signal code: Invalid address alignment (1)

I was able to isolate this to a configuration that does not include EFA.
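For reference, aarch64 atomic instructions require naturally aligned operands, so a 32-bit atomic swap on a misaligned address raises SIGBUS. A minimal sketch (for illustration only, not part of the failing test) that triggers the same class of fault with the GCC builtin that opal_atomic_swap_32 maps to when --enable-builtin-atomics is used:

/* align_fault.c -- illustration only; compile with gcc on aarch64 */
#include <stdint.h>
#include <stdlib.h>

int main(void)
{
    char *buf = malloc(8);
    /* buf + 3 is not 4-byte aligned; the SWP / LDXR instruction faults */
    uint32_t *unaligned = (uint32_t *)(buf + 3);
    uint32_t old = __atomic_exchange_n(unaligned, 0xffffu, __ATOMIC_RELAXED);
    (void) old;
    free(buf);
    return 0;
}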

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

v4.0.x

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone:

./configure --prefix=/home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install --with-sge --without-verbs --with-libfabric=/home/ec2-user/libfabric/install --disable-man-pages --with-libevent=external --with-hwloc=external --enable-builtin-atomics --enable-debug

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

> git submodule status
> 

Please describe the system on which you are running


Details of the problem

> mpirun -np 2 -N 2 -hostfile /home/ec2-user/hostfile /home/ec2-user/ompi-tests/ibm/onesided/c_reqops
[ip-172-31-24-134:12580] *** Process received signal ***
[ip-172-31-24-134:12580] Signal: Bus error (7)
[ip-172-31-24-134:12580] Signal code: Invalid address alignment (1)
[ip-172-31-24-134:12580] Failing at address: 0x3bbc6c03
[ip-172-31-24-134:12580] [ 0] linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0xffff8cfa7860]
[ip-172-31-24-134:12580] [ 1] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so(+0xd7ec)[0xffff89f6b7ec]
[ip-172-31-24-134:12580] [ 2] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so(+0xc620)[0xffff89f6a620]
[ip-172-31-24-134:12580] [ 3] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so(+0xcbd4)[0xffff89f6abd4]
[ip-172-31-24-134:12580] [ 4] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so(+0xcd68)[0xffff89f6ad68]
[ip-172-31-24-134:12580] [ 5] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so(mca_btl_vader_poll_handle_frag+0x188)[0xffff89f666fc]
[ip-172-31-24-134:12580] [ 6] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so(+0x61b4)[0xffff89f641b4]
[ip-172-31-24-134:12580] [ 7] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so(+0x8aac)[0xffff89f66aac]
[ip-172-31-24-134:12580] [ 8] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/libopen-pal.so.40(opal_progress+0x34)[0xffff8c9634b8]
[ip-172-31-24-134:12580] [ 9] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/libmpi.so.40(+0x64fc8)[0xffff8cdb9fc8]
[ip-172-31-24-134:12580] [10] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/libmpi.so.40(ompi_request_default_wait+0x24)[0xffff8cdba008]
[ip-172-31-24-134:12580] [11] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/libmpi.so.40(+0x130808)[0xffff8ce85808]
[ip-172-31-24-134:12580] [12] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/libmpi.so.40(ompi_coll_base_barrier_intra_recursivedoubling+0x1b4)[0xffff8ce85d60]
[ip-172-31-24-134:12580] [13] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_barrier_intra_do_this+0xcc)[0xffff8970aaf4]
[ip-172-31-24-134:12580] [14] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_barrier_intra_dec_fixed+0x108)[0xffff89701800]
[ip-172-31-24-134:12580] [15] /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/libmpi.so.40(MPI_Barrier+0x134)[0xffff8cdddeb4]
[ip-172-31-24-134:12580] [16] /home/ec2-user/ompi-tests/ibm/onesided/c_reqops[0x402434]
[ip-172-31-24-134:12580] [17] /lib64/libc.so.6(__libc_start_main+0xe4)[0xffff8cba2da4]
[ip-172-31-24-134:12580] [18] /home/ec2-user/ompi-tests/ibm/onesided/c_reqops[0x401648]
[ip-172-31-24-134:12580] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node ip-172-31-24-134 exited on signal 7 (Bus error).
--------------------------------------------------------------------------
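Since the crash is inside mca_btl_vader (the shared-memory BTL), it should be reproducible with only the shared-memory path selected, which rules out the network transport entirely (suggested invocation, not part of the original run):

> mpirun --mca pml ob1 --mca btl self,vader -np 2 -N 2 -hostfile /home/ec2-user/hostfile /home/ec2-user/ompi-tests/ibm/onesided/c_reqops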
 > gdb /home/ec2-user/ompi-tests/ibm/onesided/c_reqops core.12580
GNU gdb (GDB) Red Hat Enterprise Linux 8.0.1-36.amzn2.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/ec2-user/ompi-tests/ibm/onesided/c_reqops...done.
[New LWP 12580]
[New LWP 12582]
[New LWP 12584]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/home/ec2-user/ompi-tests/ibm/onesided/c_reqops'.
Program terminated with signal SIGBUS, Bus error.
#0  0x0000ffff89f6b7ec in __aarch64_swp4_relax () from /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so
[Current thread is 1 (Thread 0xffff8cf9d650 (LWP 12580))]
Missing separate debuginfos, use: debuginfo-install glibc-2.26-62.amzn2.aarch64 hwloc-libs-1.11.8-4.amzn2.aarch64 libatomic-7.3.1-15.amzn2.aarch64 libevent-2.0.21-4.amzn2.0.3.aarch64 libgcc-7.3.1-15.amzn2.aarch64 libibverbs-core-43.0-1.amzn2.0.2.aarch64 libnl3-3.2.28-4.amzn2.0.1.aarch64 librdmacm-43.0-1.amzn2.0.2.aarch64 libtool-ltdl-2.4.2-22.2.amzn2.0.2.aarch64 numactl-libs-2.0.9-7.amzn2.aarch64 zlib-1.2.7-19.amzn2.0.2.aarch64
(gdb) bt
#0  0x0000ffff89f6b7ec in __aarch64_swp4_relax () from /home/ec2-user/PortaFiducia/build/libraries/openmpi/v4.1.x-debug/install/lib/openmpi/mca_btl_vader.so
#1  0x0000ffff89f6a620 in opal_atomic_swap_32 (addr=0x3bbc6c03, newval=65535) at ../../../../opal/include/opal/sys/gcc_builtin/atomic.h:117
#2  0x0000ffff89f6abd4 in mca_btl_vader_sc_emu_atomic_32 (operand=0xffffcbc4b2d4, addr=0x3bbc6c03, op=MCA_BTL_ATOMIC_SWAP) at btl_vader_sc_emu.c:73
#3  0x0000ffff89f6ad68 in mca_btl_vader_sc_emu_rdma (btl=0xffff89f8e1e0 <mca_btl_vader>, tag=35 '#', desc=0xffffcbc4b318, ctx=0x0) at btl_vader_sc_emu.c:113
#4  0x0000ffff89f666fc in mca_btl_vader_poll_handle_frag (hdr=0xffff891f4e00, endpoint=0x3bbc0568) at btl_vader_component.c:669
#5  0x0000ffff89f641b4 in mca_btl_vader_check_fboxes () at btl_vader_fbox.h:231
#6  0x0000ffff89f66aac in mca_btl_vader_component_progress () at btl_vader_component.c:768
#7  0x0000ffff8c9634b8 in opal_progress () at runtime/opal_progress.c:231
#8  0x0000ffff8cdb9fc8 in ompi_request_wait_completion (req=0x3bbbff80) at ../ompi/request/request.h:440
#9  0x0000ffff8cdba008 in ompi_request_default_wait (req_ptr=0xffffcbc4b640, status=0xffffcbc4b628) at request/req_wait.c:42
#10 0x0000ffff8ce85808 in ompi_coll_base_sendrecv_zero (dest=1, stag=-16, source=1, rtag=-16, comm=0x420da0 <ompi_mpi_comm_world>) at base/coll_base_barrier.c:64
#11 0x0000ffff8ce85d60 in ompi_coll_base_barrier_intra_recursivedoubling (comm=0x420da0 <ompi_mpi_comm_world>, module=0x3bbc4650) at base/coll_base_barrier.c:219
#12 0x0000ffff8970aaf4 in ompi_coll_tuned_barrier_intra_do_this (comm=0x420da0 <ompi_mpi_comm_world>, module=0x3bbc4650, algorithm=3, faninout=0, segsize=0)
    at coll_tuned_barrier_decision.c:101
#13 0x0000ffff89701800 in ompi_coll_tuned_barrier_intra_dec_fixed (comm=0x420da0 <ompi_mpi_comm_world>, module=0x3bbc4650) at coll_tuned_decision_fixed.c:500
#14 0x0000ffff8cdddeb4 in PMPI_Barrier (comm=0x420da0 <ompi_mpi_comm_world>) at pbarrier.c:66
#15 0x0000000000402434 in main (argc=1, argv=0xffffcbc4b9c8) at c_reqops.c:214
(gdb) 
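The faulting address in frame #1 (addr=0x3bbc6c03) is not 4-byte aligned, which matches the "Invalid address alignment" signal code. A quick sanity check in the same gdb session (shown for illustration):

(gdb) p/x 0x3bbc6c03 & 0x3
$1 = 0x3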
awlauria commented 1 year ago

I thought this sounded vaguely familiar.

From what I can see, this was never ported back to v4.0.x:

https://github.com/open-mpi/ompi/commit/dc8ead901ef63ddca6aa2354d1fc5a77c1131580

Can you try applying this to your branch and retrying?

It may not apply cleanly, but on the v4.0.x branch the corresponding code is here:

https://github.com/open-mpi/ompi/blob/v4.0.x/ompi/mca/osc/rdma/osc_rdma_accumulate.c#L773
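For context, that change appears to guard the osc/rdma accumulate path against potentially unaligned target memory. A rough sketch of that kind of guard (a paraphrase for illustration, not the actual code from the commit): check whether the target address is naturally aligned for the operand width before issuing a hardware or emulated atomic, and fall back to the locked, non-atomic path otherwise.

/* hypothetical helper, for illustration only -- not the code in the commit */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static inline bool addr_is_naturally_aligned(const void *addr, size_t width)
{
    /* width is the operand size in bytes, e.g. 4 or 8 */
    return ((uintptr_t) addr & (width - 1)) == 0;
}

/*
 * usage sketch:
 *   if (!addr_is_naturally_aligned(target, 4)) {
 *       ... take the lock-based fallback instead of the atomic path ...
 *   }
 */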

awlauria commented 1 year ago

Well, it applied more cleanly than I thought; here's a branch to try:

https://github.com/open-mpi/ompi/compare/v4.0.x...awlauria:ompi:rdma_potential_unaligned_mem_v4.0.x

If it works I can open a PR, though v4.0.x is long in the tooth and the RMs may be wary of taking it.

If nothing else, it can go into v4.1.x (the fix wasn't ported there either).

Edit: actually, looking at the stack traces, this will most likely not fix it. But it's possible something similar needs to be done. Sorry. :(