open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Build failure on Apple Silicon #8410

Closed fxcoudert closed 3 years ago

fxcoudert commented 3 years ago

Background information

What version of Open MPI are you using?

4.1.0 from official sources

Please describe the system on which you are running


Details of the problem

https://github.com/Homebrew/homebrew-core/pull/67367#issuecomment-753315171

Compiling Open MPI 4.1.0 on Apple Silicon (aarch64-apple-darwin20) fails with build errors:

2020-12-28T16:04:59.0201020Z In file included from pmix_mca_base_close.c:26:
2020-12-28T16:04:59.0202550Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/util/output.h:76:
2020-12-28T16:04:59.0204530Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0206570Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0208620Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0211330Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0213200Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0213870Z            ^                            ~~~~
2020-12-28T16:04:59.0214520Z In file included from pmix_mca_base_component_compare.c:25:
2020-12-28T16:04:59.0216100Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/mca/base/base.h:31:
2020-12-28T16:04:59.0218100Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0220120Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0222180Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0224880Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0226740Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0227410Z            ^                            ~~~~
2020-12-28T16:04:59.0228090Z In file included from pmix_mca_base_component_repository.c:38:
2020-12-28T16:04:59.0229710Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_list.h:77:
2020-12-28T16:04:59.0231700Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0233710Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0235740Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0238400Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0240220Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0240950Z            ^                            ~~~~
2020-12-28T16:04:59.0241510Z In file included from pmix_mca_base_cmd_line.c:26:
2020-12-28T16:04:59.0243000Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/util/cmd_line.h:121:
2020-12-28T16:04:59.0244970Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0247010Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0249020Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0251730Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0253560Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0254200Z            ^                            ~~~~
2020-12-28T16:04:59.0254830Z In file included from pmix_mca_base_components_open.c:31:
2020-12-28T16:04:59.0256380Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_list.h:77:
2020-12-28T16:04:59.0258360Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0260350Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0262360Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0265020Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0266840Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0267950Z /opt/homebrew/Library/Homebrew/shims/scm/git --version
2020-12-28T16:04:59.0268580Z            ^                            ~~~~
2020-12-28T16:04:59.0269050Z 1 error generated.
2020-12-28T16:04:59.0269700Z In file included from pmix_mca_base_component_find.c:49:
2020-12-28T16:04:59.0271380Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/mca/pinstalldirs/pinstalldirs.h:21:
2020-12-28T16:04:59.0273440Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/mca/base/base.h:31:
2020-12-28T16:04:59.0275410Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0277420Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0279430Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0282090Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0283900Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0284550Z            ^                            ~~~~
2020-12-28T16:04:59.0285260Z In file included from pmix_mca_base_components_close.c:25:
2020-12-28T16:04:59.0286810Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_list.h:77:
2020-12-28T16:04:59.0288780Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0290910Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0292030Z 1 error generated.
2020-12-28T16:04:59.0293450Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0296100Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0297920Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0298580Z            ^                            ~~~~
2020-12-28T16:04:59.0299040Z 1 error generated.
2020-12-28T16:04:59.0299620Z make[5]: *** [pmix_mca_base_close.lo] Error 1
2020-12-28T16:04:59.0300270Z make[5]: *** Waiting for unfinished jobs....
2020-12-28T16:04:59.0300860Z 1 error generated.
2020-12-28T16:04:59.0301380Z 1 error generated.
2020-12-28T16:04:59.0302060Z make[5]: *** [pmix_mca_base_component_repository.lo] Error 1
2020-12-28T16:04:59.0302880Z make[5]: *** [pmix_mca_base_component_compare.lo] Error 1
2020-12-28T16:04:59.0303590Z make[5]: *** [pmix_mca_base_cmd_line.lo] Error 1
2020-12-28T16:04:59.0304290Z make[5]: *** [pmix_mca_base_components_open.lo] Error 1
2020-12-28T16:04:59.0304910Z 1 error generated.
2020-12-28T16:04:59.0305430Z 1 error generated.
2020-12-28T16:04:59.0306120Z In file included from pmix_mca_base_components_select.c:25:
2020-12-28T16:04:59.0307690Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_list.h:77:
2020-12-28T16:04:59.0309650Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/class/pmix_object.h:131:
2020-12-28T16:04:59.0311660Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/threads/thread_usage.h:31:
2020-12-28T16:04:59.0313650Z In file included from /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/atomic.h:168:
2020-12-28T16:04:59.0316300Z /private/tmp/open-mpi-20201228-52672-1jln4sf/openmpi-4.1.0/opal/mca/pmix/pmix3x/pmix/src/atomics/sys/gcc_builtin/atomic.h:197:12: error: address argument to atomic operation must be a pointer to integer or pointer ('pmix_atomic_int128_t *' (aka '_Atomic(pmix_int128_t) *') invalid)
2020-12-28T16:04:59.0318120Z     return __atomic_compare_exchange_n (addr, oldval, newval, false,
2020-12-28T16:04:59.0318770Z            ^                            ~~~~
2020-12-28T16:04:59.0319360Z make[5]: *** [pmix_mca_base_components_close.lo] Error 1
2020-12-28T16:04:59.0320100Z make[5]: *** [pmix_mca_base_component_find.lo] Error 1
2020-12-28T16:04:59.0320820Z 1 error generated.
2020-12-28T16:04:59.0321470Z make[5]: *** [pmix_mca_base_components_select.lo] Error 1
2020-12-28T16:04:59.0322410Z make[4]: *** [all-recursive] Error 1
2020-12-28T16:04:59.0323300Z make[3]: *** [all-recursive] Error 1
2020-12-28T16:04:59.0324180Z make[2]: *** [all-recursive] Error 1
2020-12-28T16:04:59.0325050Z make[1]: *** [all-recursive] Error 1
2020-12-28T16:04:59.0325940Z make: *** [all-recursive] Error 1

fxcoudert commented 3 years ago

--disable-builtin-atomics as suggested by @ggouaillardet does not avoid the issue

hjelmn commented 3 years ago

Hmmm. Let me take a look. master builds fine on the M1 but I rarely ever build releases.

ggouaillardet commented 3 years ago

@fxcoudert thanks for the report.

the logs you posted are related to PMIx using the GCC builtin atomics. did you use --disable-builtin-atomics to generate them?

if so, the error might be that Open MPI does not pass --disable-builtin-atomics to PMIx configure (you can check that in opal/mca/pmix/pmix3x/pmix/config.status)
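
A minimal sketch of that check, assuming it is run from the top of the Open MPI build tree after configure has finished. The helper name `has_configure_flag` and the `PMIX_STATUS` override variable are illustrative and not part of Open MPI; the sketch relies only on the fact that autoconf-generated config.status files record the options configure was invoked with.

```shell
# Hypothetical helper (not part of Open MPI): succeed if FLAG ($2) appears in FILE ($1).
has_configure_flag() {
    # -q: quiet, -F: literal string, --: end of grep options so the flag is not parsed by grep
    grep -qF -- "$2" "$1" 2>/dev/null
}

# Path taken from the comment above. PMIX_STATUS is an illustrative override only.
status_file=${PMIX_STATUS:-opal/mca/pmix/pmix3x/pmix/config.status}

if [ ! -f "$status_file" ]; then
    echo "no config.status at $status_file (has configure been run?)"
elif has_configure_flag "$status_file" --disable-builtin-atomics; then
    echo "--disable-builtin-atomics reached the embedded PMIx configure"
else
    echo "--disable-builtin-atomics did NOT reach the embedded PMIx configure"
fi
```

If the flag is missing from that file, the embedded PMIx would still probe for and select the builtin atomics regardless of what was passed to the top-level configure.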

hjelmn commented 3 years ago

@ggouaillardet We shouldn't be failing even without that option. The gcc builtins are inferior on Apple Silicon so they should really be disabled on AArch64 in v4.1.0. For master C11 should be used.

hjelmn commented 3 years ago

I really need to refactor the atomic support. Even when using C11 I still want the LL/SC atomics to be available. The LL/SC lifo/fifo implementations are ~ 2x the speed of the CAS128 implementations (measured on Power 8). C11 and builtins do not provide direct access to them. CAS is an Intel thing.

hjelmn commented 3 years ago

Hmm, the v4.1.x branch builds just fine for me.

$ ../configure --prefix=/tmp/ompi --disable-mpi-fortran --disable-oshmem &> config.out
$ make -j 32 &> make.out
$ echo $?
0
$ git branch
  master
* v4.1.x
$ uname -a
Darwin Mac-mini.local 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 14 14:38:22 PST 2021; root:xnu-7195.81.2~2/RELEASE_ARM64_T8101 arm64

fxcoudert commented 3 years ago

We run configure with ./configure --prefix=/opt/homebrew/Cellar/open-mpi/4.1.0 --disable-dependency-tracking --disable-silent-rules --enable-ipv6 --enable-mca-no-build=op-avx,reachable-netlink --with-libevent=/opt/homebrew/opt/libevent --with-sge --disable-builtin-atomics, with clang as the C compiler and gfortran as the Fortran compiler.

hjelmn commented 3 years ago

@fxcoudert Odd. I will try to build with all those options but fortran. It is a cancer on MPI :) and shouldn't have an impact on building PMIx.

hjelmn commented 3 years ago

What I may do is update just v4.0.x and v4.1.x to never select the builtins for AArch64. master will get an update to not use CAS128.

LL/SC:

Mac-mini:class hjelmn$ ./opal_lifo  -t 1
Single thread test. Time: 0 s 13621 us 13 nsec/poppush
Atomics thread finished. Time: 0 s 14375 us 14 nsec/poppush
Atomics thread finished. Time: 0 s 154525 us 154 nsec/poppush
Atomics thread finished. Time: 0 s 154661 us 154 nsec/poppush
Atomics thread finished. Time: 0 s 156505 us 156 nsec/poppush
Atomics thread finished. Time: 0 s 157013 us 157 nsec/poppush
Atomics thread finished. Time: 0 s 157493 us 157 nsec/poppush
Atomics thread finished. Time: 0 s 158275 us 158 nsec/poppush
Atomics thread finished. Time: 0 s 158647 us 158 nsec/poppush
Atomics thread finished. Time: 0 s 158973 us 158 nsec/poppush
All threads finished. Thread count: 8 Time: 0 s 159023 us 19 nsec/poppush
SUPPORT: OMPI Test Passed: opal_lifo_t: (7 tests)

CAS128:

Mac-mini:class hjelmn$ ./opal_lifo  -t 1
Single thread test. Time: 0 s 25688 us 25 nsec/poppush
Atomics thread finished. Time: 0 s 29322 us 29 nsec/poppush
Atomics thread finished. Time: 4 s 57595 us 4057 nsec/poppush
Atomics thread finished. Time: 4 s 151568 us 4151 nsec/poppush
Atomics thread finished. Time: 4 s 162332 us 4162 nsec/poppush
Atomics thread finished. Time: 4 s 173651 us 4173 nsec/poppush
Atomics thread finished. Time: 4 s 176088 us 4176 nsec/poppush
Atomics thread finished. Time: 4 s 178025 us 4178 nsec/poppush
Atomics thread finished. Time: 4 s 178713 us 4178 nsec/poppush
Atomics thread finished. Time: 4 s 178760 us 4178 nsec/poppush
All threads finished. Thread count: 8 Time: 4 s 178830 us 522 nsec/poppush
SUPPORT: OMPI Test Passed: opal_lifo_t: (7 tests)

Not even a contest.

hjelmn commented 3 years ago

Similarly bad with opal_fifo:

LL/SC

Mac-mini:class hjelmn$ ./opal_fifo 
Single thread test. Time: 0 s 7620 us 7 nsec/poppush
Atomics thread finished. Time: 0 s 7918 us 7 nsec/poppush
Atomics thread finished. Time: 0 s 76081 us 76 nsec/poppush
Atomics thread finished. Time: 0 s 79458 us 79 nsec/poppush
Atomics thread finished. Time: 0 s 84994 us 84 nsec/poppush
Atomics thread finished. Time: 0 s 90103 us 90 nsec/poppush
Atomics thread finished. Time: 0 s 90403 us 90 nsec/poppush
Atomics thread finished. Time: 0 s 91280 us 91 nsec/poppush
Atomics thread finished. Time: 0 s 92466 us 92 nsec/poppush
Atomics thread finished. Time: 0 s 93835 us 93 nsec/poppush
All threads finished. Thread count: 8 Time: 0 s 93916 us 11 nsec/poppush
Exhaustive atomics thread finished. Popped 821530 items. Time: 0 s 107912 us 131 nsec/poppush
Exhaustive atomics thread finished. Popped 810445 items. Time: 0 s 114695 us 141 nsec/poppush
Exhaustive atomics thread finished. Popped 806449 items. Time: 0 s 116241 us 144 nsec/poppush
Exhaustive atomics thread finished. Popped 813960 items. Time: 0 s 117182 us 143 nsec/poppush
Exhaustive atomics thread finished. Popped 825230 items. Time: 0 s 118810 us 143 nsec/poppush
Exhaustive atomics thread finished. Popped 826685 items. Time: 0 s 119486 us 144 nsec/poppush
Exhaustive atomics thread finished. Popped 828373 items. Time: 0 s 120327 us 145 nsec/poppush
Exhaustive atomics thread finished. Popped 830266 items. Time: 0 s 121114 us 145 nsec/poppush
All threads finished. Thread count: 8 Time: 0 s 121186 us 15 nsec/poppush
SUPPORT: OMPI Test Passed: opal_fifo_t: (8 tests)

CAS128:

Mac-mini:class hjelmn$ ./opal_fifo 
Single thread test. Time: 0 s 7611 us 7 nsec/poppush
Atomics thread finished. Time: 0 s 19256 us 19 nsec/poppush
Atomics thread finished. Time: 2 s 555095 us 2555 nsec/poppush
Atomics thread finished. Time: 2 s 562521 us 2562 nsec/poppush
Atomics thread finished. Time: 2 s 570284 us 2570 nsec/poppush
Atomics thread finished. Time: 2 s 570760 us 2570 nsec/poppush
Atomics thread finished. Time: 2 s 571438 us 2571 nsec/poppush
Atomics thread finished. Time: 2 s 573642 us 2573 nsec/poppush
Atomics thread finished. Time: 2 s 575019 us 2575 nsec/poppush
Atomics thread finished. Time: 2 s 575161 us 2575 nsec/poppush
All threads finished. Thread count: 8 Time: 2 s 575231 us 321 nsec/poppush
Exhaustive atomics thread finished. Popped 639525 items. Time: 1 s 828167 us 2858 nsec/poppush
Exhaustive atomics thread finished. Popped 642578 items. Time: 1 s 840312 us 2863 nsec/poppush
Exhaustive atomics thread finished. Popped 641617 items. Time: 1 s 846852 us 2878 nsec/poppush
Exhaustive atomics thread finished. Popped 639283 items. Time: 1 s 849705 us 2893 nsec/poppush
Exhaustive atomics thread finished. Popped 646423 items. Time: 1 s 851183 us 2863 nsec/poppush
Exhaustive atomics thread finished. Popped 645146 items. Time: 1 s 851750 us 2870 nsec/poppush
Exhaustive atomics thread finished. Popped 645428 items. Time: 1 s 852076 us 2869 nsec/poppush
Exhaustive atomics thread finished. Popped 648267 items. Time: 1 s 852240 us 2857 nsec/poppush
All threads finished. Thread count: 8 Time: 1 s 852359 us 231 nsec/poppush
SUPPORT: OMPI Test Passed: opal_fifo_t: (8 tests)

fxcoudert commented 3 years ago

I've uploaded our full build log at https://gist.github.com/fxcoudert/0710566fc631546b7a5ad496dabcb747 so you can check what is happening.

One weird thing is that configure reports "checking for builtin atomics... BUILTIN_GCC" even though we're using clang as the C compiler.

hjelmn commented 3 years ago

@fxcoudert That is because clang implements the gcc builtin atomics (__atomic_*). They are now used in Open MPI over the older Intel __sync_* atomics. I think we defaulted to the builtins for v4.x. This appears to have been a mistake for AArch64 as the performance is definitely worse.

ggouaillardet commented 3 years ago

@hjelmn

from the logs posted by @fxcoudert I noted:

checking for assembly architecture... UNSUPPORTED

I quickly checked config/opal_config_asm.m4, and indeed, we do not support M1:

checking host system type... arm-apple-darwin20.2.0

right after,

checking for builtin atomics... BUILTIN_SYNC

so we could have two issues in Open MPI:

fxcoudert commented 3 years ago

The triplet for that arch should not be arm-apple-darwin20.2.0 but aarch64-apple-darwin20.2.0 (https://github.com/gcc-mirror/gcc/blob/5a36cae275ad84cc7e623f2f5829bdad767e3f6a/config.guess#L1345)

Therefore config.{guess,sub} need to be updated: https://www.gnu.org/software/gettext/manual/html_node/config_002eguess.html
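
As a rough sketch of how one might spot the stale triplet on an Apple Silicon Mac, the snippet below runs config.guess and classifies its output. The helper `classify_triplet`, the `GUESS_SCRIPT` variable, and the `./config/config.guess` default path are assumptions for illustration. The "stale" label is only meaningful when the machine is known to be an M1, since `arm-apple-darwin*` is not wrong in general.

```shell
# Hypothetical helper: classify a host triplet as reported on an Apple Silicon Mac.
classify_triplet() {
    case "$1" in
        aarch64-apple-darwin*) echo "ok" ;;     # what a current config.guess reports
        arm-apple-darwin*)     echo "stale" ;;  # what the older config.guess reports
        *)                     echo "other" ;;  # not an Apple ARM triplet at all
    esac
}

# GUESS_SCRIPT is illustrative; point it at the config.guess in your source tree.
guess_script=${GUESS_SCRIPT:-./config/config.guess}
if [ -x "$guess_script" ]; then
    triplet=$("$guess_script")
    echo "$triplet: $(classify_triplet "$triplet")"
fi
```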

ggouaillardet commented 3 years ago

@fxcoudert thanks for the pointer!

@jsquyres any advice on how we should handle that?

my best bet is we should patch config.{guess, sub} the same way we patch configure to correctly handle third party dependencies.

jsquyres commented 3 years ago

Just to be clear -- are we saying that the upstream config.sub / config.guess files include the now-correct notation aarch64-apple-darwin20.2.0?

If so, we should probably stash copies of them in our git repo and just cp them to the appropriate places during autogen.pl. We used to do something like this (we would wget the most recent config.* files during autogen, but that's not really good for repeatability -- stashing known-good versions in git is probably a better scheme).

That being said, we should probably only conditionally cp / replace the config.* files that autoconf and friends install: i.e., do a version check of what we have in git vs. what is installed by autoconf and friends, and use whichever one is newer.

bwbarrett commented 3 years ago

I don't think we should include config.guess and config.sub in git. One day, autoconf's will be newer and then we'll have a real problem. There is a timestamp in the files, so we may be able to cover that. But it still seems a little awkward.

We used to grab the latest config.guess/sub as part of building a tarball, although it looks like we no longer do. That seems much better than trying to cover this for all use cases.

bwbarrett commented 3 years ago

I did verify that the config.guess we ship with OMPI tarballs (which is the one included with Autoconf 2.69) returns arm-apple-darwin20.2.0. The latest config.guess in Savannah returns aarch64-apple-darwin20.2.0. So it looks like we do need to pull config.guess/config.sub, at least when building tarballs.

jsquyres commented 3 years ago

As noted on Jan 26: we will also need to apply this (at least for the config.* files) in PMIx and PRRTE.

jsquyres commented 3 years ago

@bwbarrett and I talked offline. I'll go make a PR to do what was described above: stash known good copies of config.* in Open MPI's git repo, and during autogen, do the version compare, and if the stashed versions are newer, copy those in over what autoconf installed.
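
The compare-and-copy step could look roughly like the sketch below. It is not the autogen.pl change itself, only a shell illustration under two assumptions: that each config.* file carries a `timestamp='YYYY-MM-DD'` line (the timestamp mentioned earlier in the thread), and that comparing those dates is an acceptable stand-in for a version compare. All three function names are hypothetical.

```shell
# Print the YYYY-MM-DD value of the first timestamp='...' line in a config.* file.
config_timestamp() {
    sed -n "s/^timestamp='\([0-9-]*\)'.*/\1/p" "$1" | head -n 1
}

# Succeed if the cached file ($1) has a strictly newer timestamp than the installed one ($2).
cached_is_newer() {
    cached=$(config_timestamp "$1" | sed 's/-//g')
    installed=$(config_timestamp "$2" | sed 's/-//g')
    # A file without a timestamp counts as 0, i.e. older than any dated file.
    [ "${cached:-0}" -gt "${installed:-0}" ]
}

# Copy the cached file over the installed one only when the cached copy is newer.
refresh_config_file() {
    if cached_is_newer "$1" "$2"; then
        cp "$1" "$2"
    fi
}
```

Calling `refresh_config_file cached/config.guess config/config.guess` (both paths illustrative) would leave a newer autoconf-installed file untouched, which is the case bwbarrett was worried about.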

jsquyres commented 3 years ago

See #8417 for autogen.pl updates to use known-good config.guess and config.sub.

fxcoudert commented 3 years ago

GNU's own documentation recommends using config.{guess,sub} from their own repo, rather than rely on autoconf versions. https://www.gnu.org/software/gettext/manual/html_node/config_002eguess.html

jsquyres commented 3 years ago

@fxcoudert PR #8417 includes cached copies of config.guess and config.sub from Savannah from today.

We don't want to just arbitrarily grab those files from Savannah when building a tarball for a few reasons:

  1. We lose reproducibility, making debugging potentially more difficult.
  2. At any given time, those files might be "bad" on Savannah for some reason. Having a known-good set of files eliminates the variability of what we might be grabbing from upstream.
  3. Open MPI tarballs are sometimes built in restricted networks that cannot reach the internet.

Hence, it seems safer to just cache known-good versions of these files in the Open MPI repo, and document them as such. If we ever need to update these files, no problem -- we can re-pull from Savannah.

Does that address your concern?

jsquyres commented 3 years ago

@fxcoudert This issue auto-closed, sorry about that. The v4.1.x version of the fix is in #8421.

I don't know if you want to just pull the patch and apply that; we can (and probably will) roll an RC soon, but I think you said that you don't generally test upstream betas. FWIW: we have just one more AVX blocker issue before v4.1.1 (it compiles and runs properly now, but at least in some cases there's a performance degradation that we're working to understand).