open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

v5.0.0: Default configure on MacOS with OneAPI compilers fails #12051

Open jsquyres opened 11 months ago

jsquyres commented 11 months ago

With the v5.0.0 tarball, a default configure fails when building with the Intel OneAPI compilers on macOS Sonoma (14.x) with Xcode 15.x: PMIx's configure reports that it can't find a suitable libevent. Specifically, this is what fails:

./configure CC=icc FC=ifort CXX=icpc

Here are the corresponding files:

However, explicitly requesting the "internal" packages via this configure command succeeds:

./configure CC=icc FC=ifort CXX=icpc --with-libevent=internal --with-hwloc=internal --with-pmix=internal --with-prrte=internal

Here are the corresponding files:

I haven't dug deeply into this, but it seems like PMIx is not realizing that it should "shortcut" and use the Open MPI-provided libevent. I'm not sure if this is Open MPI's fault or PMIx's fault, so I filed the issue here to investigate the Open MPI side first.

Initially reported by Volker Blum: https://www.mail-archive.com/users@lists.open-mpi.org/msg35242.html.

jsquyres commented 10 months ago

This seems to be the crux of the problem:

Here's a trivial program that shows the issue -- it includes <event.h> and doesn't call any Libevent APIs, but it links with -levent_core just to make the linker search for the library:

$ cat foo.c
#include <event.h>
#include <stdio.h>

int main()
{
    printf("Hello world\n");
    return 0;
}

# Success
$ gcc -o foo.exe foo.c -levent_core

# Success
$ clang -o foo.exe foo.c -levent_core

# Fail without -L/usr/local/lib
$ icc -o foo.exe foo.c -levent_core
icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. Use '-diag-disable=10441' to disable this message.
-macosx_version_min has been renamed to -macos_version_min
ld: warning: -keep_dwarf_unwind is obsolete
ld: warning: ignoring duplicate libraries: '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libsvml.a'
ld: library 'event_core' not found

# Success with -L/usr/local/lib
$ icc -o foo.exe foo.c -L/usr/local/lib -levent_core
icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. Use '-diag-disable=10441' to disable this message.
-macosx_version_min has been renamed to -macos_version_min
ld: warning: -keep_dwarf_unwind is obsolete
ld: warning: ignoring duplicate libraries: '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libsvml.a'
ld: warning: no platform load command found in '/private/var/folders/8c/wytfk7955cq9kdh1ppsrrq4w0000gp/T/iccwq1DsY.o', assuming: macOS
ld: warning: no platform load command found in '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libirc.a[4](irc_msg_support.o)', assuming: macOS
ld: warning: no platform load command found in '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libirc.a[80](cpu_feature_disp.o)', assuming: macOS
ld: warning: no platform load command found in '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libirc.a[100](proc_init_utils.o)', assuming: macOS
ld: warning: no platform load command found in '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libirc.a[103](new_proc_init.o)', assuming: macOS
ld: warning: no platform load command found in '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libirc.a[117](sse2_strlen.o)', assuming: macOS
ld: warning: no platform load command found in '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libirc.a[118](sse2_strchr.o)', assuming: macOS
ld: warning: no platform load command found in '/opt/intel/oneapi/compiler/2023.2.0/mac/bin/intel64/../../compiler/lib/libirc.a[127](ssse3_strncpy.o)', assuming: macOS

This feels like a bug in the Intel OneAPI compilers to me -- it's definitely weird that it doesn't need -I/usr/local/include to find <event.h>, but does need -L/usr/local/lib to find libevent_core.*.

Someone should report this issue upstream to the Intel OneAPI maintainers.

Regardless, there's a workaround for Open MPI. I don't know for sure, but I'm guessing that the original reporter was in a situation similar to mine: Libevent (and potentially hwloc) is installed via Homebrew or MacPorts, Open MPI's configure finds it, and configure therefore decides not to build its internal copies. In this situation, you can:

./configure CC=icc CXX=icpc FC=ifort LDFLAGS=-L/usr/local/lib ...

Given that Open MPI's configure script finds the external Libevent (and potentially hwloc), this tells the Intel compiler/linker where to find libevent_core.* (and libhwloc.*).
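
If it isn't obvious which directory to pass in LDFLAGS, here is a minimal sketch of one way to locate it. The find_lib_dir helper is hypothetical (it is not part of Open MPI, Homebrew, or MacPorts), and the candidate directories are assumptions: /usr/local/lib is where Libevent lived in my case above, and /opt/local/lib is the default MacPorts prefix. The script only prints a suggested configure line; it does not run configure.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the first directory
# from the argument list that contains a file named lib<name>.*
# Usage: find_lib_dir <name> <dir> [<dir> ...]
find_lib_dir() {
    name=$1
    shift
    for dir in "$@"; do
        for f in "$dir"/lib"$name".*; do
            # If the glob matched nothing, $f is the literal pattern
            # and this existence test fails.
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                return 0
            fi
        done
    done
    return 1
}

# Candidate directories are assumptions; adjust them for your system.
if libdir=$(find_lib_dir event_core /usr/local/lib /opt/local/lib); then
    echo "./configure CC=icc CXX=icpc FC=ifort LDFLAGS=-L$libdir ..."
else
    echo "libevent_core not found in the candidate directories" >&2
fi
```

The same helper can be called with hwloc as the name to check where libhwloc.* lives, in case it sits under a different prefix than Libevent.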

If your situation is different, I'd like to hear the details.

One user noted on the mailing list that they typically build like this with the Intel OneAPI compilers on macOS:

./configure CC=icc CXX=icpc FC=ifort --with-hwloc=internal --with-libevent=internal --with-pmix=internal --with-prrte=internal ...

Using the internal Libevent and hwloc is another way to address this issue, presuming:

If you meet the above conditions, you could probably shorten the above workaround to:

./configure CC=icc CXX=icpc FC=ifort --with-hwloc=internal --with-libevent=internal ...

jsquyres commented 10 months ago

@volkerblum Could you confirm that the workarounds I listed above work for you?

Also, do you have a support contract with Intel to ask them about the weird -- and feels-like-a-bug -- behavior of not needing -I/usr/local/include but requiring -L/usr/local/lib?

volkerblum commented 10 months ago

Have not yet been able to check on my own setup but all this sounds very plausible. I do not have a support contract from Intel (and Intel indicated that they are not, or not yet, supporting MacOS 14 with OneAPI anyway). I tried to go back and follow up on my original, and declined, request for help from Intel, pointing them to this thread, but a problem in Intel's interface prevented me from submitting that follow-up after I had written it. Hm.

I will need to check the -I vs -L substitution on my own system. This could take a while ...

Thank you so much!! for sorting this out, though. Again, I think this sounds very plausible as a more general solution pathway.