open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

MPI apps segfault on Mac OSX #12265

Open rhc54 opened 10 months ago

rhc54 commented 10 months ago

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

HEAD of main branch

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Git clone

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

1e2325e3ea12739b6ee46e6afca36e2af05214d8 3rd-party/openpmix (v1.1.3-3977-g1e2325e3)
4f27008906d96845e22df6502d6a9a29d98dec83 3rd-party/prrte (psrvr-v2.0.0rc1-4746-g4f27008906)
c1cfc910d92af43f8c27807a9a84c9c13f4fbc65 config/oac (heads/main)

Please describe the system on which you are running


Details of the problem

When attempting to run a simple MPI app, it segfaults with a missing symbol:

$ mpirun -n 2 ./hello_c
dyld[28739]: symbol not found in flat namespace '_mca_common_monitoring_output_stream_id'
dyld[28738]: symbol not found in flat namespace '_mca_common_monitoring_output_stream_id'
--------------------------------------------------------------------------
prterun noticed that process rank 1 with PID 28739 on node Ralphs-MacBook-Air-8 exited on
signal 9 (Killed: 9).
--------------------------------------------------------------------------

I checked and found that the offending symbol lacked an OMPI_DECLSPEC, but that didn't help. I then found that libmca_common_monitoring is not referenced anywhere in a Makefile.am, and thus is never linked into the library. Perhaps someone can figure out where it is supposed to go???
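The Makefile.am search described above could be reproduced with something like the following. This is only a sketch: `find_makefile_refs` is a hypothetical helper name, not part of the Open MPI tree, and it assumes it is run from the top of the git clone.

```shell
# Hypothetical helper (not part of Open MPI): list every Makefile.am under
# a source tree that mentions a given name.
find_makefile_refs() {
    root=$1
    name=$2
    # grep -l prints only the names of matching files; -F treats the name
    # as a fixed string rather than a regular expression.
    find "$root" -type f -name Makefile.am -exec grep -lF -- "$name" {} +
}

# Example, run from the top of the clone:
#   find_makefile_refs . libmca_common_monitoring
```

An empty result would match the observation that no Makefile.am mentions the library. A non-empty result shows which directories reference it.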

bosilca commented 10 months ago

  1. It does not need to be OMPI_DECLSPEC, we don't expose our internal output streams.
  2. libmca_common_monitoring.la is the common support for monitoring in other frameworks. It is included in the PML/monitoring, OSC/monitoring and coll/monitoring.

rhc54 commented 10 months ago

Hmmm...well, I do a vanilla configure - just a prefix, nothing else. Yet somehow that symbol is missing from libmpi. 🤷‍♂️
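One way to tell whether the symbol is absent from a library or merely referenced but undefined in it is to classify the output of `nm -g`. The sketch below assumes the usual `nm` text format (address, type letter, name). `classify_symbol` is a hypothetical helper, and the library path in the usage comment is an assumption about the install layout.

```shell
# Hypothetical helper: read `nm -g` output on stdin and report whether the
# given symbol is "defined", "undefined" (type U), or "absent".
classify_symbol() {
    awk -v sym="$1" '
        # The symbol name is the last field; the type letter precedes it.
        $NF == sym { found = 1; type = $(NF - 1) }
        END {
            if (!found)           print "absent"
            else if (type == "U") print "undefined"
            else                  print "defined"
        }'
}

# Example (the path is an assumption; adjust it to the configured prefix):
#   nm -g "$PREFIX/lib/libmpi.dylib" | classify_symbol _mca_common_monitoring_output_stream_id
```

A result of "undefined" would mean the library references the symbol and expects another image to provide it at load time. A result of "absent" would mean the reference comes from somewhere else, for example a separately built component.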