open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Main branch compilation broken with older hwloc & segfault with internal hwloc #11637

Closed wenduwan closed 11 months ago

wenduwan commented 1 year ago

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

https://github.com/open-mpi/ompi/commit/42e577f1d7e39207359146594b37264a5a7a5709

We confirmed that the issue is related to this change.

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

On main branch

./autogen.pl
./configure --prefix=/home/ec2-user/ompi/install --with-sge --without-verbs --with-libfabric=/opt/amazon/efa --disable-man-pages --with-libevent=external --with-hwloc=external --enable-cuda --with-cuda=/usr/local/cuda --with-cuda-libdir=/usr/local/cuda/lib64/stubs --disable-builtin-atomics --enable-debug
make -j install

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

$ git submodule status
 10fe4735ee374f5807c2160e61274c4aa53491ae 3rd-party/openpmix (v1.1.3-3847-g10fe4735)
 d8bd12b3ffda4af6918d641f024a6b0118789700 3rd-party/prrte (psrvr-v2.0.0rc1-4624-gd8bd12b3ff)
 c1cfc910d92af43f8c27807a9a84c9c13f4fbc65 config/oac (heads/main)

Please describe the system on which you are running

External hwloc

$ yum list installed | grep hwloc
hwloc.x86_64                        1.11.8-4.amzn2                   @amzn2-core
hwloc-devel.x86_64                  1.11.8-4.amzn2                   @amzn2-core
hwloc-gui.x86_64                    1.11.8-4.amzn2                   @amzn2-core
hwloc-libs.x86_64                   1.11.8-4.amzn2                   @amzn2-core
hwloc-plugins.x86_64                1.11.8-4.amzn2                   @amzn2-core

Details of the problem

Problem 1: Compilation error with external hwloc

  CC       common_ofi.lo
  LN_S     libopen-palmca_common_ofi.la
common_ofi.c: In function 'is_near':
common_ofi.c:619:25: error: 'struct hwloc_obj' has no member named 'io_first_child'; did you mean 'first_child'?
     for(osdev = pcidev->io_first_child; osdev != NULL; osdev = osdev->next_sibling) {
                         ^~~~~~~~~~~~~~
                         first_child
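
For context, and as a hedged sketch rather than the fix that was eventually merged: io_first_child exists only in hwloc 2.x, where I/O objects sit on their own child list, while in hwloc 1.x they hang off the ordinary first_child list. A loop that has to build against both can pick the list head at compile time. The names below (demo_obj, DEMO_HWLOC_API_VERSION, demo_first_io_child, demo_count_osdevs) are hypothetical stand-ins so the snippet compiles without hwloc installed; real code would test HWLOC_API_VERSION from <hwloc.h> and compare osdev->type against HWLOC_OBJ_OS_DEVICE.

```c
#include <stddef.h>

/* Stand-in for HWLOC_API_VERSION; hwloc 2.0 reports 0x00020000.
 * Here we pretend to build against hwloc 1.11 (assumption for the demo). */
#ifndef DEMO_HWLOC_API_VERSION
#define DEMO_HWLOC_API_VERSION 0x00010b00
#endif

/* Hypothetical, trimmed-down stand-in for struct hwloc_obj. */
struct demo_obj {
    int is_osdev;                    /* stands in for type == HWLOC_OBJ_OS_DEVICE */
    struct demo_obj *first_child;    /* normal children (includes I/O in 1.x) */
#if DEMO_HWLOC_API_VERSION >= 0x00020000
    struct demo_obj *io_first_child; /* separate I/O child list, 2.x only */
#endif
    struct demo_obj *next_sibling;
};

/* Return the head of the list that holds a PCI device's I/O children. */
static struct demo_obj *demo_first_io_child(const struct demo_obj *pcidev)
{
#if DEMO_HWLOC_API_VERSION >= 0x00020000
    return pcidev->io_first_child;
#else
    return pcidev->first_child;
#endif
}

/* Same shape as the failing loop in is_near(), minus the hwloc dependency. */
static int demo_count_osdevs(const struct demo_obj *pcidev)
{
    int count = 0;
    for (const struct demo_obj *osdev = demo_first_io_child(pcidev);
         osdev != NULL; osdev = osdev->next_sibling) {
        if (osdev->is_osdev) {
            count++;
        }
    }
    return count;
}
```

Because the 1.x child list can also contain non-OS-device objects, the loop body filters on the object type instead of assuming every sibling is an OS device.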

Problem 2: Segfault with internal libevent & hwloc with OSU microbenchmark

In this case we configured with ./configure ... --with-libevent=internal --with-hwloc=internal ....

Then we ran OMB (the OSU micro-benchmarks):

/home/ec2-user/ompi/install/bin/mpirun --hostfile hostfile --map-by ppr:2:node --bind-to none -x PATH=/home/ec2-user/ompi/install/bin:$PATH /home/ec2-user/omb/install/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr

Note: OMB was built against ompi, and the hostfile has 2 p4d.24xlarge instances.

The segfault happens here (paths redacted for conciseness):

#0  0x00007ff9c81e0707 in __strncasecmp_l_avx () from /lib64/libc.so.6
#1  0x00007ff9c712e57f in pmix_hwloc_destruct_topology (src=0x7ff9c7a40a7a) at hwloc/pmix_hwloc_datatype.c:504
#2  0x00007ff9c728ff09 in pmix_bfrops_base_tma_topology_destruct (t=0x7ff9c7a40a7a, tma=0x0) at .../ompi/3rd-party/openpmix/src/mca/bfrops/base/bfrop_base_tma.h:901
#3  0x00007ff9c7290081 in pmix_bfrops_base_tma_topology_free (t=0x7ff9c7a40a7a, n=1, tma=0x0) at .../ompi/3rd-party/openpmix/src/mca/bfrops/base/bfrop_base_tma.h:950
#4  0x00007ff9c72995b1 in PMIx_Topology_free (t=0x7ff9c7a40a7a, n=1) at base/bfrop_base_macro_backers.c:327
#5  0x00007ff9c79d85f5 in compute_dev_distances (distances=0x7ffea0685be0, ndist=0x7ffea0685e20) at common_ofi.c:487
#6  0x00007ff9c79d875d in get_nearest_nics (num_distances=0x7ffea0685e7c, valin=0x7ffea0685e88) at common_ofi.c:535
#7  0x00007ff9c79d94f1 in opal_common_ofi_select_provider (provider_list=0x1e97d40, process_info=0x7ff9c7c78bc0 <opal_process_info>) at common_ofi.c:804
#8  0x00007ff9c8b19751 in select_ofi_provider (providers=0x1e97d40, include_list=0x0, exclude_list=0x1d3c310) at mtl_ofi_component.c:357
#9  0x00007ff9c8b1ab20 in ompi_mtl_ofi_component_init (enable_progress_threads=false, enable_mpi_threads=false, accelerator_support=0x7ff9c8fd9770 <mca_mtl_ofi_component+272>) at mtl_ofi_component.c:780
#10 0x00007ff9c8b0fb70 in ompi_mtl_base_select (enable_progress_threads=false, enable_mpi_threads=false, priority=0x7ffea06860fc) at base/mtl_base_frame.c:78
#11 0x00007ff9c8c77f42 in mca_pml_cm_component_init (priority=0x7ffea06860fc, enable_progress_threads=false, enable_mpi_threads=false) at pml_cm_component.c:146
#12 0x00007ff9c8c4a67f in mca_pml_base_select (enable_progress_threads=false, enable_mpi_threads=false) at base/pml_base_select.c:127
#13 0x00007ff9c892b8f1 in ompi_mpi_instance_init_common (argc=1, argv=0x7ffea0686f08) at instance/instance.c:508
#14 0x00007ff9c892c2da in ompi_mpi_instance_init (ts_level=0, info=0x62e9e0 <ompi_mpi_info_null>, errhandler=0x7ff9c8fea820 <ompi_mpi_errors_are_fatal>, instance=0x7ff9c8ffb900 <ompi_mpi_instance_default>, argc=1, argv=0x7ffea0686f08) at instance/instance.c:814
#15 0x00007ff9c891ba74 in ompi_mpi_init (argc=1, argv=0x7ffea0686f08, requested=0, provided=0x7ffea0686d7c, reinit_ok=false) at runtime/ompi_mpi_init.c:359
#16 0x00007ff9c89819cf in PMPI_Init (argc=0x7ffea0686dbc, argv=0x7ffea0686db0) at init.c:67
#17 0x000000000040269e in main (argc=<optimized out>, argv=<optimized out>) at osu_mbw_mr.c:49

wenduwan commented 1 year ago

@amirshehataornl Do you have any insight into this?

rhc54 commented 1 year ago

Hmmm... well, with --bind-to none you cannot get device distances, because you are not bound to anything. This is why mpirun didn't provide them. That said, the code in OMPI has an error in it: you cannot free the topology returned by PMIx_Load_topology, because you are just being given a pointer to the data stored in PMIx.

I can try to provide an OMPI patch for that problem.
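
The ownership rule described above can be illustrated with a small self-contained sketch. Everything in it (demo_topology_t, demo_load_topology, demo_release_borrowed, and the library-side variables) is a hypothetical stand-in and not the PMIx API. It only models the idea that a "load" call hands back pointers into storage the library keeps, so the caller drops its references instead of freeing them.

```c
#include <stddef.h>

/* Hypothetical stand-in for a topology handle: a source tag plus an
 * opaque pointer to the topology tree. */
typedef struct {
    char *source;
    void *topology;
} demo_topology_t;

/* Storage owned by the (pretend) library for the life of the process. */
static char demo_lib_source[] = "hwloc";
static int demo_lib_tree = 42;

/* Fill the caller's handle with BORROWED pointers into library storage.
 * Nothing is allocated on behalf of the caller. */
static int demo_load_topology(demo_topology_t *topo)
{
    topo->source = demo_lib_source;
    topo->topology = &demo_lib_tree;
    return 0;
}

/* Correct caller-side cleanup for borrowed data: forget the pointers.
 * Calling free() on them here would corrupt the library's state. */
static void demo_release_borrowed(demo_topology_t *topo)
{
    topo->source = NULL;
    topo->topology = NULL;
}
```

After demo_release_borrowed the library-side data is untouched, so a later demo_load_topology call can hand out the same pointers again.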

wenduwan commented 1 year ago

@rhc54 Thank you!

Meanwhile, the compile error with external hwloc should be addressed separately. Is hwloc 1.11.8 too old to be useful? Or I guess the question is: what is the oldest "supported" hwloc version?

rhc54 commented 1 year ago

what is the oldest "supported" hwloc version?

I wouldn't know - I believe you folks support back that far, but someone over there would have to answer that question.

amirshehataornl commented 1 year ago

Sorry for the late response. I'm currently on the move.

@rhc54

Is this what you were thinking:

diff --git a/opal/mca/common/ofi/common_ofi.c b/opal/mca/common/ofi/common_ofi.c
index e882c3c833..6e03ac1be5 100644
--- a/opal/mca/common/ofi/common_ofi.c
+++ b/opal/mca/common/ofi/common_ofi.c
@@ -484,7 +484,6 @@ static int compute_dev_distances(pmix_device_distance_t **distances,
     }

     /* load the PMIX topology */
-    PMIx_Topology_free(pmix_topo, 1);
     ret = PMIx_Load_topology(pmix_topo);
     if (PMIX_SUCCESS != ret) {
         goto out;
@@ -497,7 +496,6 @@ static int compute_dev_distances(pmix_device_distance_t **distances,
                                  ndist);
     PMIx_Info_free(info, ninfo);

-    PMIx_Topology_free(pmix_topo, 1);
 out:
     return ret;
 }

rhc54 commented 1 year ago

See https://github.com/open-mpi/ompi/pull/11641 for the full fix.

lrbison commented 1 year ago

Thank you for the fix, Ralph!

@amirshehataornl Based on VERSION we still support hwloc>=1.11.0, but I've confirmed I get the compile error @wenduwan showed above (about io_first_child) when I try to compile using hwloc 1.11.0.

Do you know of an alternate way to do that loop?

wenduwan commented 11 months ago

Issue fixed. Closing.