open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Slow MPI_Group_difference #12286

Open k202077 opened 9 months ago

k202077 commented 9 months ago

In a setup (using Open MPI 4.1.3) with >14,000 processes, we noticed an unusually long initialization time. While investigating this, we found that ~60 consecutive calls to MPI_Group_difference, each involving a group that contained all processes of the run, took several minutes. I suspect that the implementation of ompi_group_dense_overlap (used by MPI_Group_difference) is suboptimal for such cases, because it appears to use an algorithm with a time complexity of O(n²).

We were able to replicate similar functionality using a collective MPI_Allreduce, which was many times faster, even though MPI_Group_difference is a local operation.

A more sophisticated algorithm (for example, one that uses sorted lists of the processes of each group) should be able to improve the performance significantly.
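To illustrate the sorted-list idea, here is a minimal sketch in plain C. It is not Open MPI code: it assumes each group member can be reduced to a unique integer id, and the helper name group_difference_sorted is hypothetical. It sorts a copy of the second group once and then binary-searches it for every member of the first group, so the cost is roughly O((n1 + n2) log n2) instead of O(n1 · n2), while the result keeps the order of the first group, as MPI_Group_difference requires.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort/bsearch on integer process ids. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/*
 * Hypothetical helper (an assumption for illustration, not an Open MPI API):
 * writes to `out` every id of `g1` that does not occur in `g2`, preserving
 * the order of `g1`. `out` must have room for n1 entries.
 * Returns the number of ids written, or -1 if the temporary copy of `g2`
 * cannot be allocated.
 */
static int group_difference_sorted(const int *g1, int n1,
                                   const int *g2, int n2, int *out)
{
    int *sorted = NULL;
    int count = 0;

    assert(n1 >= 0 && n2 >= 0);

    if (n2 > 0) {
        /* Sort a private copy so the caller's group order is untouched. */
        sorted = malloc((size_t)n2 * sizeof *sorted);
        if (sorted == NULL) {
            return -1;
        }
        memcpy(sorted, g2, (size_t)n2 * sizeof *sorted);
        qsort(sorted, (size_t)n2, sizeof *sorted, cmp_int);
    }

    for (int i = 0; i < n1; i++) {
        /* One O(log n2) lookup per member instead of a linear scan. */
        if (n2 == 0 ||
            bsearch(&g1[i], sorted, (size_t)n2, sizeof *sorted, cmp_int) == NULL) {
            out[count++] = g1[i];
        }
    }

    free(sorted);
    return count;
}
```

For example, with g1 = {7, 3, 9, 1, 5} and g2 = {5, 7, 2}, the helper writes 3, 9, 1 to out and returns 3. A hash set would bring the lookup cost down further, but sorting needs no extra data structure and already removes the quadratic behaviour.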

jsquyres commented 9 months ago

This is a request for a performance improvement of MPI_Group_difference(). It is unlikely that we'll take such an improvement back on the v4.1.x series -- that series is (slowly) being retired in favor of the v5.0.x series. I.e., we're still actively taking bug fixes, but not necessarily new features / overhauls of existing algorithms.