open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

fortran binding (mpif.h)'s MPI_GROUP_EMPTY invalid in main branch #11806

Closed wzamazon closed 1 year ago

wzamazon commented 1 year ago


Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

main branch

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Built from source by MTT.

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

22fe51cb7a961b6060fc5c48e659237cbe162566 ../3rd-party/openpmix (v1.1.3-3872-g22fe51cb)
ece4f3c45a07a069e5b8f9c5e641613dfcaeffc3 ../3rd-party/prrte (psrvr-v2.0.0rc1-4638-gece4f3c45a)
c1cfc910d92af43f8c27807a9a84c9c13f4fbc65 ../config/oac (heads/main)

Please describe the system on which you are running


Details of the problem

As can be seen in the results of MTT's Intel test suite, all Fortran tests that use MPI_GROUP_EMPTY are failing. For example, the MPI_Group_union1_f test, which is run by the following command:

mpirun -np 144 -N 36 -hostfile /home/ec2-user/PortaFiducia/hostfile /home/ec2-user/PortaFiducia/workloads/mtt/job/rizuka/scratch/TestGet_Intel/ompi-tests/intel_tests/src/MPI_Group_union1_f
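
The output of this 144-process run is long and interleaved across ranks, so a small script can help condense it. The following is only a sketch and not part of the test suite: the helper name `summarize` is hypothetical, and it assumes that the number in parentheses after MPITEST_ERROR is the reporting rank and that every error line has the same shape as the ones in the log below.

```python
import re

# Matches lines such as:
#  " MPITEST_ERROR(       119): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2"
# Assumption: the first number is the reporting rank, the second the MPI return code.
ERROR_LINE = re.compile(
    r"MPITEST_ERROR\(\s*(\d+)\):\s*Non-Zero return code\s*\(\s*(\d+)\)\s*From:\s*(\S+)"
)


def summarize(log_text):
    """Map (call name, return code) to the sorted list of ranks that reported it."""
    ranks_by_failure = {}
    for match in ERROR_LINE.finditer(log_text):
        rank = int(match.group(1))
        code = int(match.group(2))
        call = match.group(3)
        ranks_by_failure.setdefault((call, code), set()).add(rank)
    return {key: sorted(ranks) for key, ranks in ranks_by_failure.items()}


# A few lines copied from the log, used here as sample input.
sample = """\
 MPITEST_ERROR(       119): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       119):
MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       117): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
"""

for (call, code), ranks in summarize(sample).items():
    print(f"{call} returned {code} on {len(ranks)} rank(s): {ranks}")
```

To use it on a real run, replace `sample` with the captured mpirun output. A single entry for MPI_GROUP_UNION with return code 9 across many ranks would match the pattern described in this report.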

The test failed with the following log:

MPITEST_ERROR(       119): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       119):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       117): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       117):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       122): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       122):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       109): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       109):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       114): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       114):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       125): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       125):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       116): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_ERROR(       113): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       113):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       108): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       108):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_FATAL(       116):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       123): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       123):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       110): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       110):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       118): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       118):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       120): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       120):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       115): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       115):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        67): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        67):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        59): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        59):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        70): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 109 in communicator MPI_COMM_WORLD
  Proc: [[17236,1],109]
  Errorcode: -1

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
 MPITEST_FATAL(        70):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        68): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_ERROR(        63): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        63):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_FATAL(        68):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        65): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_ERROR(        55): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        55):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_FATAL(        65):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        64): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        64):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        57): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        57):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        61): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        61):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        56): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        56):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        66): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        66):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        60): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        60):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        58): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        58):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       106): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       106):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       131): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       131):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        54): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        54):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        95): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        95):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        92): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_ERROR(        98): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        98):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_FATAL(        92):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       101): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       101):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       132): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       132):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       141): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       141):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        36): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        36):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       133): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       133):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        41): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        41):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       135): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       135):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        51): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        51):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       126): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       126):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        38): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        38):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       127): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       127):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        52): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        52):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       136): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       136):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        50): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        50):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       142): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       142):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        46): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        46):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       129): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       129):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        47): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        47):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       139): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       139):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        44): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        44):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       140): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       140):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        53): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        53):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       130): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       130):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        79): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        79):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        42): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        42):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       143): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       143):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        96): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        96):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        40): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        40):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       137): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       137):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       105): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       105):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        48): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        48):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       128): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       128):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        81): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        81):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        39): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        39):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       134): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       134):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        72): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        72):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        45): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        45):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       121): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       121):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       100): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       100):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        49): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        49):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       111): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       111):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        77): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        77):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        43): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        43):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       138): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       138):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        93): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        93):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        71): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        71):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       112): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       112):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        99): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        99):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        62): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        62):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       124): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       124):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        82): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        82):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        69): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        69):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        74): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        74):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        37): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        37):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        87): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        87):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        84): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        84):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        76): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        76):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       107): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       107):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        18): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        18):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       102): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       102):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        35): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        35):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        90): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        90):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        19): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        19):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        78): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        78):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        27): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        27):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       103): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       103):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        30): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        30):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        97): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        97):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        22): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        22):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        73): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        73):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        26): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        26):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        91): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        91):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        24): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        24):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        75): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        75):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        32): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        32):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        89): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        89):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        25): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        25):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        86): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        86):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        29): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        29):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        85): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        85):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        28): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        28):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        83): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        83):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        33): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        33):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        94): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        94):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        31): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        31):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        80): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        80):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        21): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        21):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        88): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        88):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        23): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        23):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(       104): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(       104):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        34): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        34):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        11): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        11):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_INFO (         0): Starting test MPI_GROUP_UNION
 MPITEST_ERROR(         0): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         0):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         4): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         4):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        15): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        15):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        16): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        16):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        13): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        13):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         8): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         8):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        14): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        14):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         3): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         3):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        12): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        12):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         9): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         9):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         7): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         7):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         2): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         2):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        10): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        10):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         5): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         5):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        17): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(        17):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         1): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         1):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(         6): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
 MPITEST_FATAL(         6):                                                                         

MPI_ERR_GROUP: invalid group
 MPITEST_ERROR(        20): Non-Zero return code (         9)  From:  MPI_GROUP_UNION #2
(EMPTY,GROUP) ( COMM_INDEX          1
exactly when Open MPI kills them.

Basically, the test was trying to take the union of a user-created group and MPI_GROUP_EMPTY, but found MPI_GROUP_EMPTY to be invalid.

wzamazon commented 1 year ago

I did some debugging and found the root cause.

It was broken by the following commit:

commit 6a406fb3c9ea74aa371bdff82e8eb01bf9820293
Author: Aurelien Bouteiller <bouteill@icl.utk.edu>
Date:   Mon Feb 8 22:39:09 2021 -0500

    Import ULFM Fault Tolerance

    The historical repositories contain the full history and
    attribution and are available from
      https://bitbucket.org/icldistcomp/ulfm2/src/ulfm/
    and prior
      https://github.com/ICLDisco/ulfm-legacy

    Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
    Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
    Signed-off-by: Josh Hursey <jjhursey@open-mpi.org>
    Signed-off-by: Thomas Herault <herault@icl.utk.edu>
    Signed-off-by: Wesley Bland <wbland@icl.utk.edu>
    Signed-off-by: Nuria Losada <nlosada@icl.utk.edu>
    Signed-off-by: Nathan T. Weeks <weeks@iastate.edu>

This commit adds fault-tolerant MPI support.

Specifically, the following code snippet in the commit broke Fortran's MPI_GROUP_EMPTY:

 /*
  * Allocate a new group structure
@@ -324,6 +337,16 @@ int ompi_group_init(void)
         return OMPI_ERROR;
     }

+#if OPAL_ENABLE_FT_MPI
+    /* Setup global list of failed processes */
+    ompi_group_all_failed_procs = OBJ_NEW(ompi_group_t);
+    ompi_group_all_failed_procs->grp_proc_count     = 0;
+    ompi_group_all_failed_procs->grp_my_rank        = MPI_UNDEFINED;
+    ompi_group_all_failed_procs->grp_proc_pointers  = NULL;
+    ompi_group_all_failed_procs->grp_flags         |= OMPI_GROUP_DENSE;
+    ompi_group_all_failed_procs->grp_flags         |= OMPI_GROUP_INTRINSIC;
+#endif
+
     /* add MPI_GROUP_NULL to table */
     OBJ_CONSTRUCT(&ompi_mpi_group_null, ompi_group_t);
     ompi_mpi_group_null.group.grp_proc_count        = 0;
     ompi_mpi_group_null.group.grp_my_rank           = MPI_PROC_NULL;
     ompi_mpi_group_null.group.grp_proc_pointers     = NULL;
     ompi_mpi_group_null.group.grp_flags            |= OMPI_GROUP_DENSE;
     ompi_mpi_group_null.group.grp_flags            |= OMPI_GROUP_INTRINSIC;

     /* add MPI_GROUP_EMPTY to table */
     OBJ_CONSTRUCT(&ompi_mpi_group_empty, ompi_group_t);
     ompi_mpi_group_empty.group.grp_proc_count        = 0;
     ompi_mpi_group_empty.group.grp_my_rank           = MPI_UNDEFINED;
     ompi_mpi_group_empty.group.grp_proc_pointers     = NULL;
     ompi_mpi_group_empty.group.grp_flags            |= OMPI_GROUP_DENSE;
     ompi_mpi_group_empty.group.grp_flags            |= OMPI_GROUP_INTRINSIC;

It broke Fortran's MPI_GROUP_EMPTY because, in Fortran, MPI_GROUP_EMPTY is an integer whose value is 1. That value is an index into ompi_group_f_to_c_table (index 1 is the 2nd element), and the function MPI_Group_f2c converts the Fortran index to a C pointer. Therefore, for Fortran's MPI_GROUP_EMPTY to work, the C object ompi_mpi_group_empty must be the 2nd element in ompi_group_f_to_c_table.

However, this ordering was broken by 6a406fb3c9ea74aa371bdff82e8eb01bf9820293, which introduced a new group at the beginning of ompi_group_f_to_c_table, causing Fortran's MPI_GROUP_EMPTY to point to a different group.
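
To make the index shift concrete, here is a minimal toy model of the table. It is not Open MPI code. Every name in it (`toy_f2c_table`, `toy_register`, `toy_f2c`, `toy_group_init`) is hypothetical, and it assumes, as the analysis above implies, that each group is appended to the table in the order it is constructed. Only the Fortran value 1 for MPI_GROUP_EMPTY is taken from the issue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not Open MPI's types or API. */
#define FORTRAN_GROUP_EMPTY 1   /* Fortran value of MPI_GROUP_EMPTY, per the issue */
#define TABLE_CAPACITY      8

typedef struct {
    const char *slots[TABLE_CAPACITY]; /* stands in for the C group pointers */
    int used;                          /* number of registered groups */
} toy_f2c_table;

/* Append a group to the table and return the Fortran index it received. */
static int toy_register(toy_f2c_table *t, const char *name)
{
    assert(t->used < TABLE_CAPACITY);
    t->slots[t->used] = name;
    return t->used++;
}

/* Toy analogue of an f2c lookup: Fortran index -> C object (NULL if out of range). */
static const char *toy_f2c(const toy_f2c_table *t, int findex)
{
    if (findex < 0 || findex >= t->used) {
        return NULL;
    }
    return t->slots[findex];
}

/* Build the table; ft_first != 0 mimics registering the
 * failed-process group before the predefined groups. */
static void toy_group_init(toy_f2c_table *t, int ft_first)
{
    t->used = 0;
    if (ft_first) {
        toy_register(t, "all_failed_procs");
    }
    toy_register(t, "group_null");
    toy_register(t, "group_empty");
    if (!ft_first) {
        toy_register(t, "all_failed_procs");
    }
}
```

In this model, calling `toy_group_init(&t, 1)` makes `toy_f2c(&t, FORTRAN_GROUP_EMPTY)` return the null group instead of the empty group, while `toy_group_init(&t, 0)` restores the expected mapping. Under the stated assumption, that would be consistent with the `MPI_ERR_GROUP: invalid group` errors in the log.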

wzamazon commented 1 year ago

Opened https://github.com/open-mpi/ompi/pull/11807 to address the issue; it moves the initialization of ompi_group_all_failed_procs to after the empty group.

wzamazon commented 1 year ago

The PR has been merged and backported.