Closed: @SteVwonder closed this 3 years ago.
@SteVwonder: what remains to be done for this? At today's meeting, @mtitov asked that this PR go in first, so that his PR #56 can be tested one last time against it before it is merged.
The OpenMPI that is part of CentOS 7 does not bootstrap properly under Flux (see: https://github.com/flux-framework/flux-core/issues/3779). So that will need to be worked around or fixed before this PR can go in as-is. Alternatively, we could switch the primary MPI tested (maybe mpich?) or the primary distro tested. Maybe it makes sense for #56 to go in first, and I can rebase this PR on top of the changes?
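(For context, a minimal sketch of how the failure can be reproduced inside the CentOS 7 image; the distro package names, the module name, and the mpi_hello binary are placeholders rather than anything in this PR, and flux option spellings may vary by flux-core version:)

```sh
# Hypothetical reproduction sketch; mpi_hello is a trivial MPI hello-world
# compiled by hand and not part of this repository.
yum install -y openmpi openmpi-devel environment-modules
source /etc/profile.d/modules.sh
module load mpi/openmpi-x86_64              # CentOS 7 ships OpenMPI behind an environment module
mpicc -o mpi_hello mpi_hello.c
flux start flux mini run -n 2 ./mpi_hello   # fails to bootstrap, per flux-core#3779
```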
> Maybe it makes sense for #56 to go in first, and I can rebase this PR on top of the changes?
@mtitov: this proposal seems to make more sense. Can your PR go in first?
@dongahn I applied the follow-up updates and set it as ready for review (to be merged)
p.s. don't have permissions to assign reviewers
> p.s. don't have permissions to assign reviewers
You should now :)
Follow-up on the OpenMPI issues:
Based on the trail in https://github.com/flux-framework/flux-core/issues/625, it looks like bootstrapping OpenMPI 1.10 requires the pmi plugin, which the CentOS OpenMPI doesn't seem to be compiled with:
[root@6e45ee300d7e /]# ompi_info | grep pmi
[root@6e45ee300d7e /]#
It might also require a patch to the autoconf file to avoid rpath'ing the libpmi library, but that may only be necessary for the case where you want to run OpenMPI under Flux and Slurm (need to dig further to confirm).
Given that the OpenMPI in CentOS does not provide the pmi plugin, how do we want to proceed? It doesn't appear that EPEL has any newer versions of OpenMPI. Should I compile a version of OpenMPI 1.10 into the base image that includes pmi? If we are going the route of "hand-compiling" MPIs, do we want to use a newer/different version? Should we switch to using mpich or mvapich?
EDIT: my current vote is for compiling a version of OpenMPI 1.10 that includes pmi, since this will avoid affecting any other packages in the SDK. In later phases of the CI testing, we can include other MPIs and other versions.
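For concreteness, a rough sketch of what a hand-compiled OpenMPI 1.10 with PMI support could look like in the base image; the 1.10.7 tarball and the --with-pmi prefix (pointing at whatever install provides pmi.h/libpmi, e.g. flux-core under /usr) are assumptions, not settled choices:

```sh
# Sketch only: the version, URL, and prefixes below are placeholders for discussion.
curl -LO https://download.open-mpi.org/release/open-mpi/v1.10/openmpi-1.10.7.tar.gz
tar xzf openmpi-1.10.7.tar.gz && cd openmpi-1.10.7
./configure --prefix=/usr/local --with-pmi=/usr   # pick up pmi.h/libpmi from /usr
make -j"$(nproc)" && make install
ompi_info | grep pmi                              # should now list pmi components
```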
> Should I compile a version of OpenMPI 1.10 into the base image that includes pmi?
Yes, I vote for this as well, using OpenMPI. Since MPI coverage is something that we will have to continue to grow, perhaps we can build our base testing against a stable version that is commonly used in DOE labs, including LC/TOSS. That is, unless people like @j-woz have a need for a specific version of OpenMPI?
CI is green now! And this is rebased on master. I assume if this goes in before #56, it will cause more issues than the other way around. So I think we should continue to wait to merge this until after #56 goes in. Given that, the reviews on this can probably wait too.
Sounds good. I've added a MWP label for #56 so it will go in soon. I will get to this PR soonish.
This is (finally) ready for a review, @dongahn and others :)
The CI test suite passed :tada:. I left most of the changes as fixup commits to make it easier for @dongahn and @mtitov to re-review. The only exceptions are the commit title typo fix and the splitting up of the annotate-checks.sh commit (vendor first, then customize).
@dongahn and @mtitov, if you are happy with the new changes, I'll squash them down. Thanks!
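(If it helps with the final squash, a sketch assuming the fixups were created with git commit --fixup and the PR is based on upstream/master; the branch name is a placeholder:)

```sh
# Fold the fixup! commits into the commits they amend, then update the PR branch.
git rebase -i --autosquash upstream/master
git push --force-with-lease origin <branch-name>   # <branch-name> is a placeholder
```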
@SteVwonder: your changes addressing my suggestions all look good. Feel free to MWP when you and @mtitov agree on further potential changes.
@dongahn and @mtitov: mergify cannot automatically merge this PR since it modifies the GitHub Actions yaml. Could one of you push the "merge" button?
Done. Thanks @SteVwonder!
Expand the smoketest for the Flux Docker image by cloning the source code, building it (required for the test infrastructure), and then running the tests against the installed Flux (as opposed to in-tree).
Closes #46
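Roughly, the expanded smoketest amounts to something like the following inside the image (a sketch, not the literal CI steps; FLUX_TEST_INSTALLED_PATH is an assumption about how flux-core's test suite is pointed at an installed Flux, so check flux-core's t/README before relying on it):

```sh
# Sketch of the expanded smoketest flow; paths and the installed-test mechanism
# are assumptions, not the exact workflow contents.
git clone https://github.com/flux-framework/flux-core.git
cd flux-core
./autogen.sh && ./configure --prefix=/usr && make -j"$(nproc)"
make install
# Run the test suite against the installed Flux rather than the in-tree build.
FLUX_TEST_INSTALLED_PATH=/usr/bin make -j"$(nproc)" check
```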