ExaWorks / SDK

ExaWorks SDK
11 stars 12 forks source link

ci: run Flux integration tests #51

Closed SteVwonder closed 3 years ago

SteVwonder commented 3 years ago

Expand the smoketest for the Flux docker image by cloning the source code, building it (required for test infrastructure), and then running the tests against the installed Flux (as opposed to in-tree)

Closes #46

dongahn commented 3 years ago

@SteVwonder: what remains to be done for this? At today's meeting, @mtitov wanted this PR to go in first before his PR #56 is tested one last time before the merge.

SteVwonder commented 3 years ago

The OpenMPI that is a part of Centos 7 does not bootstrap properly under Flux (see: https://github.com/flux-framework/flux-core/issues/3779). So that will need to be worked around/fixed before this PR can go in as-is. Alternatively, we could switch the primary MPI tested (maybe mpich?) or the primary distro tested. Maybe it makes sense for #56 to go in first, and I can rebase this PR on top of the changes?

dongahn commented 3 years ago

Maybe it makes sense for #56 to go in first, and I can rebase this PR on top of the changes?

@mtitov: this proposal seems to make more sense. Can your PR go in first?

mtitov commented 3 years ago

@dongahn I applied follow up updates and set it as ready for review (to be merged)

p.s. don't have permissions to assign reviewers

SteVwonder commented 3 years ago

p.s. don't have permissions to assign reviewers

You should now :)

SteVwonder commented 3 years ago

Follow up on the OpenMPI issues:

Based on the trail in https://github.com/flux-framework/flux-core/issues/625, it looks like bootstrapping OpenMPI 1.10 requires the pmi plugin, which the Centos OpenMPI doesn't seem to be compiled with:

[root@6e45ee300d7e /]# ompi_info | grep pmi
[root@6e45ee300d7e /]#

It might also require a patch to the autoconf file to avoid rpath'ing the libpmi library, but that may only be necessary for the case where you want to run OpenMPI under Flux and Slurm (need to dig further to confirm).

Given that the OpenMPI in centos does not provide the pmi plugin, how do we want to proceed? It doesn't appear that EPEL has any newer versions of OpenMPI. Should I compile a version of OpenMPI 1.10 into the base image that includes pmi? If we are going the route of "hand-compiling" MPIs, do we want to use a newer/different version? Should we switch to using mpich or mvapich?

EDIT: my current vote is for compiling a version of OpenMPI 1.10 that includes pmi, since this will avoid affecting any other packages in the SDK. In later phases of the CI testing, we can include other MPIs and other versions.

dongahn commented 3 years ago

Should I compile a version of OpenMPI 1.10 into the base image that includes pmi?

Yes, I vote for this as well using OpenMPI. Since MPI coverage is something that we will have to continue to grow, perhaps we can build our base testing against a stable version that is commonly used in DOE labs including LC/TOSS. That is, unless people like @j-woz has a need for a specific version of OpenMPI?

SteVwonder commented 3 years ago

CI is green now! And this is rebased on master. I assume if this goes in before #56, it will cause more issue than the other way around. So I think we should continue to wait to merge this until after #56 goes in. Given that, the reviews on this can probably wait too.

dongahn commented 3 years ago

Sounds good. I've added a MWP label for #56 so it will go in soon. I will get to this PR soonish.

SteVwonder commented 3 years ago

This is (finally) ready for a review, @dongahn and others :)

SteVwonder commented 3 years ago

The CI test suite passed :tada:. I left most of the changes as fixup commits to make it easier for @dongahn and @mtitov to re-review. The only exceptions are the commit title typo fix and the splitting up of the annotate-checks.sh commit (vendor first, then customize).

@dongahn and @mtitov , if you are happy with the new changes, I'll squash them down. Thanks!

dongahn commented 3 years ago

@SteVwonder: you changes for my suggestions look all good. Feel free to MWP when you and @mtitov agree on further potential changes.

SteVwonder commented 3 years ago

@dongahn and @mtitov : mergify cannot automatically merge this PR since it modifies the Github Actions yaml. Could one of you push the "merge" button?

dongahn commented 3 years ago

Done. Thanks @SteVwonder!