Closed · wihobbs closed this 9 months ago
The MPI hello test should probably be named hello or similar instead of the more generic mpi_tests.c. The reason is that we may want to add other simple MPI-based tests in the future (e.g. in flux-core we have hello, abort, and version tests).
My original reason for doing this was that I thought we could add the abort and version tests as additional functions to mpi_tests.c, which would mean only having to compile and link one piece of code. I'm guessing you want these to be separately compiled and run?
Eventually, we might want to move the MPI testing driver (currently inline in mpi-test.gitlab-ci.yml) to a standalone script so it is easier to update the set of MPI implementations and compilers that are tested on each cluster. (We may want to add a config file, for example, so the list is easily updated and all in one place.)
I like the idea of a config file that could compile and run tests and standardize this across multiple machines. The implementation of this is still a little nebulous in my mind. I'll see if I can hammer out an example...
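To make the config-file idea concrete, here is a minimal sketch. The file name mpi-matrix.conf, its whitespace-separated format, and the compiler/MPI pairs are all hypothetical, not something this PR defines; the point is just that a driver script could expand a per-cluster list into test invocations.

```shell
#!/bin/sh
# Hypothetical per-cluster config: one "compiler mpi" pair per line.
# The name mpi-matrix.conf and the format are assumptions for illustration.
cat > mpi-matrix.conf <<'EOF'
gcc     mvapich2
intel   mvapich2
gcc     openmpi
EOF

# Expand the matrix. This only prints what a real driver would do
# (module-load the pair, rebuild the tests, run them under flux).
while read -r compiler mpi; do
    case "$compiler" in "#"*|"") continue ;; esac   # skip comments/blank lines
    echo "would test: compiler=$compiler mpi=$mpi"
done < mpi-matrix.conf
```

Keeping the matrix in a file like this would make adding an MPI implementation a one-line change, with every cluster's list in one place.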
I think moving the flux run ./src/cmd/flux call to the script would be a good start. We could probably trash the mpi-test.gitlab-ci.yml file if we did this (and just call the script instead).
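As a sketch of that refactor, the yaml would shrink to a single call into a script like the one below. The script name, the test path, and the DRYRUN toggle are assumptions for illustration; DRYRUN defaults to on here so the sketch can run without flux installed, and CI would set DRYRUN=0.

```shell
#!/bin/sh
# Hypothetical driver script that the gitlab-ci yaml would call instead of
# carrying the flux invocation inline.
set -eu

run() {
    if [ "${DRYRUN:-1}" = 1 ]; then
        echo "DRYRUN: $*"   # default: just show what would run
    else
        "$@"                # CI sets DRYRUN=0 to actually execute
    fi
}

# The flux invocation, now in one editable place.
# The -N2 and the test path are assumptions for illustration.
run flux run -N2 ./mpi/hello
```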
My original reason for doing this was that I thought we could add the abort and version tests as additional functions to mpi_tests.c, which would mean only having to compile and link one piece of code. I'm guessing you want these to be separately compiled and run?
This is not a bad idea, but I think it will result in more complexity in the long term (plus if we have a test or benchmark from elsewhere, it will be more work to integrate it into the test program than it would be to just drop in the new test).
I think moving the flux run ./src/cmd/flux call to the script would be a good start. We could probably trash the mpi-test.gitlab-ci.yml file if we did this (and just call the script instead).
That sounds good. I think eventually we'll be submitting a suite of tests to the CI flux instance. The script can eventually handle this submission, monitoring of tests, and collection of results from all jobs.
I think eventually we'll be submitting a suite of tests to the CI flux instance.
What you're describing sounds to me like we'll be creating one Flux instance in CI (probably 2 full nodes) and then submitting many different MPI jobs utilizing different compilers to it, rather than creating many small instances (say, 2 nodes, 1 core on each) for each individual MPI job. Am I tracking correctly?
What you're describing sounds to me like we'll be creating one Flux instance in CI (probably 2 full nodes) and then submitting many different MPI jobs utilizing different compilers to it, rather than creating many small instances (say, 2 nodes, 1 core on each) for each individual MPI job. Am I tracking correctly?
I think there's a small bit of design work that needs to be done here. I haven't thought about this in detail, so I apologize if my thoughts are not well-formed, but it seems like each MPI+compiler test is composed of the following steps (this is just my first thought, so happy to discuss further):
These steps seem to naturally compose what we'd think of as a batch job. The batch script would handle these steps, including compilation of the MPI tests with the defined compiler and MPI, then would submit the suite of jobs and collect and report results (implementation TBD). An outer script would submit a batch job for each MPI and compiler combination that we're targeting to the CI instance. That way, the more resources the CI Flux instance has, the faster we'll run through these tests.
Does that make any sense?
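The outer script described above might look roughly like this sketch. The pair list, the batch script name mpi-batch.sh, and the --env flags are assumptions to check against the flux version in use; it prints the submissions by default so it can be read as pseudocode without flux installed.

```shell
#!/bin/sh
# Hypothetical "outer script": one flux batch submission per compiler/MPI
# pair, all landing in the CI instance so Flux schedules them as resources
# allow. DRYRUN defaults to 1 so this sketch runs without flux.
set -eu

submit() {
    if [ "${DRYRUN:-1}" = 1 ]; then echo "DRYRUN: $*"; else "$@"; fi
}

for pair in gcc:mvapich2 intel:mvapich2; do
    compiler=${pair%%:*}
    mpi=${pair#*:}
    # Each batch job would compile with its pair, run the suite, and
    # report results (steps the batch script itself handles).
    submit flux batch -N2 --env=COMPILER="$compiler" --env=MPI="$mpi" ./mpi-batch.sh
done
```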
The batch script would handle these steps, including compilation of the MPI tests with the defined compiler and MPI, then would submit the suite of jobs and collect and report results (implementation TBD)
I'll note one drawback to doing the compilation in the batch job is that cores in the allocation will go idle during this stage since no jobs can be run until the compilation completes. An optimization might be to submit the compile step as one single-node job, and the tests as a batch job with a dependency on the compile job. However, this feels like a premature optimization at this point.
Hm, we could also submit all of the compile and MPI tests as jobs to the CI instance with appropriate dependencies (no nested batch jobs). This would allow more flexibility in the size of MPI test jobs and would perhaps be more efficient scheduling. It also may be easier to collect the results since all the jobs are submitted at one level :thinking:
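The flat, no-nested-batch variant could be sketched like this. The test names come from the flux-core examples mentioned earlier; the make target, the -N counts, and the exact --dependency syntax are assumptions to verify against the installed flux-core (which needs job dependency support). DRYRUN defaults to 1 so the sketch runs without flux.

```shell
#!/bin/sh
# Hypothetical flat submission: one compile job, then every MPI test as a
# sibling job gated on it with an afterok dependency. CI would set DRYRUN=0.
set -eu
DRYRUN=${DRYRUN:-1}

if [ "$DRYRUN" = 1 ]; then
    echo "DRYRUN: flux submit -N1 make -C mpi"
    compile_id="fDRYRUN"      # placeholder jobid for the dry run
else
    compile_id=$(flux submit -N1 make -C mpi)
fi

for t in hello version abort; do
    if [ "$DRYRUN" = 1 ]; then
        echo "DRYRUN: flux submit -N2 --dependency=afterok:$compile_id ./mpi/$t"
    else
        flux submit -N2 --dependency=afterok:"$compile_id" "./mpi/$t"
    fi
done
```

Because everything sits at one level, collecting results would be a single pass over the jobs in the enclosing instance.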
An outer script would submit a batch job for each test mpi and compiler that we're targeting to the CI instance. That way the more resources the CI Flux instance has, the faster we'll run through these tests.
The outer script you described is a major piece this PR is missing. The bare bones of steps 1-5 you described comprising a batch job are prototyped in de1ce16. However, steps 3-5 need improvement.
Hm, we could also submit all of the compile and MPI tests as jobs to the CI instance with appropriate dependencies (no nested batch jobs).
I think we're on the same page. If we're requesting a 2-node instance for testing interconnects, we could submit all of the compilation and run batch jobs to the enclosing instance (each requesting 2 nodes and n cores, where n >= 2) and let Flux sort out what runs when.
then would submit the suite of jobs and collect and report resuls (implementation TBD).
This is on my todo list, not only for the MPI work but for aggregating results from the testsuite runs as well. One thing I have noticed when running MPI jobs is that there are some things in stderr we may want to collect that don't cause a nonzero return code but do indicate problems we should look at.
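A small sketch of that kind of check (the warn/error patterns and the log naming are assumptions): treat a job as suspect when it exits 0 but its stderr matches patterns we care about.

```shell
#!/bin/sh
# Hypothetical result classifier: a job can return 0 yet emit stderr we
# should still surface. rc is the job's exit code, log its captured stderr.
check_job_log() {
    rc=$1
    log=$2
    if [ "$rc" -ne 0 ]; then
        echo "FAIL (rc=$rc): $log"
    elif grep -Eqi 'warn|error' "$log"; then
        echo "SUSPECT (rc=0 but stderr noisy): $log"
    else
        echo "PASS: $log"
    fi
}

# Demo with two fake stderr captures.
printf 'libfabric: warning: provider mismatch\n' > job1.err
: > job2.err
check_job_log 0 job1.err
check_job_log 0 job2.err
```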
All excellent thoughts, thanks so much @grondo. I think we're making a lot of progress here, or at least I'm starting to grasp what this could look like. As a first step, I'll look into the "outer script" you described, and we can reason more from there.
@grondo Let me know if this is closer to the target. Note that, for debugging purposes, it currently outputs the stdout of all completed jobs. I imagine that in the future we could have a debug=True or --d flag that did this, and the normal behavior would be to only output failed jobs, as we discussed.
Here's how I've been running for testing:
flux alloc -N2
cd ~/flux-test-collective
MPI_TESTS_DIRECTORY=$(pwd)/mpi FTC_DIRECTORY=$(pwd) flux run -N2 ../flux-core/src/cmd/flux start ./mpi/outer_script.sh
@grondo This is ready for another review. Some notable changes:
hello.c was removed and replaced with 3 tests from flux-core.

Oh, and another GitLab logfile that might be helpful.
Thanks @grondo for the feedback! I believe I've addressed all your comments.
Merging. Thank you @grondo, I know this one took a lot of work and iterations to review.
This PR is a stab at supporting MPI testing on LC resources.
We want MPI testing to be easily extensible in three major ways:

- New MPI implementations and compilers: these would be added in .gitlab/mpi-test.gitlab-ci.yml.
- New tests: a new test would be added to mpi/mpi_tests.c. A call in the main function gathering the return code would also be required.
- New machines: .gitlab/mpi-test.gitlab-ci.yml would need a section that covered the MPI implementations and compilers for that machine, and three things would need to be added to the main .gitlab-ci.yml file: the machine specifications, a reference wrapper building flux and executing the MPI tests, and a test for gitlab to run. See .corona, .test-core-mpi-corona, and corona-mpi-test, respectively, for examples of this.