envelope-project / laik


tests/mpi: Use "mpiexec -n 4" instead of "mpirun -np 4" everywhere. #133

Closed AlexanderKurtz closed 6 years ago

AlexanderKurtz commented 6 years ago

In contrast to mpirun -np 4, mpiexec -n 4 is actually specified by the MPI standard [0] and works with both OpenMPI and MPICH, so this should get us one step closer to being able to test with MPICH.

[0] http://mpi-forum.org/docs/mpi-3.1/mpi31-report/node228.htm#Node228

weidendo commented 6 years ago

NACK. When both MPICH and OpenMPI are installed on Debian, it looks like this:

ls -l /usr/bin/mpirun*
lrwxrwxrwx 1 root root 24 Mai 26 2017 /usr/bin/mpirun -> /etc/alternatives/mpirun
lrwxrwxrwx 1 root root 13 Aug 10 2017 /usr/bin/mpirun.mpich -> mpiexec.hydra
lrwxrwxrwx 1 root root 7 Jun 22 2017 /usr/bin/mpirun.openmpi -> orterun

Thus, the "mpirun" command used in tests must be configurable. However, I am fine with using the variable name MPIEXEC for this :-)
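For illustration, a configurable launcher could be resolved roughly as follows. This is only a sketch, not code from this repository: it assumes an environment variable named MPIEXEC as suggested above, a hypothetical `{nprocs}` placeholder for the process count, and a hypothetical helper name `build_launch_command`. When the variable is unset it falls back to "mpiexec -n".

```python
import os
import shlex

# Fallback used when MPIEXEC is not set (assumption: "mpiexec -n" is available).
DEFAULT_LAUNCHER = "mpiexec -n {nprocs}"


def build_launch_command(nprocs, program, env=None):
    """Return the argv list that starts `program` on `nprocs` processes.

    MPIEXEC may contain a "{nprocs}" placeholder (a convention invented for
    this sketch); if it does not, "-n <nprocs>" is appended to it.
    """
    env = os.environ if env is None else env
    template = env.get("MPIEXEC", DEFAULT_LAUNCHER)
    if "{nprocs}" not in template:
        template += " -n {nprocs}"
    launcher = template.replace("{nprocs}", str(nprocs))
    # shlex.split turns the launcher string into separate argv entries.
    return shlex.split(launcher) + list(program)
```

With this, a test could be started via `subprocess.run(build_launch_command(4, ["./some-test"]))`, and a system with a different launcher would only need MPIEXEC set accordingly.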

AlexanderKurtz commented 6 years ago

You are right, we eventually want the "mpirun" command used in the tests to be configurable. However, this PR still has merit without that: since mpiexec -n is specified by the MPI standard itself, it makes the test suite more robust when confronted with an unknown MPI implementation. That is the actual improvement this PR brings, so please reconsider!

weidendo commented 6 years ago

It already works. I sometimes run the tests using MPICH on my laptop without problems (this needs "update-alternatives" to change the symbolic links). And no, on SuperMUC you have to use "poe" with IBM MPI. You may argue that this is not standard-conforming, but this is the reality :-)

But to make the test suite really run multi-node tests, e.g. on SuperMUC, we would need to
(1) generate job scripts (specific to the HPC system, so this must use a template),
(2) submit them to the job manager (the command is specific to the job manager),
(3) wait for the jobs to finish and collect the final results (probably by polling for output from a command specific to the job manager).

I think this can only be done from a custom test driver. Therefore I see absolutely no use in running the tests from cmake: we had better go directly for a custom test driver.
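The three steps listed above (template, submit, poll) could be sketched like this. Everything here is hypothetical and only meant to show the shape of such a driver: the template contents, the function names, and the submit and status commands are placeholders, since the real ones depend on the HPC system and its job manager.

```python
import string
import subprocess
import time

# Hypothetical job script template; a real one needs system-specific
# scheduler directives and the launcher used on that system.
JOB_TEMPLATE = string.Template(
    "#!/bin/sh\n"
    "$launcher $test_binary > $output_file\n"
)


def render_job_script(launcher, test_binary, output_file):
    """Step 1: fill the system-specific template."""
    return JOB_TEMPLATE.substitute(
        launcher=launcher, test_binary=test_binary, output_file=output_file
    )


def run_command(argv):
    """Run a command and return its standard output as text."""
    return subprocess.run(
        argv, capture_output=True, text=True, check=True
    ).stdout


def run_job(script_path, submit_cmd, status_cmd, is_done,
            run=run_command, poll_interval=5.0, max_polls=100,
            sleep=time.sleep):
    """Steps 2 and 3: submit the script, then poll until it is finished.

    Assumes the submit command prints a job id and that `is_done` can
    recognize a finished job from the status command's output.
    """
    job_id = run(submit_cmd + [script_path]).strip()
    for _ in range(max_polls):
        status = run(status_cmd + [job_id])
        if is_done(status):
            return status
        sleep(poll_interval)
    raise TimeoutError("job %s did not finish" % job_id)
```

Passing the submit and status commands as arguments keeps the job-manager-specific parts out of the driver logic, so supporting another system would mean supplying a different template and different commands.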