MxUI / MUI

Multiscale Universal Interface: A Concurrent Framework for Coupling Heterogeneous Solvers
http://mxui.github.io/
Apache License 2.0

Calling mpi_split_by_app with a single app #28

Closed rupertnash closed 5 years ago

rupertnash commented 5 years ago

In lib_mpi_split.h, the function mpi_split_by_app will crash with an unhelpful error (SIGSEGV) if not run in MPMD mode. I implemented a version (see my branch https://github.com/rupertnash/MUI/tree/fix-MPI_APPNUM-missing) that returns MPI_COMM_WORLD in the case of being run as an SPMD program.

I'm not sure if this is in fact the correct approach. It may be better to abort with a sensible message to the user. E.g.:

if (flag) {
  // MPI_Comm_split etc...
  return domain;
} else {
  std::cerr << "Calling mui::mpi_split_by_app with only a single app is erroneous" << std::endl;
  MPI_Abort(MPI_COMM_WORLD, 1);
}

Can you please let me know what fits better with the "MUI philosophy"! Cheers!

SLongshaw commented 5 years ago

I think the second approach of aborting with a helpful error is probably the safest option. The situation doesn't make much sense, and it is better that a non-MPI-savvy user is informed of this than have it appear to work while the logic is wrong. There are a few places in the code where this sort of thing crops up for specific use-cases...

As MUI progresses, decisions like this will need to be unified into an ethos, documentation written, CI added to the repository, etc.

rupertnash commented 5 years ago

Cheers Stephen. I'll do that and send a PR. I had a very confused 30 minutes... Debugging MPMD runs is often non-trivial...

SLongshaw commented 5 years ago

Agreed! If you aren't already, you can make it a fraction easier by redirecting each rank's console output to individual files using the --output-filename flag with mpirun (e.g. mpirun --output-filename Logs/outputfiles -np 2 solver1 : -np 4 solver2)