NWChemEx / ParallelZone

You're travelling through another dimension, a dimension not only of CPUs and threads but of GPUs; a journey into a wondrous land whose boundaries are bandwidth limited. That's the signpost up ahead - your next stop, the ParallelZone!
https://nwchemex.github.io/ParallelZone/
Apache License 2.0

Non-MPI_COMM_WORLD MADNESS Initialization on MPICH #55

Closed wavefunction91 closed 2 years ago

wavefunction91 commented 2 years ago

This is actually more of a MADNESS issue than an NWX one (@evaleev), but since we use the signature, I'm just going to track it here.

Non-MPI_COMM_WORLD initialization of MADNESS fails to compile against MPICH-compatible MPI implementations when called with the following signature: https://github.com/NWChemEx-Project/ParallelZone/blob/master/src/parallelzone/runtime/runtime_view.cpp#L26

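This is because, unlike OpenMPI, where MPI_Comm is a proper strong type, MPICH defines MPI_Comm as a typedef to int. A call that passes a communicator as the third argument therefore matches the initialize(int&, char**&, int nthread, bool) overload just as well as the initialize(int&, char**&, const MPI_Comm&, bool) overload, which makes it ambiguous. For example, on NERSC Cori (Cray MPI):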

[ 70%] Building CXX object CMakeFiles/parallelzone.dir/src/parallelzone/runtime/runtime_view.cpp.o
/global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/ParallelZone/src/parallelzone/runtime/runtime_view.cpp: In function 'auto parallelzone::runtime::{anonymous}::start_madness(int, char**, const MPI_Comm&)':
/global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/ParallelZone/src/parallelzone/runtime/runtime_view.cpp:26:42: error: call of overloaded 'initialize(int&, char**&, const MPI_Comm&, bool)' is ambiguous
   26 |             pworld = &madness::initialize(argc, argv, comm, true);
      |                       ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
In file included from /global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/build/_deps/madworld-src/src/madness/world/world.h:68,
                 from /global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/build/_deps/madworld-src/src/madness/world/MADworld.h:42,
                 from /global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/ParallelZone/include/parallelzone/runtime/runtime_view.hpp:3,
                 from /global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/ParallelZone/src/parallelzone/runtime/detail_/resource_set_pimpl.hpp:3,
                 from /global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/ParallelZone/src/parallelzone/runtime/runtime_view.cpp:1:
/global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/build/_deps/madworld-src/src/madness/world/worldinit.h:95:12: note: candidate: 'madness::World& madness::initialize(int&, char**&, int, bool)'
   95 |     World& initialize(int& argc, char**& argv, int nthread, bool quiet = false);
      |            ^~~~~~~~~~
/global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/build/_deps/madworld-src/src/madness/world/worldinit.h:153:12: note: candidate: 'madness::World& madness::initialize(int&, char**&, const MPI_Comm&, bool)'
  153 |     World& initialize(int& argc, char**& argv, const MPI_Comm& comm,
      |            ^~~~~~~~~~
/global/cfs/projectdirs/m1027/dbwy/NWChemEx/logger/build/_deps/madworld-src/src/madness/world/worldinit.h:175:12: note: candidate: 'madness::World& madness::initialize(int&, char**&, const MPI_Comm&, int, bool)'
  175 |     World& initialize(int& argc, char**& argv, const MPI_Comm& comm,
      |            ^~~~~~~~~~
make[2]: *** [CMakeFiles/parallelzone.dir/build.make:132: CMakeFiles/parallelzone.dir/src/parallelzone/runtime/runtime_view.cpp.o] Error 1
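
The ambiguity can be reproduced without MPI or MADNESS. The following is a minimal C++17 sketch, not the MADNESS code: FakeMadness, MpichLikeComm, StrongComm, OpaqueComm, and call_resolves are hypothetical names made up for illustration, and the two overloads only mimic the first two candidates listed in the error above. It assumes a strong-type handle behaves like a pointer to an opaque struct.

```cpp
// Stand-alone sketch: no MPI or MADNESS headers, all names are hypothetical.
#include <type_traits>
#include <utility>

// MPICH-style communicator handle: a plain typedef to int.
using MpichLikeComm = int;

// Strong-type handle: a pointer to an opaque struct, which never
// converts to int implicitly.
struct OpaqueComm;
using StrongComm = OpaqueComm*;

// Mimics the two overloads named in the compiler error.
// Declarations only; they are used purely in unevaluated contexts.
template <typename Comm>
struct FakeMadness {
    static int initialize(int& argc, char**& argv, int nthread,
                          bool quiet = false);
    static int initialize(int& argc, char**& argv, const Comm& comm,
                          bool quiet = false);
};

// Primary template: the call initialize(argc, argv, comm, true) does not
// resolve to exactly one overload (no match, or an ambiguous match).
template <typename Comm, typename = void>
struct call_resolves : std::false_type {};

// Chosen only when overload resolution for the call succeeds.
template <typename Comm>
struct call_resolves<
    Comm, std::void_t<decltype(FakeMadness<Comm>::initialize(
              std::declval<int&>(), std::declval<char**&>(),
              std::declval<const Comm&>(), true))>> : std::true_type {};

// With an int handle both overloads are exact matches, so the call is ambiguous.
static_assert(!call_resolves<MpichLikeComm>::value,
              "int-typedef handle: call is ambiguous");

// With a pointer handle only the communicator overload is viable.
static_assert(call_resolves<StrongComm>::value,
              "strong-type handle: call resolves");
```

Compiling the file is the check: both static_asserts pass only if the int-typedef handle makes the call ambiguous while the pointer handle does not.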

evaleev commented 2 years ago

This has been addressed in https://github.com/m-a-d-n-e-s-s/madness/commit/7ad8de6011b0fb5d0391b8bd9a8d1b7513fea844

wavefunction91 commented 2 years ago

Closed by #56