SCOREC / pcms

BSD 3-Clause "New" or "Revised" License
2 stars 11 forks source link

wdmapp_coupling should link to MPI #40

Closed germasch closed 4 years ago

germasch commented 4 years ago

For me, the spack-installed coupler (as a dependency of "wdmapp") fails to build because it uses MPI, but the CMakeLists.txt doesn't do find_package(MPI REQUIRED) and the code doesn't link to MPI.

My guess as to why this may work for you is that Kokkos will set the C++ compiler to mpicxx wrapping nvcc_wrapper, but that only happens if Kokkos itself is built with CUDA enabled, but the case where it's failing for me doesn't have a CUDA-enabled Kokkos.

phyboyzhang commented 4 years ago

I can not find MPI_CXX either when I used find_package(MPI REQUIRED) on AIMOS.

cwsmith commented 4 years ago

Hi Kai, I agree, the CMake dependencies and Spack dependencies are under specified.

On the CMake front, ignoring Spack for a moment, my experience has been that portability across the large systems (especially Cray) when CMake find_package(MPI) is not used is better. To that end, in the projects that have legacy usage of it, we have gone so far as provide arguments to bypass it. What has been your experience with CMake 3.10+. Hopefully things have improved.

The Spack behavior is a bit puzzling. Prior to https://github.com/wdmapp/wdmapp-config/pull/30 (kokkos cuda support), I built kokkos and cabana with openmp support underneath the coupler without issue on Summit. I'll try a clean run with the new setup to see if I can reproduce the linking problem.

I seemed to recall that Spack was using the MPI compiler wrappers by default... but my attempts to search the Spack docs to find some mention of this failed.

cwsmith commented 4 years ago

I can not find MPI_CXX either when I used find_package(MPI REQUIRED) on AIMOS.

In general, this type of behavior (i.e., errors) has been my experience too.

germasch commented 4 years ago

I'm aware that FindMPI.cmake can be touchy, though for me it's usually been working (including on Cray). What error are you getting? Maybe this helps:

set(MPI_CXX_SKIP_MPICXX ON)
find_package(MPI REQUIRED)

(Even if not, unless one uses the deprecated MPI C++ bindings, this is not a bad idea.)

So I guess I also don't understand why this works for you but not for me (it works on Summit for me, but not on Ubuntu).

In fact, I'm pretty sure that, e.g., adios2 essentially calls find_package(MPI) in its config, ie., inside of find_package(ADIOS2). So for one, I suppose that means that find_package(MPI) should work on your system, otherwise I'd expect things to go wrong there. But the other side of the coin is that adios2 ends up with a public dependency on MPI, ie., it should make the coupler link to MPI transitively, anyway. And that may be the explanation as to why you got away without doing so explicitly. (Though I'd probably still argue that since you're using MPI directly, not just through adios2, the dependency should be made explicit.)

This however, also contradicts my assumption that my build went wrong because of the lack of a MPI dependency. So I'll dig into what happens here on ubuntu...

germasch commented 4 years ago

So FWIW, the errors I got where undefined references to some function from the MPI C++ bindings. My working theory for why that happened (just a guess, really) is that ADIOS2 doesn't use the MPI C++ bindings, hence didn't link them, and the MPI libs that got inherited by the coupler hence didn't include the C++ MPI bindings -- but just including mpi.h in C++ will cause these undefined references when not linking to them. (The latter part I'm somewhat positive about, because I've seen that happen before)

A proposed fix is in #41, though, I don't know if that'll cause you troubles on your systems.

cwsmith commented 4 years ago

Ahhh, I think you nailed it. The Adios2 CMake config file pulls in a bunch of dependencies and that satisfies the couplers needs when the mpi compiler wrappers are not used.

The fix in the PR looks good. Thank you. For the record, I was a bit confused by the use of MPI::MPI_CXX in the link specification. The following comment in the cmake 3.13 docs explains it:

The module exposes the components C, CXX, MPICXX and Fortran. Each of these controls the various MPI languages to search for. The difference between CXX and MPICXX is that CXX refers to the MPI C API being usable from C++, whereas MPICXX refers to the MPI-2 C++ API that was removed again in MPI-3.