LLNL / SAMRAI

Structured Adaptive Mesh Refinement Application Infrastructure - a scalable C++ framework for block-structured AMR application development
https://computing.llnl.gov/projects/samrai

/usr/bin/ld: cannot find -lmpi #222

Closed PhilipDeegan closed 11 months ago

PhilipDeegan commented 1 year ago

Hi there

In our CI we have multiple ways of using SAMRAI.

  1. We build SAMRAI with our code via "add_subdirectory" in CMake
  2. We install SAMRAI system-wide in a Docker container

The second method is more efficient, since we don't rebuild SAMRAI on each PR. However, we have recently noticed that the link commands it produces have changed. Previously our link commands looked like this:

... /usr/local/lib/libSAMRAI_hier.a /usr/local/lib/libSAMRAI_tbox.a /usr/lib64/openmpi/lib/libmpi.so /usr/lib64/openmpi/lib/libhdf5.so ...

but now we are seeing the following:

... /usr/local/lib/libSAMRAI_hier.a /usr/local/lib/libSAMRAI_tbox.a /usr/lib64/openmpi/lib/libhdf5.so -lmpi ...

This fails because libmpi.so is not in a default library search directory.

I'm not positive that any recent change in SAMRAI caused this, but I do find it quite peculiar.

Any help would be appreciated. Until now we have relied on SAMRAI to sort out the MPI setup in CMake.

PhilipDeegan commented 1 year ago

Digging a bit further into it, we can see a difference in the installed file SAMRAITarget.cmake.

With the current version of SAMRAI we get:

set_target_properties(SAMRAI_tbox PROPERTIES
  INTERFACE_INCLUDE_DIRECTORIES "/usr/include/hdf5/mpich;${_IMPORT_PREFIX}/include"
  INTERFACE_LINK_LIBRARIES "/usr/lib/x86_64-linux-gnu/hdf5/mpich/libhdf5.so;/usr/lib/x86_64-linux-gnu/libsz.so;/usr/lib/x86_64-linux-gnu/libz.so;/usr/lib/x86_64-linux-gnu/libdl.so;/usr/lib/x86_64-linux-gnu/libm.so;mpi"
)

This shows mpi without a complete path.

In v4.1.2 we get:

set_target_properties(SAMRAI_tbox PROPERTIES
  INTERFACE_INCLUDE_DIRECTORIES "/usr/include/hdf5/mpich;/path/to/lib/mpich/include;${_IMPORT_PREFIX}/include"
  INTERFACE_LINK_LIBRARIES "/usr/lib/x86_64-linux-gnu/hdf5/mpich/libhdf5.so;/usr/lib/x86_64-linux-gnu/libsz.so;/usr/lib/x86_64-linux-gnu/libz.so;/usr/lib/x86_64-linux-gnu/libdl.so;/usr/lib/x86_64-linux-gnu/libm.so;/path/to/lib/mpich/lib/libmpi.so;/path/to/lib/mpich/lib/libmpicxx.so;/path/to/lib/mpich/lib/libmpifort.so"
  INTERFACE_LINK_OPTIONS "SHELL:-Wl,-rpath -Wl,/path/to/lib/mpich/lib -Wl,--enable-new-dtags"
  INTERFACE_SYSTEM_INCLUDE_DIRECTORIES "/path/to/lib/mpich/include"
)

This shows mpi with a complete path.

If I were to guess, I would say the cause was this commit updating BLT.

nselliott commented 1 year ago

One thing you could try is building SAMRAI develop or 4.2.0 with the older BLT, commit 2c192774b587c245ec2d7022b2e862395ffa8a21, which is BLT v0.3.0. We were years behind on BLT releases, but the older BLT should still work for x86_64/gnu builds, so you could cd into the blt submodule subdirectory in SAMRAI, check out that commit, and try to build. That could verify that this is due to a change in BLT, and you could then step through subsequent BLT releases to narrow down when the change occurred.

I also see -lmpi in our installations rather than the full libmpi.so path, but we're getting a -L flag on the link line that adds the MPI directory to the search path.

PhilipDeegan commented 1 year ago

Thanks for the reply. I'll try a bisect on BLT and see how it goes.

PhilipDeegan commented 1 year ago

Assuming I didn't make a mess of the bisect, it suggests the last working commit is https://github.com/LLNL/blt/commit/01d9dadd1469cf1f7e86967a04bac5e60930f611

and the first "bad" commit is https://github.com/LLNL/blt/commit/bd140fe74a92ab8d5f1fc5af2be0a790d5d49f91

humhang commented 1 year ago

I had the same problem. After digging into the details of BLT, I found that it is caused by a change made in BLT v0.4.0.

During BLT setup, blt/cmake/thirdparty/SetupMPI.cmake is called to set up the MPI-related configuration.

Before BLT v0.4.0, SetupMPI.cmake calls blt_register_library, which sets the variable BLT_MPI_LIBRARIES to the absolute path of the MPI runtime library. When SAMRAI calls blt_add_library to add its targets, BLT_MPI_LIBRARIES is read and its value is appended to the LINK_LIBRARIES and INTERFACE_LINK_LIBRARIES properties of the SAMRAI targets.

From BLT v0.4.0 onward, SetupMPI.cmake calls blt_import_library, which creates a CMake target named "mpi" without setting the variable BLT_MPI_LIBRARIES. When SAMRAI calls blt_add_library to add its targets, only the string "mpi" is appended to the LINK_LIBRARIES and INTERFACE_LINK_LIBRARIES properties of the SAMRAI targets.

My code links against SAMRAI, so the string "mpi" in INTERFACE_LINK_LIBRARIES of the SAMRAI targets is transitively added to LINK_LIBRARIES of my code. My code does not use BLT, so "mpi" is not the name of a CMake target there, and CMake treats it as a plain library name, i.e. "-lmpi".
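To see which entries of an installed SAMRAI target fall into this category, a downstream project could inspect the property directly. The following is only a minimal sketch: it assumes the installed package can be found with find_package(SAMRAI) and that the imported target is called SAMRAI_tbox, as in the SAMRAITarget.cmake excerpts earlier in this thread. The project name and the _samrai_libs variable are hypothetical, and entries that are generator expressions are not handled.

```cmake
cmake_minimum_required(VERSION 3.13)
project(samrai_link_check LANGUAGES CXX)  # hypothetical project name

# Assumption: the installed SAMRAI package is discoverable under this name.
find_package(SAMRAI REQUIRED)

# Read the interface link items that SAMRAI_tbox passes on to its consumers.
get_target_property(_samrai_libs SAMRAI_tbox INTERFACE_LINK_LIBRARIES)
if(NOT _samrai_libs)
  set(_samrai_libs "")  # property not set: nothing to inspect
endif()

foreach(_item IN LISTS _samrai_libs)
  if(TARGET "${_item}")
    # Known CMake target: its own usage requirements are applied when linking.
    message(STATUS "${_item}: CMake target")
  elseif(IS_ABSOLUTE "${_item}")
    # Full path: handed to the linker as-is.
    message(STATUS "${_item}: full path to a library file")
  else()
    # Neither a target nor a path: treated as a plain library name (e.g. -lmpi),
    # so the linker must find it in its own search directories.
    message(WARNING "${_item}: bare name, not a CMake target in this project")
  endif()
endforeach()
```

With the newer install shown above, one would expect the last branch to fire for "mpi" in a project that does not set up BLT.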

Adding include("SetupBLT.cmake") to the CMakeLists.txt of my code solves this problem, since the string "mpi" is then the name of a CMake target.

humhang commented 1 year ago

Umpire has fixed this issue for newer BLT by exporting the targets (openmp, mpi, etc.) created by BLT, so downstream code does not need to set up BLT. SAMRAI has not made this fix yet, so for now it may be best not to use a BLT newer than 0.4.

PhilipDeegan commented 11 months ago

We've added find_package(MPI REQUIRED) to our own CMake as a fallback, which seems to resolve things.
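For anyone landing here with the same link error, a downstream fallback along these lines might look like the sketch below. It is not the exact change made in our build. The shim target named "mpi" is an extra assumption on top of find_package(MPI REQUIRED): it gives the bare "mpi" entry in SAMRAI's INTERFACE_LINK_LIBRARIES something to resolve to. The project name, my_app, main.cpp and the SAMRAI_hier target name (inferred from libSAMRAI_hier.a) are likewise hypothetical.

```cmake
cmake_minimum_required(VERSION 3.13)
project(samrai_consumer LANGUAGES CXX)  # hypothetical project name

# Locate MPI ourselves instead of relying on SAMRAI's exported properties.
find_package(MPI REQUIRED COMPONENTS CXX)

# Assumption: provide a target called "mpi" so that the bare name in SAMRAI's
# INTERFACE_LINK_LIBRARIES resolves to a target rather than to "-lmpi".
if(NOT TARGET mpi)
  add_library(mpi INTERFACE IMPORTED GLOBAL)
  target_link_libraries(mpi INTERFACE MPI::MPI_CXX)
endif()

# Assumption: the installed SAMRAI package is discoverable under this name.
find_package(SAMRAI REQUIRED)

add_executable(my_app main.cpp)  # hypothetical application
target_link_libraries(my_app PRIVATE SAMRAI_hier SAMRAI_tbox)
```

The if(NOT TARGET mpi) guard keeps the shim from clashing with a project that does include SetupBLT.cmake, where BLT already defines a target with that name.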