3dem / relion

Image-processing software for cryo-electron microscopy
https://relion.readthedocs.io/en/latest/
GNU General Public License v2.0

MPI_C error in build process #825

Closed by prehensilecode 3 years ago

prehensilecode commented 3 years ago

Environment:

I am trying to build RELION 4.0 beta (see the attached CMakeLists.txt): https://github.com/3dem/relion/tree/ver4.0

MPI-related environment variables set:

MPIF90=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/bin/mpif90
MPICXX=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/bin/mpic++
MPICC=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/bin/mpicc
MPIFC=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0bin/mpifort
MPI_HOME=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0
MPI_INCDIR=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/include
MPIF77=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/bin/mpif77
MPI_LIBDIR=/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/lib
MPI_CPPFLAGS=-I/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/include

CMake raises an error for MPI_C, but not for MPI_CXX:

 Could NOT find MPI_C (missing: MPI_C_WORKS)
 CMake Error at /ifs/opt/cmake/3.21.3/share/cmake-3.21/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
   Could NOT find MPI (missing: MPI_C_FOUND) (found version "3.1")
 Call Stack (most recent call first):
   /ifs/opt/cmake/3.21.3/share/cmake-3.21/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
   /ifs/opt/cmake/3.21.3/share/cmake-3.21/Modules/FindMPI.cmake:1748 (find_package_handle_standard_args)
   CMakeLists.txt:213 (find_package)

However, it did successfully find and set MPI_C_COMPILER and MPI_C_COMPILER_INCLUDE_DIRS and other related variables:

 MPIEXEC_EXECUTABLE               /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/bin/mpiexec
 MPIEXEC_MAX_NUMPROCS             48
 MPIEXEC_NUMPROC_FLAG             -n
 MPIEXEC_POSTFLAGS
 MPIEXEC_PREFLAGS
 MPI_CXX_ADDITIONAL_INCLUDE_DIR
 MPI_CXX_COMPILER                 /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/bin/mpicxx
 MPI_CXX_COMPILER_INCLUDE_DIRS    /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/include
 MPI_CXX_COMPILE_DEFINITIONS
 MPI_CXX_COMPILE_OPTIONS          -fexceptions;-pthread
 MPI_CXX_HEADER_DIR               /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/include
 MPI_CXX_LIB_NAMES                mpi_cxx;mpi
 MPI_CXX_LINK_FLAGS               -Wl,-rpath -Wl,/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/lib -pthread
 MPI_CXX_SKIP_MPICXX              OFF
 MPI_C_ADDITIONAL_INCLUDE_DIRS
 MPI_C_COMPILER                   /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/bin/mpicc
 MPI_C_COMPILER_INCLUDE_DIRS      /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/include
 MPI_C_COMPILE_DEFINITIONS
 MPI_C_COMPILE_OPTIONS            -fopenmp
 MPI_C_HEADER_DIR                 /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/include
 MPI_C_LIB_NAMES                  mpi
 MPI_C_LINK_FLAGS                 -Wl,-rpath -Wl,/ifs/opt_cuda/openmpi/cuda11.2/4.1.0/lib
 MPI_mpi_LIBRARY                  /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/lib/libmpi.so
 MPI_mpi_cxx_LIBRARY              /ifs/opt_cuda/openmpi/cuda11.2/4.1.0/lib/libmpi_cxx.so

I am pretty certain the OpenMPI installation works: the test suite succeeded when I ran it.

What can I do to help CMake find MPI_C correctly?

Relion-4.0beta-CMakeLists.txt

prehensilecode commented 3 years ago

The fix seems to be to clear the value of MPI_C_COMPILE_OPTIONS (i.e. delete -fopenmp).
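One way to check where the -fopenmp flag might be coming from is to ask the Open MPI compiler wrapper what it adds by itself, and to look at what CMake has cached. This is only a sketch: `BUILD_DIR` is a hypothetical name for the build directory, and the thread does not establish whether the flag originates from the wrapper or from elsewhere.

```shell
#!/bin/sh
# Sketch only. MPICC falls back to whatever "mpicc" is on PATH.
MPICC=${MPICC:-mpicc}
BUILD_DIR=${BUILD_DIR:-build}   # hypothetical build directory name

# Open MPI wrappers print the flags they would add with --showme:compile
# and --showme:link. Other MPI implementations may reject these options.
if command -v "$MPICC" >/dev/null 2>&1; then
  "$MPICC" --showme:compile || echo "wrapper did not accept --showme:compile"
  "$MPICC" --showme:link    || echo "wrapper did not accept --showme:link"
else
  echo "no MPI C wrapper found at: $MPICC"
fi

# Show what CMake stored for the C MPI compile options, if a cache exists.
if [ -f "$BUILD_DIR/CMakeCache.txt" ]; then
  grep '^MPI_C_COMPILE_OPTIONS' "$BUILD_DIR/CMakeCache.txt" \
    || echo "MPI_C_COMPILE_OPTIONS is not in the cache"
else
  echo "no CMakeCache.txt in: $BUILD_DIR"
fi
```

If the wrapper output does not contain -fopenmp but the cache entry does, the flag was presumably introduced somewhere else in the configuration.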

biochem-fan commented 3 years ago

Just to make sure: in RELION 4.0, please specify compilers with CMake variables, not environment variables, as described in https://relion.readthedocs.io/en/release-4.0/Installation.html.

prehensilecode commented 3 years ago

> Just to make sure, in RELION 4.0, please use CMake variables, not environmental variables, as described in https://relion.readthedocs.io/en/release-4.0/Installation.html to specify compilers.

I used the ccmake TUI to set CMake variables.

As for the "-fopenmp" option: our Open MPI installation is built with OpenMP enabled, for hybrid MPI-OpenMP code. The Open MPI self-tests succeed. As a further check, our build of GROMACS 2021.3 (Intel icc, Open MPI 4.1.0 with OpenMP, CUDA 11.2) works: it runs successfully multi-node, multi-GPU, and multi-threaded.

biochem-fan commented 3 years ago

Running ccmake and then setting variables manually can lead to different results from running cmake -D.... Can you first make sure that cmake -D... also does not work? (Please delete your current build directory first.)
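A fresh configure with the MPI wrappers passed as -D options could look roughly like the sketch below. The wrapper paths are the ones quoted in this issue. `SRC_DIR`, `BUILD_DIR`, and the `RUN` switch are hypothetical names introduced for illustration, and further options (for example CMAKE_C_COMPILER) may be needed on a given system.

```shell
#!/bin/sh
# Sketch only: SRC_DIR, BUILD_DIR and RUN are illustrative names.
MPI_HOME=${MPI_HOME:-/ifs/opt_cuda/openmpi/cuda11.2/4.1.0}
SRC_DIR=${SRC_DIR:-$PWD}               # assumed: the RELION source checkout
BUILD_DIR=${BUILD_DIR:-$SRC_DIR/build} # assumed: build directory location

# Collect the -D options as positional parameters so they can be inspected.
set -- \
  "-DMPI_C_COMPILER=$MPI_HOME/bin/mpicc" \
  "-DMPI_CXX_COMPILER=$MPI_HOME/bin/mpicxx"

# Dry run: print the command that would be executed.
echo "cmake $* $SRC_DIR"

# Only delete and recreate the build directory when asked to (RUN=1).
if [ "${RUN:-0}" = 1 ]; then
  rm -rf "$BUILD_DIR"
  mkdir -p "$BUILD_DIR"
  cd "$BUILD_DIR" && cmake "$@" "$SRC_DIR"
fi
```

Starting from an empty build directory matters here, because an existing CMakeCache.txt would carry over values from the earlier configure.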

prehensilecode commented 3 years ago

> Running ccmake and then setting variables manually, and running cmake -D... can lead to different results. Can you first make sure cmake -D... also does not work? (Please delete your current build directory first)

This issue has been resolved by editing the CMake variable in ccmake, which is why I closed it.

Why would using -D... give a different result from defining the variable interactively, either in the GUI or TUI? The official documentation makes no such warning.

biochem-fan commented 3 years ago

> Why would using -D... give a different result from defining the variable interactively, either in the GUI or TUI? The official documentation makes no such warning.

The CMake script defines some variables depending on other variables. If you define a variable with -DXXX=YYY, the script sees that value from the beginning and sets the other variables accordingly. If you don't use -D, the script sees the default values first, and you then modify some of the results. You can set the variable XXX to YYY afterwards, but you never know which other variables also need to be changed to get the same effect as in the first case.

Please do this for #826.
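The difference described above can be reproduced with a toy project. The sketch below is not taken from RELION's CMakeLists.txt: `USE_FAST` and `DEMO_FLAGS` are made-up names, and the -S/-B options assume CMake 3.13 or newer. It relies on the fact that `set(... CACHE ...)` without `FORCE` leaves an existing cache entry untouched.

```shell
#!/bin/sh
# Toy project with hypothetical variable names, written to a temp directory.
demo=$(mktemp -d)
cat > "$demo/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.13)
project(CacheDemo NONE)
option(USE_FAST "hypothetical switch" OFF)
# DEMO_FLAGS is derived from USE_FAST, but only while it is not yet cached:
# set(... CACHE ...) without FORCE does not overwrite an existing entry.
if(USE_FAST)
  set(DEMO_FLAGS "-O3" CACHE STRING "derived from USE_FAST")
else()
  set(DEMO_FLAGS "-O0" CACHE STRING "derived from USE_FAST")
endif()
message(STATUS "USE_FAST=${USE_FAST} DEMO_FLAGS=${DEMO_FLAGS}")
EOF

if command -v cmake >/dev/null 2>&1; then
  # Case 1: the value is known from the very first configure.
  cmake -S "$demo" -B "$demo/build-d" -DUSE_FAST=ON | grep DEMO_FLAGS
  # Case 2: configure with defaults first, then change the variable later
  # (this mimics editing the value in ccmake after an initial configure).
  cmake -S "$demo" -B "$demo/build-late" | grep DEMO_FLAGS
  cmake -S "$demo" -B "$demo/build-late" -DUSE_FAST=ON | grep DEMO_FLAGS
else
  echo "cmake not found; demo project written to $demo"
fi
```

In the first case the derived variable follows the switch and becomes -O3. In the second case `USE_FAST` ends up ON, but `DEMO_FLAGS` keeps the -O0 value cached during the first configure, which is the kind of mismatch that a clean build directory plus -D avoids.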