Reference-ScaLAPACK / scalapack

ScaLAPACK development repository
Other
127 stars 55 forks source link

MPI not linked during build? #66

Closed drhpc closed 2 years ago

drhpc commented 2 years ago

This issue covers apparently very basic build trouble on the path to package ScaLAPACK for pkgsrc, so I hope an issue is warranted. I do have changes to suggest, too, convering handling of dependencies (.pc file), but first the current version should at least build …

I might be too obvious, but I have trouble building scalapack 2.2.0. CMake lokates my compiler/MPI setup just fine:

$ mkdir b
$ cd b
$ cmake -DMPI_BASE_DIR=$(dirname $(dirname $(which mpicxx))) -DBUILD_SHARED_LIBS=on ..
[…]
-- Found MPI_C: /stuff/sw/env/gcc-10.3.0/pkgsrc/cvs-20210222/lib/libmpi.so (found version "3.0") 
-- Found MPI_Fortran: /stuff/sw/env/gcc-10.3.0/pkgsrc/cvs-20210222/lib/libmpi_usempif08.so (found version "3.0") 
-- Found MPI: TRUE (found version "3.0")  
-- Found MPI_LIBRARY : TRUE 
-- --> MPI C Compiler : /stuff/sw/env/gcc-10.3.0/pkgsrc/cvs-20210222/bin/mpicc
-- --> C Compiler : /stuff/sw/env/gcc-10.3.0/pkgsrc/cvs-20210222/bin/mpicc
-- --> MPI Fortran Compiler : /stuff/sw/env/gcc-10.3.0/pkgsrc/cvs-20210222/bin/mpif90
-- --> Fortran Compiler : /usr/bin/gfortran

But it doesn't even try to link to MPI, apparently:

[ 69%] Building Fortran object TESTING/LIN/CMakeFiles/xslu.dir/pslafchk.f.o
[ 69%] Linking Fortran executable ../xslu
/usr/bin/ld: ../../lib/libscalapack.so: undefined reference to `ompi_mpi_op_sum'
/usr/bin/ld: ../../lib/libscalapack.so: undefined reference to `MPI_Bcast'
/usr/bin/ld: ../../lib/libscalapack.so: undefined reference to `MPI_Op_create'
/usr/bin/ld: ../../lib/libscalapack.so: undefined reference to `MPI_Pack_size'
/usr/bin/ld: ../../lib/libscalapack.so: undefined reference to `MPI_Send'
/usr/bin/ld: ../../lib/libscalapack.so: undefined reference to `ompi_mpi_packed'

and, obviously

$ ldd lib/libscalapack.so 
        linux-vdso.so.1 (0x00007ffe647b2000)
        libopenblas.so.0 => /stuff/sw/env/gcc-10.3.0/pkgsrc/cvs-20210222/lib/libopenblas.so.0 (0x00007f4f73090000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00007f4f73057000)
        libgfortran.so.5 => /usr/lib/libgfortran.so.5 (0x00007f4f72ea0000)
        libm.so.6 => /lib/libm.so.6 (0x00007f4f72ddc000)
        libdl.so.2 => /lib/libdl.so.2 (0x00007f4f72dd7000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f4f72dbf000)
        libquadmath.so.0 => /usr/lib/libquadmath.so.0 (0x00007f4f72d81000)
        libc.so.6 => /lib/libc.so.6 (0x00007f4f72bf6000)
        /lib/ld-linux-x86-64.so.2 (0x00007f4f75245000)

Shouldn't there be a reference to libmpi?

I also don't see e.g. MPI_LIBRARIES, MPI_LINK_FLAGS, or anything like that used in CMakeLists.txt. I guess it is supposed to make more use of the MPI compiler wrappers? How is this supposed to work? (Note: Don't be fooled by the cvs-20210222 name, it is a current checkout of the pkgsrc tree, which I am trying to add scalapack to.)

When I tried to hack in use of MPI_LIBRARIES, the main MPI symbols were resolved, but I still had some missing ones at some stage. So, before I dive too deep, I hoped to get some clarification that the build should normally just work or that I'm doing something wrong. I did have a working draft installation of scalapack-2.1.0 before. I might have to revisit why that worked.

Rickbude commented 2 years ago

I think I have hit the same (or at least a very similar) issue, but it manifests slightly differently. make scalapack succeeds in my case, but ld reveals there are undefined symbols, just like you:

#build (clean build directory, empty cache)
cmake .. -DBUILD_SHARED_LIBS=1
make -j24
ld lib/libscalapack.so

Outputs:

ld: warning: cannot find entry symbol _start; not setting start address
ld: lib/libscalapack.so: undefined reference to `ompi_mpi_op_sum'
ld: lib/libscalapack.so: undefined reference to `MPI_Bcast'
ld: lib/libscalapack.so: undefined reference to `MPI_Op_create'
ld: lib/libscalapack.so: undefined reference to `MPI_Pack_size'
ld: lib/libscalapack.so: undefined reference to `MPI_Send'
ld: lib/libscalapack.so: undefined reference to `ompi_mpi_packed'
ld: lib/libscalapack.so: undefined reference to `MPI_Isend'
....

This seems to easily be fixed by linking scalapack with MPI::MPI_C (or with ${MPI_C_LIBRARIES} for CMake < 3.9):

#At CMakeLists.txt line 247, change
target_link_libraries( scalapack ${LAPACK_LIBRARIES} ${BLAS_LIBRARIES})
#to
target_link_libraries( scalapack ${LAPACK_LIBRARIES} ${BLAS_LIBRARIES} ${MPI_C_LIBRARIES})

For me this seems to work, but I don't if this would have any negative side effects, and have not tested this patch on Windows. So I am a bit hesitant to make a PR for this.


Sidenote: I hit this problem when trying to link with ScaLAPACK (imported with FetchContent_Declare). If you have a similar usecase, it is also possible to just add a patch in your own code and circumvent the problem that way. Here is a minimal example to demonstrate that:

#CMakeLists.txt
cmake_minimum_required(VERSION 3.14)
project(ScaLAPACK_TEST LANGUAGES C CXX Fortran)

#Enable fetching packages from git
include(FetchContent)

set(BUILD_SHARED_LIBS TRUE)

#Download and build ScaLAPACK 2.2.0 , commit from 7 July 2022
message(STATUS "Dowloading and building ScaLAPACK from source...")
FetchContent_Declare(referenceScaLAPACK
    GIT_REPOSITORY https://github.com/Reference-ScaLAPACK/scalapack.git
    GIT_TAG        637a6e5f2953263d4f05574c8d6037356a81b9ea
)
FetchContent_MakeAvailable(referenceScaLAPACK)

#Enable the following two lines to resolve build issues
#find_package(MPI)
#target_link_libraries(scalapack MPI::MPI_C)

add_executable(main main.cpp)
target_link_libraries(main scalapack)
//main.cpp
int main(){
    return 0;
}

Without target_link_libraries(scalapack MPI::MPI_C) make main fails, and after enabling target_link_libraries(scalapack MPI::MPI_C) make main succeeds.

weslleyspereira commented 2 years ago

Related to #47 and #56. I also don't like the way MPI is linked to ScaLAPACK. Would you be able to propose a PR addressing this issue? Thanks!

Rickbude commented 2 years ago

Created a PR that should fix this issue, and addresses some minor things that stood out to me in the CMakeLists. Please check carefully on all target platforms whether this doesn't cause any problems before merging.

weslleyspereira commented 2 years ago

@drhpc, would you have time to test if #67 solves your issue?

drhpc commented 2 years ago

@drhpc, would you have time to test if #67 solves your issue?

Getting back to that … and see that I need a moment for testing this. Ideally, I'd have a patch against 2.2.0 to integrate in the pkgsrc build, the PR diff doesn't apply cleanly. I can sort that out, but tomorrow.

drhpc commented 2 years ago

So, yes, #67 does solve the build issue, thanks. Somewhat related, we might want to discuss the pkgconfig file. I add this patch:

--- scalapack.pc.in.orig        2021-10-12 19:09:12.000000000 +0000
+++ scalapack.pc.in
@@ -5,5 +5,5 @@ Name: scalapack
 Description: SCALAPACK reference implementation
 Version: @SCALAPACK_VERSION@
 URL: http://www.netlib.org/scalapack/
 Libs: -L${libdir} -lscalapack
-Requires: mpi lapack blas
+Requires: @BLA_PKGCONFIG_LAPACK@ @BLA_PKGCONFIG_BLAS@
+Libs.private: @MPI_LIBRARIES@

This requires a CMake that does use BLA_PKGCONFIG_LAPACK and BLA_PKGCONFIG_BLAS, something I only recently added upstream and am patching locally. It allows FindBLAS etal. to use a hint to use a certain pkgconfig module. They are not just named blas, but something like openblas_openmp here.

Similarily, there is no mpi.pc installed by openmpi here, so I am resorting to the actual libraries. I am not sure if it's possible to write a generic pc template that works for everyone … also I might have to revisit Libs vs Libs.private for the MPI part. For the shared library, the MPI linkage is implied. You'll have fun if the user program is linked with a different one, anyway.

But maybe one can add some smartness in CMakeLists to determine what to fill into the .pc template. Using the plain library names as fallback, but using the pkg-config names if pkg-config has been used to locate BLAS and LAPACK. A small cosmetic issue is that in the usual case of an optimized library, you'll have the same pkgconfig module appearing twice, as BLAS and LAPACK is contained in the same library.

The initial issue I reported is fixed. Just wanted to drop those lines … maybe we create a new issue/PR for that?

weslleyspereira commented 2 years ago

I think a new issue would be ideal, thanks!

drhpc commented 2 years ago

I did that as #68 … so … we can close this here, again, yes?

weslleyspereira commented 2 years ago

Yes, closing... I reopened it basically because you reopened it. I thought you wanted it open while you created the other ones. Thanks