trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/

CMake: Trying to understand linking to static TPLs (esp. MKL) #8926

Closed TBird2001 closed 2 years ago

TBird2001 commented 3 years ago

Question

@bartlettroscoe/CMake/TriBITS

I have a set of questions related to the Intel MKL. The first is (hopefully) simple. In the CMake TPLs, there is an explicit FindTPLMKL.cmake file. I've always specified the MKL libraries only to the BLAS and LAPACK searches. Maybe that is wrong? What does the MKL-specific library search enable in Trilinos?

Second, I have been trying to understand the flags associated with building static executables as described in the Build Reference (particularly as it applies to the MKL). To review, there are three flags:

-D BUILD_SHARED_LIBS=OFF \
-D TPL_FIND_SHARED_LIBS=OFF \
-D Trilinos_LINK_SEARCH_START_STATIC=ON

The first flag is to build static versions of the Trilinos libraries (and I think OFF is the default, anyway). The second, "tells CMake to locate static library versions of any required TPLs." This seems to reject shared libraries completely. The third, "tells the auto-detection routines… to locate static versions."

I was thinking that one could specify -D Trilinos_LINK_SEARCH_START_STATIC=ON, alone, and then the CMake would preferentially search for the static libraries, rather than the default, where it looks first for shared libraries. This does not seem to be the case, at least for the MKL. I had to specify all three flags to make the CMake find the static MKL libraries.

Related to that, if all of the TPLs are required to be static, I've encountered problems, because not all of the required libraries are available in a static form (e.g., pthreads, which is a system library). This is why the all-or-nothing approach is difficult, and I was hoping for the preferential approach.
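One way to see what a given combination of these flags actually selected is to inspect the CMake cache after configuring. The following is a minimal sketch, assuming the resolved libraries land in cache entries named TPL_<NAME>_LIBRARIES (as with TPL_BLAS_LIBRARIES later in this thread); the helper name classify_tpl_libs is made up for illustration.

```shell
# classify_tpl_libs: hypothetical helper. Reads CMakeCache.txt-style lines on
# stdin and prints, for each TPL_<NAME>_LIBRARIES entry, every list element
# tagged as static (.a), shared (.so*), or other (e.g. a flag such as -mkl).
classify_tpl_libs() {
  grep -E '^TPL_[A-Za-z0-9_]+_LIBRARIES(:[A-Z]+)?=' |
  while IFS= read -r line; do
    name=${line%%[:=]*}   # cache variable name, without the :TYPE suffix
    value=${line#*=}      # semicolon-delimited list of libraries/flags
    old_ifs=$IFS
    IFS=';'
    for lib in $value; do
      case $lib in
        *.a)         kind=static ;;
        *.so|*.so.*) kind=shared ;;
        *)           kind=other ;;
      esac
      printf '%s %s %s\n' "$name" "$kind" "$lib"
    done
    IFS=$old_ifs
  done
}
```

Running `classify_tpl_libs < CMakeCache.txt` in the build directory lists which TPL entries resolved to static archives and which (pthreads, for example) are still shared.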

jhux2 commented 3 years ago

@bartlettroscoe

jjellio commented 3 years ago

TBird2001,

Are you using Intel as the compiler? If so, I believe the best option for dealing with MKL is the compiler's -mkl flag.

I believe this set of flags was added in Intel compiler suite 2018 (or maybe 17). For LAPACK/BLAS use:

# just drop the flags, semicolon-delimited, into TPL_BLAS|LAPACK_LIBRARIES
# this should give you a full intel static linkage
TPL_BLAS_LIBRARIES="-mkl;-static-intel;-qopenmp-link=static"
# should give you dynamically linked openmp, because that is the default
TPL_BLAS_LIBRARIES="-mkl;-static-intel"

# this doesn't make much sense, MKL won't have intel OMP... if you still use OpenMP in your
# code, I would move `-static-intel;-qopenmp-link=static` to `CMAKE_EXE_LINKER_FLAGS`
TPL_BLAS_LIBRARIES="-mkl=sequential;-static-intel;-qopenmp-link=static"

-mkl can also take a value: -mkl=parallel is the default and links the OpenMP-parallel routines, while -mkl=sequential links the non-threaded variants. This only affects some of MKL's libraries. I recommend reading about the -qopenmp-link flag below as well.
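Because CMake list values are semicolon-delimited and an unquoted `;` ends a shell command, the flag list has to be quoted when it is passed on the command line. A small sketch of assembling it; the helper join_semicolons is hypothetical, and the flags are the Intel-only ones from the snippet above.

```shell
# join_semicolons: hypothetical helper that joins its arguments with ';',
# the separator CMake uses for list values. The body runs in a subshell so
# the IFS change does not leak into the caller.
join_semicolons() (
  IFS=';'
  printf '%s\n' "$*"
)

# Flags from the comment above (Intel compilers only).
blas_libs=$(join_semicolons -mkl -static-intel -qopenmp-link=static)

# Print the configure arguments for review. When passing them to cmake, keep
# the value quoted so the shell does not split the command at ';'.
printf '%s\n' "-D TPL_BLAS_LIBRARIES=$blas_libs" "-D TPL_LAPACK_LIBRARIES=$blas_libs"
```

Using the same value for both BLAS and LAPACK mirrors the '-mkl' usage shown elsewhere in this thread.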

From man icc

  -mkl[=lib] (M*X only)

              Tells the compiler to link to certain libraries in the Intel(R) Math Kernel Library (Intel(R) MKL). On Windows systems, you must specify this option at compile time.

              Arguments:

              lib               Indicates which Intel(R) MKL library files should be linked. Possible values are:

                                parallel       Tells  the  compiler  to link using the threaded libraries in the Intel(R) MKL. This is the default if the option is specified
                                               with no lib.

                                sequential     Tells the compiler to link using the sequential libraries in the Intel(R) MKL.

                                cluster        Tells the compiler to  link  using  the  cluster-specific  libraries  and  the  sequential  libraries  in  the  Intel(R)  MKL.
                                               Cluster-specific libraries are not available for macOS*.

              Default:

              OFF               The compiler does not link to the Intel(R) MKL.

              Description:

              This option tells the compiler to link to certain libraries in the Intel(R) Math Kernel Library (Intel(R) MKL).

              On Linux* and macOS* systems, dynamic linking is the default when you specify -mkl.  To link with Intel(R) MKL statically, you must specify:

              -mkl -static-intel

              On Windows* systems, static linking is the default when you specify /Qmkl. To link with Intel(R) MKL dynamically, you must specify:

              /Qmkl /MD

              For  more  information  about  using MKL libraries, see the article in Intel(R) Developer Zone titled: Intel(R) Math Kernel Library Link Line Advisor, which is
              located in https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor .

              NOTE: On Windows* systems, this option adds directives to the compiled code, which the linker then reads without further input from the driver. On  Linux*  and
              macOS* systems, the driver must add the library names explicitly to the link command.

              NOTE:  If you specify option [Q]mkl or [Q]mkl=parallel, and you also specify option [Q]tbb, the compiler links to the standard threaded version of the Intel(R)
              MKL.  However, if you specify [Q]mkl or [Q]mkl=parallel, and you also specify option [Q]tbb and option [q or  Q]openmp,  the  compiler  links  to  the  OpenMP*
              threaded version of the Intel(R) MKL.
       -qopenmp-link=library (M*X only)

              Controls whether the compiler links to static or dynamic OpenMP* run-time libraries.

              Arguments:

              library           Specifies the OpenMP library to use. Possible values are:

                                static         Tells the compiler to link to static OpenMP run-time libraries. Note that static OpenMP libraries are deprecated.

                                dynamic        Tells the compiler to link to dynamic OpenMP run-time libraries.

              Default:

              -qopenmp-link=dynamic
                                The compiler links to dynamic OpenMP* run-time libraries. However, if Linux* option -static is specified, the compiler links to static OpenMP
                                run-time libraries.

              Description:

              This option controls whether the compiler links to static or dynamic OpenMP* run-time libraries.

              To link to the static OpenMP run-time library (RTL) and create a purely static  executable,  you  must  specify  -qopenmp-link=static.   However,  we  strongly
              recommend you use the default setting, -qopenmp-link=dynamic.

              NOTE: Compiler options -static-intel and -shared-intel (Linux* and macOS*) have no effect on which OpenMP run-time library is linked.

              NOTE: On Linux* systems, -qopenmp-link=dynamic cannot be used in conjunction with option -static. If you try to specify both options together, an error will be
              displayed.

              NOTE: On Linux systems, the OpenMP runtime library depends on using libpthread and libc (libgcc when compiled with gcc). Libpthread and libc (libgcc) must both
              be static or both be dynamic. If both libpthread and libc (libgcc) are static, then the static version of the OpenMP runtime should be used. If both libpthread
              and libc (libgcc) are dynamic, then either the static or dynamic version of the OpenMP runtime may be used.

              Alternate Options:

              None

If you aren't using Intel: I've never used CMake's 'find' with MKL. I use Intel's Link Line Advisor.

Then put whatever it says into TPL_BLAS_LIBRARIES/TPL_LAPACK_LIBRARIES, and make sure to set the include directory ($MKLROOT/include) if it isn't finding mkl.h.
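For illustration, the advisor's output for a static, sequential, LP64 link with GNU compilers on Linux typically has the shape below. This is a sketch under assumptions: the library names and the lib/intel64 layout vary between MKL versions, and whether your TriBITS version passes the linker-group flags through unchanged should be checked. The helper name mkl_static_link_list is hypothetical.

```shell
# mkl_static_link_list: hypothetical helper. Prints a semicolon-delimited list
# suitable for TPL_BLAS_LIBRARIES / TPL_LAPACK_LIBRARIES from an MKL root.
# ASSUMPTION: library names and the lib/intel64 layout match a typical Linux
# install (LP64 interface, sequential threading); confirm with the advisor.
mkl_static_link_list() {
  root=$1
  libdir=$root/lib/intel64
  printf '%s;%s;%s;%s;%s;%s;%s;%s\n' \
    "-Wl,--start-group" \
    "$libdir/libmkl_intel_lp64.a" \
    "$libdir/libmkl_sequential.a" \
    "$libdir/libmkl_core.a" \
    "-Wl,--end-group" \
    "-lpthread" "-lm" "-ldl"
}
```

The result can be passed as `-D TPL_BLAS_LIBRARIES="$(mkl_static_link_list "$MKLROOT")"`. Note that only the MKL archives are static here; pthread, libm, and libdl are still resolved however the linker normally resolves them.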

That's just my experience on various platforms around SNL.

bartlettroscoe commented 3 years ago

@TBird2001,

In the CMake TPLs, there is an explicit FindTPLMKL.cmake file

I have never used the "MKL" TPL. I did not add that TriBITS TPL file and I am not sure how it is related to the BLAS and LAPACK TriBITS TPLs. Looking at the git log for that file, it was added by @mhoemmen. Reading the commit log for that file, it seems like it was designed to pull in all of MKL, not just BLAS and LAPACK.

What does the MKL-specific library search enable in Trilinos?

You can find that by doing:

$ cd Trilinos/

$ find . -name "Dependencies.cmake" -exec grep -nH MKL {} \;
./kokkos-kernels/cmake/Dependencies.cmake:3:  LIB_OPTIONAL_TPLS quadmath MKL BLAS LAPACK CUSPARSE MAGMA SUPERLU CHOLMOD LAPACKE CBLAS ARMPL
./packages/amesos/cmake/Dependencies.cmake:6:SET(LIB_OPTIONAL_DEP_TPLS SuperLUDist ParMETIS UMFPACK SuperLU BLACS SCALAPACK MUMPS TAUCS PARDISO_MKL PARDISO CSparse)
./packages/amesos2/cmake/Dependencies.cmake:11:SET(LIB_OPTIONAL_DEP_TPLS MPI SuperLU SuperLUMT SuperLUDist UMFPACK PARDISO_MKL ParMETIS METIS Cholmod MUMPS STRUMPACK CUSPARSE CUSOLVER)
./packages/muelu/cmake/Dependencies.cmake:9:SET(LIB_OPTIONAL_DEP_TPLS Boost MATLAB AmgX CGAL ViennaCL MKL Avatar CUSPARSE MAGMASparse mlpack)
./packages/kokkos-kernels/cmake/Dependencies.cmake:3:  LIB_OPTIONAL_TPLS quadmath MKL BLAS LAPACK CUSPARSE MAGMA SUPERLU CHOLMOD LAPACKE CBLAS
./packages/shylu/shylu_node/basker/cmake/Dependencies.cmake:6:SET(LIB_OPTIONAL_DEP_TPLS HWLOC MKL VTune)
./packages/shylu/shylu_node/basker/cmake/Dependencies.cmake:8:SET(TEST_OPTIONAL_DEP_TPLS HWLOC MKL)
./packages/shylu/shylu_node/tacho/cmake/Dependencies.cmake:6:SET(LIB_OPTIONAL_DEP_TPLS METIS Scotch Cholmod HWLOC HYPRE MKL LAPACK BLAS Pthread QTHREAD VTune CUSOLVER CUSPARSE CUBLAS CUDA)
./packages/shylu/shylu_node/tacho/cmake/Dependencies.cmake:8:SET(TEST_OPTIONAL_DEP_TPLS METIS HWLOC Cholmod MKL LAPACK BLAS Pthread QTHREAD CUSOLVER CUSPARSE CUBLAS CUDA)
./packages/shylu/shylu_node/hts/cmake/Dependencies.cmake:6:SET(LIB_OPTIONAL_DEP_TPLS MKL BLAS)
./packages/shylu/shylu_dd/bddc/cmake/Dependencies.cmake:6:SET(LIB_OPTIONAL_DEP_TPLS MKL PARDISO_MKL SuperLU METIS ParMETIS)
./packages/stokhos/cmake/Dependencies.cmake:6:SET(LIB_OPTIONAL_DEP_TPLS ForUQTK CUDA Thrust Cusp CUSPARSE Clp GLPK qpOASES Boost MATLABLib MKL)

$ find packages/kokkos-kernels/  -name "CMakeLists.txt" -exec grep -nH MKL {} \;
packages/kokkos-kernels/src/CMakeLists.txt:46:IF (KOKKOSKERNELS_ENABLE_TPL_BLAS OR KOKKOSKERNELS_ENABLE_TPL_MKL)
packages/kokkos-kernels/src/CMakeLists.txt:436:KOKKOSKERNELS_LINK_TPL(kokkoskernels PUBLIC MKL)

$ find packages/shylu/  -name "CMakeLists.txt" -exec grep -nH MKL {} \;
packages/shylu/shylu_node/tacho/example/CMakeLists.txt:174:  # Intel MKL Pardiso and PerfTest
packages/shylu/shylu_node/tacho/example/CMakeLists.txt:177:    IF(TPL_ENABLE_MKL)
packages/shylu/shylu_node/tacho/src/CMakeLists.txt:9:SET(TACHO_HAVE_MKL ${TPL_ENABLE_MKL})
packages/shylu/shylu_node/hts/src/CMakeLists.txt:8:  "Enable HTS template specializations for complex data types; only needed if using BLAS or MKL."

Given that these packages do sparse linear algebra, my guess is that they use the sparse linear algebra routines in MKL.

So don't bother with the "MKL" TPL, unless you need to enable those packages with sparse MKL support. Stick with the BLAS and LAPACK TPLs. For those, you need to specify the MKL libraries manually (don't know of a better way).

Note that you can use Intel MKL for BLAS and LAPACK with any compatible compiler, including GCC and Clang. You can look at the ATDM Trilinos configurations and see how they use MKL. It looks like they are linking Trilinos against shared MKL libraries, even for static builds of Trilinos. For the Intel builds on 'cts1', they are just using:

  -D TPL_BLAS_LIBRARIES='-mkl' \
  -D TPL_LAPACK_LIBRARIES='-mkl' \

Simple as pie. That leaves it up to the Intel compiler to decide whether to link against shared or static MKL libs.
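To find out which way the compiler went for a given executable, the runtime dependencies can be inspected with ldd. A minimal sketch; the helper name mkl_linkage is hypothetical, and it assumes MKL's shared libraries all have names starting with libmkl_.

```shell
# mkl_linkage: hypothetical helper. Reads `ldd` output on stdin and reports
# whether any MKL shared library shows up as a run-time dependency.
mkl_linkage() {
  if grep -q 'libmkl_'; then
    echo "shared MKL libraries are loaded at run time"
  else
    echo "no shared MKL dependency (MKL is either static or unused)"
  fi
}
```

Usage would look like `ldd ./some_trilinos_test.exe | mkl_linkage`, where the executable name is a placeholder.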

Otherwise, I don't know of any recent examples of linking against static MKL libraries. @jjellio likely has the best advice above. If that does not resolve your issue, then we can dig deeper into this and figure it out (including whether MKL truly supports static libraries).

NOTE: I removed the TriBITS label from this Issue because this really has nothing to do with TriBITS itself. This is a Trilinos issue since Trilinos decides how it wants to use TPLs.

Related to my epic SEPW-213

TBird2001 commented 3 years ago

I just wanted to make a quick comment that your comments have been very helpful. We're still understanding the best approach for this, so I will leave this issue open a little longer in case we have follow-up questions.

(I'll also remove TriBITS from the title to keep things clear.)

jjellio commented 3 years ago

@TBird2001 - It's worth adding - TPL_ENABLE_MKL tells Trilinos packages that MKL is available. You can use MKL for BLAS and LAPACK without enabling MKL as a TPL in Trilinos. When you enable MKL as a TPL (whether it is your BLAS or not), things like MueLu or Tpetra can gain the ability to use MKL beyond the basic BLAS/LAPACK APIs - as in, MueLu/Tpetra/Kokkos Kernels may offer to do operations using MKL's sparse API (sparse mat-mat multiplication or sparse mat-vec).

That's a round-about answer, but if you are wanting MKL for things beyond lapack/blas - that stuff is used in some packages.

From a pure linking point of view, I'd use the -mkl flag w/Intel, or Intel's Link Line Advisor for GNU.
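Put together, a configure line that uses MKL for BLAS/LAPACK and additionally enables the separate "MKL" TPL might be assembled as sketched below. The MKL_INCLUDE_DIRS and MKL_LIBRARY_DIRS names are an assumption based on the usual TriBITS <TPL>_INCLUDE_DIRS / <TPL>_LIBRARY_DIRS convention, and the paths are placeholders.

```shell
# Configure arguments for MKL as BLAS/LAPACK plus the separate "MKL" TPL.
# ASSUMPTION: MKL_INCLUDE_DIRS / MKL_LIBRARY_DIRS follow the usual TriBITS
# naming convention; the default path below is only a placeholder.
mkl_root=${MKLROOT:-/opt/intel/mkl}

set -- \
  -D TPL_BLAS_LIBRARIES=-mkl \
  -D TPL_LAPACK_LIBRARIES=-mkl \
  -D TPL_ENABLE_MKL=ON \
  -D MKL_INCLUDE_DIRS="$mkl_root/include" \
  -D MKL_LIBRARY_DIRS="$mkl_root/lib/intel64"

# Print one argument per line for review before passing "$@" to cmake.
printf '%s\n' "$@"
```

Leaving out the last three options gives the BLAS/LAPACK-only setup, which is all that is needed when no package has to call MKL's sparse API.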

TBird2001 commented 3 years ago

@jjellio, that's actually very helpful in helping my understanding. Thanks!

mhoemmen commented 3 years ago

FYI, I don't actually work on Trilinos or work for Sandia any more, but I'll be happy to comment : - ) . I added the "MKL TPL" specifically for MKL's C interface to sparse linear algebra operations. It's separate from the BLAS, though I imagine you would just use the MKL for the BLAS library if you had it.

bartlettroscoe commented 3 years ago

It's separate from the BLAS, though I imagine you would just use the MKL for the BLAS library if you had it.

It seems like enabling the MKL TPL should automatically enable the BLAS and LAPACK TPLs (and everything that can use those) but that type of relationship between TPLs is not something that TriBITS supports. Making that work would take some effort and add more complexity to TriBITS.

jjellio commented 3 years ago

Ross, I believe if you enabled MKL and set BLAS to something else, you would have a mess on your hands. Packages that enable MKL (MueLu) would then get MKL added as a TPL library, but they also use BLAS, so that would get added too.

Which one gives you BLAS or LAPACK would depend on the order of the link line. And then for packages that don't depend on MKL (say Teuchos Numerics), their executables may be linked with only the BLAS/LAPACK libs on the link line. There could also be shenanigans if one lib mangled symbols one way and the other differently (at least that would cause an error - otherwise, this would be a silent error where the user doesn't realize BLAS/LAPACK isn't coming from the source they expect!)

tldr: sounds like a headache. (added this post mostly so if it gets searched in the future maybe it will be helpful)
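A quick way to spot this situation is to look at the order in which BLAS-like libraries appear on an executable's link line (for example, from a verbose build log). A heuristic sketch; the helper name blas_providers is hypothetical, and the name patterns are an assumption that will need extending for other BLAS implementations.

```shell
# blas_providers: hypothetical helper. Reads a link line on stdin and prints,
# in order of appearance, each token that looks like a BLAS/LAPACK provider.
# ASSUMPTION: matching on mkl/blas/lapack in the token is only a heuristic.
blas_providers() {
  tr ' \t' '\n\n' | grep -E -i 'mkl|blas|lapack'
}
```

If the output shows both MKL libraries and another BLAS, roughly speaking the earlier entry is the one more likely to satisfy the BLAS/LAPACK symbols, which is the order dependence described above.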

@mhoemmen - You are a ghost in the shell. A digital haunt.... dare we call you a virus? Has your process gone Zombie? I've seen how this plays out in the Matrix. Give this spirit write access to the repo and let god's judgment befall us.

bartlettroscoe commented 3 years ago

Ross, I believe if you enabled MKL and set BLAS to something else, you would have a mess on your hands. Packages that enable MKL (MueLu) would then get MKL added as a TPL library, but they also use BLAS, so that would get added too.

Which one gives you BLAS or LAPACK would depend on the order of the link line. And then for packages that don't depend on MKL (say Teuchos Numerics), their executables may be linked with only the BLAS/LAPACK libs on the link line. There could also be shenanigans if one lib mangled symbols one way and the other differently (at least that would cause an error - otherwise, this would be a silent error where the user doesn't realize BLAS/LAPACK isn't coming from the source they expect!)

@jjellio, yes, those are the problems. But it seems harmless to put the same MKL libraries on the link line multiple times, once for BLAS, once for LAPACK, and once for MKL.

But who actually uses the TriBITS MKL TPL with Trilinos? It is not enabled in any of the Trilinos PR builds or the ATDM Trilinos builds, which means there is likely no automated testing set up to support it. (Only 15 of the 116 total TPLs are enabled in any one of the Trilinos PR builds.) That means it likely does not work with Trilinos (i.e., what is not tested on a regular basis is likely broken).

mhoemmen commented 3 years ago

@jjellio ; - P

jjellio commented 3 years ago

Ross - MueLu (and maybe Tacho) can use it. I think in the context of MueLu it's in their example/drivers where they can compare X implementation to Y. (i.e., compare MKL to Kokkos or stuff like that). The codes were pretty crufty.

I generally see TPL_ENABLE_FOO to mean it's safe to include a header that FOO can provide and that you'll get whatever symbols when you link. I know that's pretty basic - some packages clearly promise more - like NetCDF or ParMETIS where the find_package code annotates attributes those packages have.

That said, I think most foos in TPL_ENABLE_FOO are kinda the wild west. We really test the blessed set of BLAS/LAPACK, MPI, NetCDF, PNetCDF, HDF5, CGNS, ParMETIS, METIS, and Boost.

It seems TriBITS/CMake could gain some robustness if they provided something like CMAKE_ENABLE_Foo, where that means, at a minimum, CMake will call find_package(Foo) (or whatever the proper name is for it), scan includes/libs for named packages, and set CMAKE_HAVE_Foo (or maybe ${PROJECT}_HAVE_Foo). If that's all we are really doing for some of these TPLs, then we could gain some simplification by having CMake expose a general way to let an arbitrary 'Project' arbitrarily find things and have them auto-named ${PROJECT}_blah_blah. A lot of the TPL-adding code would go away and CMake would just do it.

This would also make it clearer to a user - google CMAKE_ENABLE_MKL and you discover it's just finding things - and CMake adopts TriBITS add-ons like Foo_LIBRARY_DIRS. There's no reason for TriBITS to provide something so basic if projects/people really just want CMake to find things.

bartlettroscoe commented 3 years ago

A lot of the TPL-adding code would go away and CMake would just do it.

@jjellio, things are more complicated than this in the general case. I will see if I have time to discuss the TPL inter-dependency issues with Kitware this FY as part of the TriBITS Refactoring work. For more discussion of this, see https://github.com/TriBITSPub/TriBITS/issues/340#issuecomment-728157279 and the comments above that.

github-actions[bot] commented 2 years ago

This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity. If you would like to keep this issue open please add a comment and/or remove the MARKED_FOR_CLOSURE label. If this issue should be kept open even with no activity beyond the time limits you can add the label DO_NOT_AUTOCLOSE. If it is ok for this issue to be closed, feel free to go ahead and close it. Please do not add any comments or change any labels or otherwise touch this issue unless your intention is to reset the inactivity counter for an additional year.

github-actions[bot] commented 2 years ago

This issue was closed due to inactivity for 395 days.