eth-cscs / conflux

Distributed Communication-Optimal LU-factorization Algorithm
BSD 3-Clause "New" or "Revised" License
CMAKE Build Failure #29

Open huttered40 opened 2 years ago

huttered40 commented 2 years ago

Hi, I am trying to build this library on Stampede2, which is essentially a large-scale cluster of Intel KNL nodes. I followed the provided build steps and ran into the following errors:

login2.stampede2(1127)$ cmake -DCONFLUX_BLAS=MKL -DCONFLUX_SCALAPACK=MKL ..
-- Setting build type to 'Release' as none was specified.
-- Selected BLAS backend for CONFLUX: MKL
-- Selected SCALAPACK backend for CONFLUX: MKL
-- The CXX compiler identification is Intel 18.0.2.20180210
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/apps/intel18/impi/18.0.2/bin/mpicxx - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found MPI_CXX: /opt/apps/intel18/impi/18.0.2/bin/mpicxx (found version "3.1") 
-- Found MPI: TRUE (found version "3.1") found components: CXX 
-- Found OpenMP_CXX: -qopenmp (found version "5.0") 
-- Found OpenMP: TRUE (found version "5.0")  
-- cxxopts version 2.2.0
-- Looking for C++ include pthread.h
-- Looking for C++ include pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found OpenMP_CXX: -qopenmp (found version "5.0") 
-- Found OpenMP: TRUE (found version "5.0") found components: CXX 
-- Found MKL: /opt/intel/compilers_and_libraries_2018.2.199/linux/mkl/include  
-- Found OpenMP_CXX: -qopenmp (found version "5.0") 
CMake Error at libs/costa/CMakeLists.txt:10 (include):
  include could not find requested file:

    cmake/build_type.cmake

CMake Error at libs/costa/CMakeLists.txt:11 (include):
  include could not find requested file:

    cmake/adjust_mpiexec_flags.cmake

-- Selected ScaLAPACK backend for COSTA: MKL
-- Found OpenMP_CXX: -qopenmp (found version "5.0") 
-- Found OpenMP_CXX: -qopenmp (found version "5.0") 
CMake Error at libs/costa/CMakeLists.txt:106 (add_subdirectory):
  add_subdirectory given source "src/costa" which is not an existing
  directory.

CMake Error: File /work2/05608/tg849075/conflux/libs/costa/cmake/costa.pc.in does not exist.
CMake Error at libs/costa/CMakeLists.txt:127 (configure_file):
  configure_file Problem configuring file

CMake Error: File /work2/05608/tg849075/conflux/libs/costa/cmake/costaConfig.cmake.in does not exist.
CMake Error at libs/costa/CMakeLists.txt:131 (configure_file):
  configure_file Problem configuring file

-- Configuring incomplete, errors occurred!
See also "/work2/05608/tg849075/conflux/build/CMakeFiles/CMakeOutput.log".

My cmake version is:

/opt/apps/cmake/3.20.2/bin/cmake

Please advise how to fix this error. Do I need to build COSTA separately? If so, what additional steps are necessary after doing so? Note that I tried building COSTA, but ran into a different error: "invalid entity for this variable list in omp clause".

simonpintarelli commented 2 years ago

Hi, looks like the repo wasn't cloned using --recursive. According to the readme the following should work:

git clone --recursive https://github.com/eth-cscs/conflux && cd conflux

huttered40 commented 2 years ago

Thanks! I have another issue (see below), but as it's related to COSTA, I'll create an issue in that repository:

login2.stampede2(1157)$ make
[  2%] Building CXX object libs/costa/src/costa/CMakeFiles/costa.dir/grid2grid/block.cpp.o
[  4%] Building CXX object libs/costa/src/costa/CMakeFiles/costa.dir/grid2grid/grid2D.cpp.o
[  6%] Building CXX object libs/costa/src/costa/CMakeFiles/costa.dir/grid2grid/interval.cpp.o
[  9%] Building CXX object libs/costa/src/costa/CMakeFiles/costa.dir/grid2grid/scalapack_layout.cpp.o
[ 11%] Building CXX object libs/costa/src/costa/CMakeFiles/costa.dir/grid2grid/communication_data.cpp.o
/work2/05608/tg849075/conflux/libs/costa/src/costa/grid2grid/communication_data.cpp(167): error: invalid entity for this variable list in omp clause
  #pragma omp parallel for shared(mpi_messages, workspace, offset_per_message, buffer)
                                  ^

/work2/05608/tg849075/conflux/libs/costa/src/costa/grid2grid/communication_data.cpp(195): error: invalid entity for this variable list in omp clause
  #pragma omp parallel for shared(idx, package_ticks, mpi_messages, offset_per_message, buffer, workspace)
                                       ^

compilation aborted for /work2/05608/tg849075/conflux/libs/costa/src/costa/grid2grid/communication_data.cpp (code 2)

simonpintarelli commented 2 years ago

Hm, this looks like an issue with the Intel compiler. Can you try commenting out the two shared clauses? I.e.

#pragma omp parallel for // shared (..)
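
Dropping the clause should not change behavior here: in a parallel region without a default clause, variables declared outside the region are shared anyway, and the loop variable is private. A minimal sketch of that rule follows. The function fill_squares and its buffer are hypothetical stand-ins, not code from COSTA.

```cpp
#include <vector>

// Hypothetical stand-in for the loops in communication_data.cpp:
// each iteration writes to its own slot of a buffer declared outside the loop.
std::vector<int> fill_squares(int n) {
    std::vector<int> buffer(n);

    // No shared(...) clause. Under the default data-sharing rules, `buffer`
    // and `n` are shared among the threads, and the loop variable `i` is
    // private to each thread. If OpenMP is disabled, the pragma is ignored
    // and the loop simply runs serially.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        buffer[i] = i * i;  // distinct element per iteration, so no data race
    }
    return buffer;
}
```

Calling fill_squares(4) should yield the same vector with or without OpenMP enabled, which is a quick way to check that removing the clause did not alter the result.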
huttered40 commented 2 years ago

Ok, yeah, that works for now. I had one other issue (see below), but fixed it myself by adding <int> to the definition of the matrix_view class template. The library now builds for me, and the cholesky_miniapp works as well. Thanks!

[ 81%] Building CXX object tests/unit/CMakeFiles/unit.dir/test_utils.cpp.o
/work2/05608/tg849075/conflux/tests/unit/test_utils.cpp(53): error: too few arguments for class template "conflux::matrix_view"
      conflux::matrix_view<> in_mat(in.data(), n, n, layout);
                           ^

/work2/05608/tg849075/conflux/tests/unit/test_utils.cpp(53): error: expected a declaration
      conflux::matrix_view<> in_mat(in.data(), n, n, layout);
      ^

/work2/05608/tg849075/conflux/tests/unit/test_utils.cpp(54): error: too few arguments for class template "conflux::matrix_view"
      conflux::matrix_view<> result_mat(result.data(), n, n, layout);
                           ^

/work2/05608/tg849075/conflux/tests/unit/test_utils.cpp(54): error: expected a declaration
      conflux::matrix_view<> result_mat(result.data(), n, n, layout);
      ^

/work2/05608/tg849075/conflux/tests/unit/test_utils.cpp(57): error: identifier "in_mat" is undefined
      std::cout << in_mat.to_string() << std::endl;
                   ^

/work2/05608/tg849075/conflux/tests/unit/test_utils.cpp(79): error: identifier "result_mat" is undefined
      std::cout << result_mat.to_string() << std::endl;
                   ^

compilation aborted for /work2/05608/tg849075/conflux/tests/unit/test_utils.cpp (code 2)
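
One reading of the "too few arguments for class template" errors above is that the compiler sees no default template argument for matrix_view, so the empty list matrix_view<> cannot be completed. The sketch below illustrates that rule only. Both structs are hypothetical stand-ins, not the real conflux::matrix_view, and the element type double is chosen purely for illustration.

```cpp
#include <vector>

// Hypothetical stand-in without a default template argument.
template <typename T>
struct view_no_default {
    T* data;
    int rows;
    int cols;
};

// Hypothetical stand-in with a default template argument.
template <typename T = double>
struct view_with_default {
    T* data;
    int rows;
    int cols;
};

int count_elements() {
    std::vector<double> storage(6);

    // view_no_default<> bad{storage.data(), 2, 3};   // would not compile:
    //                                                // too few template arguments
    view_no_default<double> a{storage.data(), 2, 3};  // explicit argument is fine
    view_with_default<> b{storage.data(), 2, 3};      // empty list uses the default

    return a.rows * a.cols + b.rows * b.cols;
}
```

Either naming the argument at the point of use or giving the template a default makes the declaration valid. Which of the two matches the actual conflux sources is not established in this thread.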