flux-framework / PerfFlowAspect

An Aspect Oriented Programming (AOP)-based tool to analyze cross-cutting performance concerns of composite science workflows.
https://perfflowaspect.readthedocs.io
GNU Lesser General Public License v3.0

Build trouble in VSCode #112

Closed vsoch closed 1 year ago

vsoch commented 1 year ago

Hi! I'm trying to follow the logic in the GitHub workflow and I have this Dockerfile:

FROM ubuntu:20.04

LABEL maintainer="Vanessasaurus <@vsoch>"

# Match the default user id for a single system so we aren't root
ARG USERNAME=vscode
ARG USER_UID=1000
ARG USER_GID=1000
ENV USERNAME=${USERNAME}
ENV USER_UID=${USER_UID}
ENV USER_GID=${USER_GID}
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt install -y clang llvm-dev libjansson-dev libssl-dev \
    mpi-default-dev wget bison flex make cmake mpich python3 python3-pip sudo llvm-12 
    # Note I didn't get it working with just mpich!
#    openmpi-bin openmpi-common libopenmpi-dev libgtk2.0-dev

# install cuda 12.1 (note uncomment this if you can support on your development machine!)
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin && \
    mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
    wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda-repo-ubuntu2004-12-1-local_12.1.0-530.30.02-1_amd64.deb && \
    dpkg -i cuda-repo-ubuntu2004-12-1-local_12.1.0-530.30.02-1_amd64.deb && \
    cp /var/cuda-repo-ubuntu2004-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/ && \
    apt-get update && \
    apt-get -y install cuda && \
    ln -s -T /usr/bin/make /usr/bin/gmake

ENV PATH=/usr/local/cuda-12.1/bin:$PATH

# Python helpers
RUN python3 -m pip install --upgrade pip pytest setuptools flake8 rstfmt black
#    sudo update-alternatives --install /usr/bin/llvm-config llvm-config /usr/bin/llvm-config-12 200

RUN ldconfig

# Add the group and user that match our ids
RUN groupadd -g ${USER_GID} ${USERNAME} && \
    adduser --disabled-password --uid ${USER_UID} --gid ${USER_GID} --gecos "" ${USERNAME} && \
    echo "${USERNAME} ALL=(ALL) NOPASSWD: ALL" > /etc/sudoers
USER $USERNAME

And then to build:

export PYTHONPATH=$PWD/src/python:$PYTHONPATH
cd src/c
mkdir -p build install
cd build
export CMAKE_OPTS="-DCMAKE_CXX_COMPILER=clang++ -DLLVM_DIR=/usr/lib/llvm-12/cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=../install"
cmake ${CMAKE_OPTS} ..

But the build fails - here is the issue

Determining if the C compiler works failed with the following output:
Change Dir: /workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/gmake cmTC_599b9/fast && No such file or directory
Generator: execution of make failed. Make command was: /usr/bin/gmake cmTC_599b9/fast && 

The MPI test test_mpi for CXX in mode normal failed to compile with the following output:
Change Dir: /workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeTmp

Run Build Command(s):/usr/bin/gmake cmTC_21bda/fast && /usr/bin/gmake -f CMakeFiles/cmTC_21bda.dir/build.make CMakeFiles/cmTC_21bda.dir/build
gmake[1]: Entering directory '/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_21bda.dir/test_mpi.cpp.o
/usr/bin/clang++   -isystem /usr/include/x86_64-linux-gnu/mpich  -fPIE   -flto=auto -ffat-lto-objects -o CMakeFiles/cmTC_21bda.dir/test_mpi.cpp.o -c /workspaces/PerfFlowAspect/src/c/build/CMakeFiles/FindMPI/test_mpi.cpp
clang: error: unsupported argument 'auto' to option 'flto='
clang: warning: optimization flag '-ffat-lto-objects' is not supported [-Wignored-optimization-argument]
gmake[1]: *** [CMakeFiles/cmTC_21bda.dir/build.make:66: CMakeFiles/cmTC_21bda.dir/test_mpi.cpp.o] Error 1
gmake[1]: Leaving directory '/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeTmp'
gmake: *** [Makefile:121: cmTC_21bda/fast] Error 2

Is there a flag or something I'm missing, or is this a path or version error? I'm using ubuntu 20.04 and mpich like the workflow file. Thanks!

tpatki commented 1 year ago

@vsoch We've mostly tested this on lassen, and we use the clang-10.0.1-gcc-8.3.1 module there, which I believe is clang built with gcc bindings. I wonder if that's what we need to install here.

I didn't see (or missed) your actual make command; hopefully you're not using a parallel make, because that won't work with the C dependencies we have. Can you post the verbose output from make here?

I am assuming your code is in cmTC_599b9/fast? Not sure what the directory is pointing to. It seems like it is missing some directory as per this, not sure what it's looking for:

Run Build Command(s):/usr/bin/gmake cmTC_599b9/fast && No such file or directory

I also see some incorrect clang flags. Not sure how those got added, they're not in our codebase:

clang: error: unsupported argument 'auto' to option 'flto='
clang: warning: optimization flag '-ffat-lto-objects' is not supported 

Happy to help debug over a call sometime.
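
One way to track down where flags like these come from (a sketch, not something confirmed in this thread): MPI compiler wrappers can inject their own compile and link flags, and MPICH's wrappers print the underlying command with `-show` (Open MPI uses `--showme`). The helper name `lto_flags` is made up for illustration, and `mpicxx.mpich` is assumed to be the suffixed wrapper name on Ubuntu.

```shell
# Print any LTO-related flags found in a compiler command line read from stdin.
# `lto_flags` is a hypothetical helper name, not part of PerfFlowAspect.
lto_flags() {
  tr -s ' ' '\n' | grep -E '^-f(lto|fat-lto-objects)' || true
}

# Ask each MPI C++ wrapper on the PATH what it would actually run.
# The suffixed name is an assumption about Ubuntu's packaging.
for wrapper in mpicxx mpicxx.mpich; do
  if command -v "$wrapper" >/dev/null 2>&1; then
    echo "== $wrapper"
    # MPICH wrappers accept -show; Open MPI wrappers accept --showme.
    { "$wrapper" -show 2>/dev/null || "$wrapper" --showme 2>/dev/null; } | lto_flags
  fi
done
```

If a wrapper prints `-flto=auto` here, the flag is coming from the MPI packaging and not from the project's own CMake files.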

vsoch commented 1 year ago

Hey @tpatki ! I'm trying to follow your CI build in terms of dependencies and procedure - here is the full output:

# cd perfFlowAspect/
root@cae5ed71d0e1:/perfFlowAspect# cd src/c
root@cae5ed71d0e1:/perfFlowAspect/src/c# mkdir -p build install
root@cae5ed71d0e1:/perfFlowAspect/src/c# cd build
root@cae5ed71d0e1:/perfFlowAspect/src/c/build# export CMAKE_OPTS="-DCMAKE_CXX_COMPILER=clang++ -DLLVM_DIR=/usr/lib/llvm-12/cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=../install"
root@cae5ed71d0e1:/perfFlowAspect/src/c/build# cmake ${CMAKE_OPTS} ..
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is Clang 10.0.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/clang++
-- Check for working CXX compiler: /usr/bin/clang++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting build type to "Debug"
-- CMAKE_INSTALL_RPATH = /perfFlowAspect/src/c/install/lib
-- Found BISON: /usr/bin/bison (found version "3.5.1") 
-- Found FLEX: /usr/bin/flex (found version "2.6.4") 
-- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libcrypto.so (found version "1.1.1f")  
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so (found version "3.1") 
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda (found version "12.1") 
-- Adding CXX unit tests
--  [*] Adding test: smoketest
--  [*] Adding test: smoketest2
--  [*] Adding test: smoketest3
--  [*] Adding test: smoketest_class
--  [*] Adding test: smoketest_MPI
--  [*] Adding test: smoketest_MT
--  [*] Adding test: smoketest_cuda
-- Config Dir: 
-- PerfFlowAspect version: "0.1.0"
-- Configuring done
-- Generating done
-- Build files have been written to: /perfFlowAspect/src/c/build
root@cae5ed71d0e1:/perfFlowAspect/src/c/build#           # build
root@cae5ed71d0e1:/perfFlowAspect/src/c/build#           make VERBOSE=1
/usr/bin/cmake -S/perfFlowAspect/src/c -B/perfFlowAspect/src/c/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /perfFlowAspect/src/c/build/CMakeFiles /perfFlowAspect/src/c/build/CMakeFiles/progress.marks
make -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/perfFlowAspect/src/c/build'
make -f parser/CMakeFiles/perfflow_parser.dir/build.make parser/CMakeFiles/perfflow_parser.dir/depend
make[2]: Entering directory '/perfFlowAspect/src/c/build'
[  3%] [FLEX][lexer] Building scanner with flex 2.6.4
cd /perfFlowAspect/src/c/parser && /usr/bin/flex --nodefault -o/perfFlowAspect/src/c/build/parser/lex.yy.cpp lex.l
[  7%] [BISON][parser] Building parser with bison 3.5.1
cd /perfFlowAspect/src/c/parser && /usr/bin/bison -t --report=none --defines=/perfFlowAspect/src/c/build/parser/parser.tab.h -o /perfFlowAspect/src/c/build/parser/parser.tab.cpp parser.y
cd /perfFlowAspect/src/c/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /perfFlowAspect/src/c /perfFlowAspect/src/c/parser /perfFlowAspect/src/c/build /perfFlowAspect/src/c/build/parser /perfFlowAspect/src/c/build/parser/CMakeFiles/perfflow_parser.dir/DependInfo.cmake --color=
Dependee "/perfFlowAspect/src/c/build/parser/CMakeFiles/perfflow_parser.dir/DependInfo.cmake" is newer than depender "/perfFlowAspect/src/c/build/parser/CMakeFiles/perfflow_parser.dir/depend.internal".
Dependee "/perfFlowAspect/src/c/build/parser/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/perfFlowAspect/src/c/build/parser/CMakeFiles/perfflow_parser.dir/depend.internal".
Scanning dependencies of target perfflow_parser
make[2]: Leaving directory '/perfFlowAspect/src/c/build'
make -f parser/CMakeFiles/perfflow_parser.dir/build.make parser/CMakeFiles/perfflow_parser.dir/build
make[2]: Entering directory '/perfFlowAspect/src/c/build'
[ 11%] Building CXX object parser/CMakeFiles/perfflow_parser.dir/perfflow_parser.cpp.o
cd /perfFlowAspect/src/c/build/parser && /usr/bin/clang++  -Dperfflow_parser_EXPORTS -I/perfFlowAspect/src/c/build/parser  -g -fPIC   -std=c++11 -o CMakeFiles/perfflow_parser.dir/perfflow_parser.cpp.o -c /perfFlowAspect/src/c/parser/perfflow_parser.cpp
[ 14%] Building CXX object parser/CMakeFiles/perfflow_parser.dir/parser.tab.cpp.o
cd /perfFlowAspect/src/c/build/parser && /usr/bin/clang++  -Dperfflow_parser_EXPORTS -I/perfFlowAspect/src/c/build/parser  -g -fPIC   -o CMakeFiles/perfflow_parser.dir/parser.tab.cpp.o -c /perfFlowAspect/src/c/build/parser/parser.tab.cpp
[ 18%] Building CXX object parser/CMakeFiles/perfflow_parser.dir/lex.yy.cpp.o
cd /perfFlowAspect/src/c/build/parser && /usr/bin/clang++  -Dperfflow_parser_EXPORTS -I/perfFlowAspect/src/c/build/parser  -g -fPIC   -Wno-deprecated-register -o CMakeFiles/perfflow_parser.dir/lex.yy.cpp.o -c /perfFlowAspect/src/c/build/parser/lex.yy.cpp
[ 22%] Linking CXX shared library libperfflow_parser.so
cd /perfFlowAspect/src/c/build/parser && /usr/bin/cmake -E cmake_link_script CMakeFiles/perfflow_parser.dir/link.txt --verbose=1
/usr/bin/clang++ -fPIC -g -Wl,--version-script=/perfFlowAspect/src/c/parser/libperfflow_parser.map -shared -Wl,-soname,libperfflow_parser.so -o libperfflow_parser.so CMakeFiles/perfflow_parser.dir/perfflow_parser.cpp.o CMakeFiles/perfflow_parser.dir/parser.tab.cpp.o CMakeFiles/perfflow_parser.dir/lex.yy.cpp.o  -Wl,-rpath,::::::::::::::::::::::::::::::::: 
make[2]: Leaving directory '/perfFlowAspect/src/c/build'
[ 22%] Built target perfflow_parser
make -f runtime/CMakeFiles/perfflow_runtime.dir/build.make runtime/CMakeFiles/perfflow_runtime.dir/depend
make[2]: Entering directory '/perfFlowAspect/src/c/build'
cd /perfFlowAspect/src/c/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /perfFlowAspect/src/c /perfFlowAspect/src/c/runtime /perfFlowAspect/src/c/build /perfFlowAspect/src/c/build/runtime /perfFlowAspect/src/c/build/runtime/CMakeFiles/perfflow_runtime.dir/DependInfo.cmake --color=
Dependee "/perfFlowAspect/src/c/build/runtime/CMakeFiles/perfflow_runtime.dir/DependInfo.cmake" is newer than depender "/perfFlowAspect/src/c/build/runtime/CMakeFiles/perfflow_runtime.dir/depend.internal".
Dependee "/perfFlowAspect/src/c/build/runtime/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/perfFlowAspect/src/c/build/runtime/CMakeFiles/perfflow_runtime.dir/depend.internal".
Scanning dependencies of target perfflow_runtime
make[2]: Leaving directory '/perfFlowAspect/src/c/build'
make -f runtime/CMakeFiles/perfflow_runtime.dir/build.make runtime/CMakeFiles/perfflow_runtime.dir/build
make[2]: Entering directory '/perfFlowAspect/src/c/build'
[ 25%] Building CXX object runtime/CMakeFiles/perfflow_runtime.dir/advice_chrome_tracing.cpp.o
cd /perfFlowAspect/src/c/build/runtime && /usr/bin/clang++  -Dperfflow_runtime_EXPORTS  -g -fPIC   -o CMakeFiles/perfflow_runtime.dir/advice_chrome_tracing.cpp.o -c /perfFlowAspect/src/c/runtime/advice_chrome_tracing.cpp
[ 29%] Building CXX object runtime/CMakeFiles/perfflow_runtime.dir/advice_dispatcher.cpp.o
cd /perfFlowAspect/src/c/build/runtime && /usr/bin/clang++  -Dperfflow_runtime_EXPORTS  -g -fPIC   -o CMakeFiles/perfflow_runtime.dir/advice_dispatcher.cpp.o -c /perfFlowAspect/src/c/runtime/advice_dispatcher.cpp
[ 33%] Building CXX object runtime/CMakeFiles/perfflow_runtime.dir/perfflow_runtime.cpp.o
cd /perfFlowAspect/src/c/build/runtime && /usr/bin/clang++  -Dperfflow_runtime_EXPORTS  -g -fPIC   -o CMakeFiles/perfflow_runtime.dir/perfflow_runtime.cpp.o -c /perfFlowAspect/src/c/runtime/perfflow_runtime.cpp
[ 37%] Linking CXX shared library libperfflow_runtime.so
cd /perfFlowAspect/src/c/build/runtime && /usr/bin/cmake -E cmake_link_script CMakeFiles/perfflow_runtime.dir/link.txt --verbose=1
/usr/bin/clang++ -fPIC -g  -shared -Wl,-soname,libperfflow_runtime.so -o libperfflow_runtime.so CMakeFiles/perfflow_runtime.dir/advice_chrome_tracing.cpp.o CMakeFiles/perfflow_runtime.dir/advice_dispatcher.cpp.o CMakeFiles/perfflow_runtime.dir/perfflow_runtime.cpp.o  -Wl,-rpath,::::::::::::::::::::::::::::::::: -ljansson /usr/lib/x86_64-linux-gnu/libssl.so /usr/lib/x86_64-linux-gnu/libcrypto.so 
make[2]: Leaving directory '/perfFlowAspect/src/c/build'
[ 37%] Built target perfflow_runtime
make -f weaver/weave/CMakeFiles/WeavePass.dir/build.make weaver/weave/CMakeFiles/WeavePass.dir/depend
make[2]: Entering directory '/perfFlowAspect/src/c/build'
cd /perfFlowAspect/src/c/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /perfFlowAspect/src/c /perfFlowAspect/src/c/weaver/weave /perfFlowAspect/src/c/build /perfFlowAspect/src/c/build/weaver/weave /perfFlowAspect/src/c/build/weaver/weave/CMakeFiles/WeavePass.dir/DependInfo.cmake --color=
Dependee "/perfFlowAspect/src/c/build/weaver/weave/CMakeFiles/WeavePass.dir/DependInfo.cmake" is newer than depender "/perfFlowAspect/src/c/build/weaver/weave/CMakeFiles/WeavePass.dir/depend.internal".
Dependee "/perfFlowAspect/src/c/build/weaver/weave/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/perfFlowAspect/src/c/build/weaver/weave/CMakeFiles/WeavePass.dir/depend.internal".
Scanning dependencies of target WeavePass
make[2]: Leaving directory '/perfFlowAspect/src/c/build'
make -f weaver/weave/CMakeFiles/WeavePass.dir/build.make weaver/weave/CMakeFiles/WeavePass.dir/build
make[2]: Entering directory '/perfFlowAspect/src/c/build'
[ 40%] Building CXX object weaver/weave/CMakeFiles/WeavePass.dir/perfflow_weave.cpp.o
cd /perfFlowAspect/src/c/build/weaver/weave && /usr/bin/clang++  -DWeavePass_EXPORTS -I/usr/lib/llvm-12/include  -g -fPIC   -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -fno-rtti -std=gnu++14 -o CMakeFiles/WeavePass.dir/perfflow_weave.cpp.o -c /perfFlowAspect/src/c/weaver/weave/perfflow_weave.cpp
[ 44%] Linking CXX shared module libWeavePass.so
cd /perfFlowAspect/src/c/build/weaver/weave && /usr/bin/cmake -E cmake_link_script CMakeFiles/WeavePass.dir/link.txt --verbose=1
/usr/bin/clang++ -fPIC -g  -shared  -o libWeavePass.so CMakeFiles/WeavePass.dir/perfflow_weave.cpp.o   -L/usr/lib/llvm-12/lib  -Wl,-rpath,/usr/lib/llvm-12/lib:/perfFlowAspect/src/c/build/parser: ../../parser/libperfflow_parser.so /usr/lib/x86_64-linux-gnu/libjansson.so 
make[2]: Leaving directory '/perfFlowAspect/src/c/build'
[ 44%] Built target WeavePass
make -f test/CMakeFiles/smoketest_cuda.dir/build.make test/CMakeFiles/smoketest_cuda.dir/depend
make[2]: Entering directory '/perfFlowAspect/src/c/build'
[ 48%] Building NVCC (Device) object test/CMakeFiles/smoketest_cuda.dir/smoketest_cuda_generated_smoketest_cuda_kernel.cu.o
cd /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir && /usr/bin/cmake -E make_directory /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//.
cd /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir && /usr/bin/cmake -D verbose:BOOL=1 -D build_configuration:STRING=Debug -D generated_file:STRING=/perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//./smoketest_cuda_generated_smoketest_cuda_kernel.cu.o -D generated_cubin_file:STRING=/perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//./smoketest_cuda_generated_smoketest_cuda_kernel.cu.o.cubin.txt -P /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//smoketest_cuda_generated_smoketest_cuda_kernel.cu.o.Debug.cmake
-- Removing /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//./smoketest_cuda_generated_smoketest_cuda_kernel.cu.o
/usr/bin/cmake -E remove /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//./smoketest_cuda_generated_smoketest_cuda_kernel.cu.o
-- Generating dependency file: /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//smoketest_cuda_generated_smoketest_cuda_kernel.cu.o.NVCC-depend
/usr/local/cuda/bin/nvcc -M -D__CUDACC__ /perfFlowAspect/src/c/test/smoketest_cuda_kernel.cu -o /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//smoketest_cuda_generated_smoketest_cuda_kernel.cu.o.NVCC-depend -m64 -Xcompiler ,\"-g\" -ccbin /usr/bin/clang++ -Xcompiler=-Xclang -Xcompiler=-load -Xcompiler=-Xclang -Xcompiler=../../../weaver/weave/libWeavePass.so -DNVCC -I/usr/local/cuda/include -I/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi -I/usr/lib/x86_64-linux-gnu/openmpi/include
error: unable to load plugin '../../../weaver/weave/libWeavePass.so': '../../../weaver/weave/libWeavePass.so: undefined symbol: _ZTVN4llvm24IRBuilderDefaultInserterE'
CMake Error at smoketest_cuda_generated_smoketest_cuda_kernel.cu.o.Debug.cmake:220 (message):
  Error generating
  /perfFlowAspect/src/c/build/test/CMakeFiles/smoketest_cuda.dir//./smoketest_cuda_generated_smoketest_cuda_kernel.cu.o

make[2]: *** [test/CMakeFiles/smoketest_cuda.dir/build.make:65: test/CMakeFiles/smoketest_cuda.dir/smoketest_cuda_generated_smoketest_cuda_kernel.cu.o] Error 1
make[2]: Leaving directory '/perfFlowAspect/src/c/build'
make[1]: *** [CMakeFiles/Makefile2:372: test/CMakeFiles/smoketest_cuda.dir/all] Error 2
make[1]: Leaving directory '/perfFlowAspect/src/c/build'
make: *** [Makefile:130: all] Error 2

These are standard for ubuntu 20.04:

# gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

root@cae5ed71d0e1:/perfFlowAspect/src/c/build# clang --version
clang version 10.0.0-4ubuntu1 
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

which I saw being used here: https://github.com/flux-framework/PerfFlowAspect/blob/4852a48f4ea537296078ab293c731d7a35f704dd/.github/workflows/github-actions.yml#L9 and that's also where I'm trying to mimic the dependency installs (e.g., cuda).

I'm hoping we can resolve this on an issue and don't need a meeting or call. Basically a reproducible build is one you can "package" in some format, a Dockerfile would be a good start. Requiring an LLNL system (for a developer) isn't ideal because a contributor may not have access to one. Thanks for your help!

slabasan commented 1 year ago

@vsoch Can you try with llvm-10 instead of llvm-12? I think this is the last piece that is on a different version. It was tricky to find a combination of llvm, clang, and cuda versions to work for our CI.

vsoch commented 1 year ago

This is really strange - when I build the exact same container from VSCode, I get a different error:

-- Setting build type to "Debug"
-- CMAKE_INSTALL_RPATH = /workspaces/PerfFlowAspect/src/c/install/lib
-- Could NOT find MPI_CXX (missing: MPI_CXX_WORKS) 
CMake Error at /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
  Could NOT find MPI (missing: MPI_CXX_FOUND) (found version "3.1")
Call Stack (most recent call first):
  /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake-3.16/Modules/FindMPI.cmake:1688 (find_package_handle_standard_args)
  test/CMakeLists.txt:10 (find_package)

-- Configuring incomplete, errors occurred!
See also "/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeOutput.log".
See also "/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeError.log".

I tried as root (as I did above) and as the vscode user (same issue!). I think this was an original error I saw, so maybe we should debug from here. So it's telling me MPI isn't installed - but I definitely installed mpich. Here are the paths to what I found in /usr with mpi:

$ find . -name *mpi*.so
./lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so
./lib/x86_64-linux-gnu/libmpich.so
./lib/x86_64-linux-gnu/libompitrace.so
./lib/x86_64-linux-gnu/libmpichfort.so
./lib/x86_64-linux-gnu/libmpichcxx.so
./lib/x86_64-linux-gnu/openmpi/lib/libompitrace.so
./lib/x86_64-linux-gnu/openmpi/lib/openmpi3/libompi_dbg_msgq.so
./lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_schizo_ompi.so
./lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_io_ompio.so
./lib/x86_64-linux-gnu/openmpi/lib/libmca_common_ompio.so
./lib/x86_64-linux-gnu/openmpi/lib/libmpi_java.so
./lib/x86_64-linux-gnu/openmpi/lib/ompi_monitoring_prof.so
./lib/x86_64-linux-gnu/openmpi/lib/libmpi_usempi_ignore_tkr.so
./lib/x86_64-linux-gnu/openmpi/lib/libmpi_mpifh.so
./lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so
./lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
./lib/x86_64-linux-gnu/openmpi/lib/libmpi_usempif08.so
./lib/x86_64-linux-gnu/libmca_common_ompio.so
./lib/x86_64-linux-gnu/libmpi_java.so
./lib/x86_64-linux-gnu/libmpi_usempi_ignore_tkr.so
./lib/x86_64-linux-gnu/open-coarrays/openmpi/lib/libcaf_mpi.so
./lib/x86_64-linux-gnu/open-coarrays/openmpi/lib/libcaf_openmpi.so
./lib/x86_64-linux-gnu/libmpi_mpifh.so
./lib/x86_64-linux-gnu/libmpi_cxx.so
./lib/x86_64-linux-gnu/libmpi.so
./lib/x86_64-linux-gnu/libcaf_openmpi.so
./lib/x86_64-linux-gnu/libmpi_usempif08.so
./lib/x86_64-linux-gnu/libmpi++.so

What should I try?

slabasan commented 1 year ago

Let's try ./lib/x86_64-linux-gnu/libmpich.so and ./lib/x86_64-linux-gnu/libmpichcxx.so

vsoch commented 1 year ago

How should I do that with cmake (sorry I don't use it a lot!) I was messing around with something like:

cmake -DCMAKE_CXX_COMPILER=clang++ -DMPI_DIR=/lib/x86_64-linux-gnu/ -DLLVM_DIR=/usr/lib/llvm-10/cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_INSTALL_PREFIX=../install ..

Also note that I was installing cuda as you do in your GitHub workflow - should I not do that (maybe it's mucking something up?) I think it's weird I see both openmpi and mpich, like maybe the nvidia install grabbed the first?

slabasan commented 1 year ago

cmake \
-DCMAKE_CXX_COMPILER=clang++ \
-DMPI_CXX=./lib/x86_64-linux-gnu/libmpichcxx.so \
-DMPI_C=./lib/x86_64-linux-gnu/libmpich.so \
-DLLVM_DIR=/usr/lib/llvm-10/cmake \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=../install \
..

It was a big headache when we brought cuda into the CI environment. I don't think it should affect your build, especially if we're specific about which one we want the build system to use. IIRC the trouble I ran into was making sure it picked up cuda-12 and not the default cuda. I needed to remove the old drivers first instead of just replacing them.

vsoch commented 1 year ago

ah the above makes sense! So I tried it for each of llvm-10 and llvm-12 and the error is the same (I'm also not sure if the relative path dot in front of lib should be removed, so I tried that too). Could this be that it's finding it but it's the wrong version? Likely on the cluster there is a specific version of mpich and here we are just getting what ubuntu provides. E.g., I'm looking at (found version "3.1") so does that mean it found something?

slabasan commented 1 year ago

In the CMake output, is it finding the paths you specified?

-- Found MPI_C: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so (found version "3.1") 
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  

And when you run VERBOSE=1 make, are you seeing that the build has picked up other mpich libraries?

vsoch commented 1 year ago

It doesn't seem to be finding anything! It's super weird:

$     cmake \
>       -DCMAKE_CXX_COMPILER=clang++ \
>       -DMPI_CXX=/lib/x86_64-linux-gnu/libmpichcxx.so \
>       -DMPI_C=/lib/x86_64-linux-gnu/libmpich.so \
>       -DLLVM_DIR=/usr/lib/llvm-12/cmake \
>       -DCMAKE_BUILD_TYPE=Debug \
>       -DCMAKE_INSTALL_PREFIX=../install ..
-- Setting build type to "Debug"
-- CMAKE_INSTALL_RPATH = /workspaces/PerfFlowAspect/src/c/install/lib
-- Could NOT find MPI_CXX (missing: MPI_CXX_WORKS) 
CMake Error at /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146 (message):
  Could NOT find MPI (missing: MPI_CXX_FOUND) (found version "3.1")
Call Stack (most recent call first):
  /usr/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:393 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake-3.16/Modules/FindMPI.cmake:1688 (find_package_handle_standard_args)
  test/CMakeLists.txt:10 (find_package)

-- Configuring incomplete, errors occurred!
See also "/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeOutput.log".
See also "/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeError.log".

slabasan commented 1 year ago

Hi @vsoch, let me take a look at my CMakeCache.txt and confirm the variable to set for the MPI libraries, looks like MPI_C isn't the right one.
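
For reference, CMake's FindMPI module documents `MPI_C_COMPILER` and `MPI_CXX_COMPILER` as the hints for choosing an MPI implementation: they point at the compiler wrappers, not at the libraries. A sketch, assuming Ubuntu's mpich packaging provides wrappers named `mpicc.mpich` and `mpicxx.mpich` (worth checking with `ls /usr/bin/mpic*`); this was not tried in this thread, and `mpi_hint_args` is a made-up helper name:

```shell
# Build the FindMPI hint arguments from two compiler-wrapper paths.
# `mpi_hint_args` is a hypothetical helper, not part of PerfFlowAspect.
mpi_hint_args() {
  printf '%s\n' "-DMPI_C_COMPILER=$1 -DMPI_CXX_COMPILER=$2"
}

# Assumed wrapper names from Ubuntu's mpich packaging; verify before relying on them.
MPICC="$(command -v mpicc.mpich || true)"
MPICXX="$(command -v mpicxx.mpich || true)"

if [ -n "$MPICC" ] && [ -n "$MPICXX" ] && [ -f ../CMakeLists.txt ]; then
  # Word splitting of the helper output is intentional: it yields two arguments.
  cmake -DCMAKE_CXX_COMPILER=clang++ $(mpi_hint_args "$MPICC" "$MPICXX") \
    -DLLVM_DIR=/usr/lib/llvm-10/cmake -DCMAKE_BUILD_TYPE=Debug \
    -DCMAKE_INSTALL_PREFIX=../install ..
else
  echo "mpich wrappers or ../CMakeLists.txt not found; nothing configured"
fi
```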

slabasan commented 1 year ago

Is openmpi native to ubuntu 20.04? I wonder if we can purge all the mpi libraries before installing mpich.

vsoch commented 1 year ago

I don't believe so - I assumed it was pulled in by the cuda install. Shall I try the build without those early cuda steps and check if it's there?

vsoch commented 1 year ago

oh no it was me being stupid - I installed mpi-default-dev and forgot to remove it - trying again!
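
A quick check for this kind of mix-up (a sketch; `mpi_packages` is a made-up helper name): list every package known to dpkg whose name mentions an MPI implementation. On Ubuntu, `mpi-default-dev` is a metapackage that depends on the distribution's default MPI, which is how Open MPI can end up next to mpich.

```shell
# Filter a list of package names (one per line on stdin) down to MPI-related ones.
# `mpi_packages` is a hypothetical helper name.
mpi_packages() {
  grep -E 'openmpi|mpich|mpi-default' || true
}

if command -v dpkg-query >/dev/null 2>&1; then
  # ${Package} is a dpkg-query format field, hence the single quotes.
  dpkg-query -W -f='${Package}\n' | mpi_packages
fi
```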

vsoch commented 1 year ago

okay this gives us some insights! It seems it was targeting openmpi, because now it's angry that it's missing! :laughing:

cmake \
      -DCMAKE_CXX_COMPILER=clang++ \
      -DMPI_CXX=/lib/x86_64-linux-gnu/libmpichcxx.so \
      -DMPI_C=/lib/x86_64-linux-gnu/libmpich.so \
      -DLLVM_DIR=/usr/lib/llvm-10/cmake \
      -DCMAKE_BUILD_TYPE=Debug \
      -DCMAKE_INSTALL_PREFIX=../install ..
CMake Error in /workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeTmp/CMakeLists.txt:
  Imported target "MPI::MPI_C" includes non-existent path

    "/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi"

  in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include:

  * The path was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and references files it does not
  provide.

CMake Error at /usr/share/cmake-3.16/Modules/FindMPI.cmake:1194 (try_compile):
  Failed to generate test project build system.
Call Stack (most recent call first):
  /usr/share/cmake-3.16/Modules/FindMPI.cmake:1245 (_MPI_try_staged_settings)
  /usr/share/cmake-3.16/Modules/FindMPI.cmake:1505 (_MPI_check_lang_works)
  test/CMakeLists.txt:10 (find_package)

-- Configuring incomplete, errors occurred!
See also "/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeOutput.log".
See also "/workspaces/PerfFlowAspect/src/c/build/CMakeFiles/CMakeError.log".

slabasan commented 1 year ago

Hi @vsoch, ok I was able to successfully build PerfFlowAspect in a docker container. Our steps are similar, so not sure why your build is still looking for openmpi.
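
One possible explanation, offered as an assumption and not confirmed in this thread: CMake stores `find_package` results in `CMakeCache.txt`, so a build directory that was configured while Open MPI was installed may keep pointing at its paths after the package is removed. In a devcontainer the workspace is mounted from the host, so a `build` directory can outlive a container rebuild. A sketch for ruling this out, run from the build directory (`reset_cmake_state` is a made-up helper name):

```shell
# Show any cached entries that still mention Open MPI.
if [ -f CMakeCache.txt ]; then
  grep -i 'openmpi' CMakeCache.txt || echo "no openmpi entries cached"
fi

# Remove only CMake's own configure state from the given directory
# (default: the current one) so the next cmake run detects MPI from scratch.
# `reset_cmake_state` is a hypothetical helper name.
reset_cmake_state() {
  rm -rf "${1:-.}/CMakeCache.txt" "${1:-.}/CMakeFiles"
}

reset_cmake_state .
```

After that, rerunning the same `cmake` command would show whether the Open MPI paths came from the cache or from the system.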

This was my docker setup (I ran my commands manually in a new ubuntu 20.04 instance):

apt-get update
apt-get install -y clang llvm-dev libjansson-dev libssl-dev wget bison flex make cmake python3 python3-pip llvm-12 
apt-get install -y mpich

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin && \
mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600 && \
wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda-repo-ubuntu2004-12-1-local_12.1.0-530.30.02-1_amd64.deb && \
dpkg -i cuda-repo-ubuntu2004-12-1-local_12.1.0-530.30.02-1_amd64.deb && \
cp /var/cuda-repo-ubuntu2004-12-1-local/cuda-*-keyring.gpg /usr/share/keyrings/ && \
apt-get update && \
apt-get -y install cuda && \
ln -s -T /usr/bin/make /usr/bin/gmake

export PATH=/usr/local/cuda-12.1/bin:$PATH

I can confirm I do not have an openmpi in /usr/lib/x86_64-linux-gnu/openmpi/. I'll leave my docker instance up, let me know if I can look up anything else, so we can compare.

This build line works for me: $ cmake -DCMAKE_CXX_COMPILER=clang++ -DLLVM_DIR=/usr/lib/llvm-10/cmake ..

-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is Clang 10.0.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/clang++
-- Check for working CXX compiler: /usr/bin/clang++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Using default build type: "Debug"
-- CMAKE_INSTALL_RPATH = /usr/local/lib
-- Found BISON: /usr/bin/bison (found version "3.5.1") 
-- Found FLEX: /usr/bin/flex (found version "2.6.4") 
-- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libcrypto.so (found version "1.1.1f")  
-- Found MPI_C: /usr/lib/x86_64-linux-gnu/libmpich.so (found version "3.1") 
-- Found MPI_CXX: /usr/lib/x86_64-linux-gnu/libmpichcxx.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /usr/local/cuda-12.1 (found version "12.1") 
-- Adding CXX unit tests
--  [*] Adding test: smoketest
--  [*] Adding test: smoketest2
--  [*] Adding test: smoketest3
--  [*] Adding test: smoketest_class
--  [*] Adding test: smoketest_MPI
--  [*] Adding test: smoketest_MT
--  [*] Adding test: smoketest_cuda
-- Config Dir: 
-- PerfFlowAspect version: "0.1.0"
-- Configuring done
-- Generating done
-- Build files have been written to: /root/PerfFlowAspect/src/c/build

vsoch commented 1 year ago

Trying now! I'm worried this is related to the VSCode environment, because I had a variant working in just docker before. Will report back!

vsoch commented 1 year ago

okay - failed in the .devcontainer. @slabasan do you use/have VSCode and could you reproduce the devcontainer setup? Basically you can put the Dockerfile alongside this devcontainer.json in a .devcontainer directory, e.g.,:

$ tree .devcontainer/
.devcontainer/
├── devcontainer.json
└── Dockerfile

And the content of devcontainer.json

{
    "name": "PerfFlowAspect Development Environment",
    "dockerFile": "Dockerfile",
    "context": "../",

    "customizations": {
      "vscode": {
        "settings": {
          "terminal.integrated.defaultProfile.linux": "bash"
        }
      }
  },
  "postStartCommand": "git config --global --add safe.directory /workspaces/PerfFlowAspect"
}

and open in VSCode, it will ask you if you want to reopen in the container, and if not, you can do View -> Command Palette -> Rebuild in Container. The extension is called "Dev Containers" by Microsoft.

I'm starting to think the build is fine and the specific developer environment is somehow wonky, but I've never seen this before! If VSCode doesn't work, I'll provide a standard Dockerfile instead for a developer environment - not as smooth but better than nothing!

slabasan commented 1 year ago

I’ve never used VSCode, but happy to try and reproduce. Do you have beginning steps I can follow?

vsoch commented 1 year ago

Yeah! It's fairly simple - you basically need VSCode, docker, and the extension. We have a small guide in the flux repository: https://github.com/flux-framework/flux-core/blob/master/vscode.md and let me know if you have specific questions (glad to help, and grateful for you trying to reproduce)!

vsoch commented 1 year ago

okay - I've updated the title to reflect the issues are in VSCode - the regular build with a vanilla Dockerfile works great. To not slow down progress, I'm going to leave this issue open but continue with a development environment that just uses a container:

https://github.com/flux-framework/PerfFlowAspect/pull/114

This was how I did things before we had VSCode - you basically can just build in the container, or bind your code to the container (and make changes on the host that persist in the container). If we ever figure this out or it magically works, great! But if not, we should not be held up because of it.
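
A minimal sketch of that container-only workflow, assuming a Dockerfile in the current directory. The image tag `perfflowaspect-dev` and the mount point `/code` are placeholders, not names taken from the pull request, and the `run` wrapper only prints the commands unless `DRY_RUN=0` is set:

```shell
IMAGE="perfflowaspect-dev"   # hypothetical image tag
MOUNT="/code"                # hypothetical path inside the container

# Print each command; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build the image, then start a shell with the checkout bind-mounted so that
# edits made on the host are visible inside the container.
run docker build -t "$IMAGE" .
run docker run --rm -it -v "$PWD:$MOUNT" -w "$MOUNT" "$IMAGE" bash
```

The bind mount is what makes host-side edits show up in the container; dropping `-v` gives the plain build-in-the-container variant.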