DARMA-tasking / vt

DARMA/vt => Virtual Transport

`vt` fails to build with the latest `Kokkos` + CUDA 11.8 #2344

Open cz4rs opened 1 week ago

cz4rs commented 1 week ago

Describe the bug
vt fails to compile against Kokkos when CUDA is enabled:

kokkos/cmake-build-cuda11_8-debug/install/include/Cuda/Kokkos_Cuda_KernelLaunch.hpp(46): error: identifier "vt::phase::PhaseManager::nextPhaseReduce" is undefined in device code
kokkos/cmake-build-cuda11_8-debug/install/include/Cuda/Kokkos_Cuda_KernelLaunch.hpp(46): error: identifier "vt::phase::PhaseManager::nextPhaseDone" is undefined in device code

Additional context
We currently have only one build with Kokkos enabled:

ci/azure/azure-gcc-12-ubuntu-mpich.yml:58:  VT_KOKKOS_ENABLED: 1

The Azure nvcc builds have Kokkos disabled, and we only build with the Serial backend anyway (so that the tests can run on nodes without a GPU). We could add a separate Azure job that covers compilation with Kokkos and the CUDA backend enabled (with an explicitly provided architecture).
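
As a rough illustration of what such a compile-only job could pass to the Kokkos configure step, here is a minimal sketch. The helper name `kokkos_cuda_flags` is hypothetical and not part of vt or Kokkos. The `Kokkos_ENABLE_CUDA` and `Kokkos_ARCH_AMPERE80` options are the ones that appear in the build script later in this thread. The default build type is an assumption, chosen as Debug because that is the configuration in which the failure is reported.

```shell
#!/usr/bin/env bash
# Sketch only: print Kokkos configure flags for a compile-only CUDA job.
# kokkos_cuda_flags is a hypothetical helper name used for illustration.
kokkos_cuda_flags() {
  # The architecture is given explicitly because a CI node without a GPU
  # has no device to detect it from.
  local arch="${1:-AMPERE80}"
  # Assumption: default to Debug, the build type in which the error is reported.
  local build_type="${2:-Debug}"
  printf '%s\n' \
    "-DKokkos_ENABLE_CUDA=ON" \
    "-DKokkos_ARCH_${arch}=ON" \
    "-DCMAKE_BUILD_TYPE=${build_type}"
}

# Example (not executed here):
#   cmake $(kokkos_cuda_flags AMPERE80 Debug) /path/to/kokkos
```

Because the job only compiles, it would not need a GPU at run time, but it would still need a CUDA toolkit installed in the CI image.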


@nmm0 can you provide additional details?

nmm0 commented 1 day ago

I am using nvcc version 11.8. The error only appears in Debug builds.

JacobDomagala commented 17 hours ago

I wasn't able to reproduce the issue. I tried nvcc 12.3 and 11.2.2 with the following build script:

#!/usr/bin/env bash

set -ex

source_dir="/vt"
build_dir="/build"

mkdir -p "${build_dir}"
pushd "${build_dir}"

export VT=${source_dir}
export VT_BUILD=${build_dir}/vt
mkdir -p "$VT_BUILD"
cd "$VT_BUILD"
# rm -Rf ./*

ccache --clear

export Kokkos_enabled=1

if [ -n "$Kokkos_enabled" ]; then
    export HOST_COMPILER=g++-9
    export CUDA_ROOT="/usr/local/cuda/"

    cd /kokkos/build
    cmake -G "${CMAKE_GENERATOR:-Ninja}" \
      -DCMAKE_EXPORT_COMPILE_COMMANDS=1 \
      -DKokkos_ENABLE_CUDA=ON \
      -DKokkos_ARCH_AMPERE80=ON \
      -DKokkos_ENABLE_CUDA_LAMBDA=ON \
      -DKokkos_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE=ON \
      -DCMAKE_INSTALL_PREFIX="/kokkos/build/install" \
      --fresh \
      ..
    cmake --build . --target install
    echo "Kokkos is enabled."

    export CXX="/kokkos/bin/nvcc_wrapper"
    # export CC="/home/jdomagala/Work/nvcc_wrapper"
    cd "$VT_BUILD"
fi

cmake -G "${CMAKE_GENERATOR:-Ninja}" \
      -DCMAKE_EXPORT_COMPILE_COMMANDS=1 \
      -Dvt_test_trace_runtime_enabled=1 \
      -Dvt_lb_enabled=1 \
      -Dvt_trace_enabled=1 \
      -Dvt_pool_enabled=1 \
      -Dvt_build_extended_tests=1 \
      -Dvt_diagnostics_enabled=1 \
      -Dvt_rdma_tests_enabled=1 \
      -DMI_INTERPOSE:BOOL=ON \
      -DMI_OVERRIDE:BOOL=ON \
      -DCMAKE_BUILD_TYPE=Debug \
      -DMPI_C_COMPILER="${MPICC:-mpicc}" \
      -DMPI_CXX_COMPILER="${MPICXX:-mpicxx}" \
      -DCMAKE_CXX_COMPILER="${CXX:-g++-9}" \
      -DCMAKE_C_COMPILER="${CC:-gcc-9}" \
      -DCMAKE_INSTALL_PREFIX="$VT_BUILD/install" \
      -Dmagistrate_ROOT="/build/magistrate/install" \
      -Dvt_debug_verbose=1 \
      -Dvt_tests_num_nodes=32 \
      -DCMAKE_CXX_STANDARD=17 \
      --fresh \
      "$VT"

cmake --build . -j 16 --target install
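
The report and the reproduction attempt in this thread use different nvcc releases (11.8 versus 12.3 and 11.2.2), so it may help to record the exact release alongside each build log. The following is a minimal sketch of a helper that does this. The name `nvcc_release` is hypothetical, and the sketch assumes that `nvcc --version` prints a line containing `release X.Y`.

```shell
#!/usr/bin/env bash
# Sketch only: extract the CUDA release (for example "11.8") from
# `nvcc --version` output supplied on stdin.
# nvcc_release is a hypothetical helper name used for illustration.
nvcc_release() {
  # Print the first "X.Y" that follows the word "release"; print nothing if absent.
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example (requires nvcc on PATH):
#   echo "nvcc release: $(nvcc --version | nvcc_release), build type: Debug"
```

Logging this next to `CMAKE_BUILD_TYPE` makes it easier to compare a failing configuration with one that builds cleanly.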