LLNL / RAJA

RAJA Performance Portability Layer (C++)
BSD 3-Clause "New" or "Revised" License

cuda_exec has not been declared in RAJA #1268

Open amklinv-nnl opened 2 years ago

amklinv-nnl commented 2 years ago

I am trying to compile the following code, which is based on a RAJA vector sum example.

#include<cassert>
#include "chai/ManagedArray.hpp"
#include "RAJA/RAJA.hpp"

int main(int argc, char* argv[]) {
    using chai::ManagedArray;
    using RAJA::forall;
    using RAJA::cuda_exec;
    using RAJA::RangeSegment;

    const int n = 1e8;
    const int CUDA_BLOCK_SIZE = 256;
    chai::ManagedArray<double> x(n), y(n), z(n);

    // Initialize x and y
    for(int i=0; i<n; i++) {
        x[i] = i;
        y[i] = n-i;
    }

    // Compute sum of x and y
    forall<cuda_exec<CUDA_BLOCK_SIZE>>(RangeSegment(0, n), 
        [=] RAJA_DEVICE (int i) { 
        z[i] = x[i] + y[i]; 
    });    

    // Assert that the sum is correct
    for(int i=0; i<n; i++) {
        assert(z[i] == n);
    }

    return 0;
}

However, I get a build error:

Consolidate compiler generated dependencies of target axpy
[ 50%] Building CXX object CMakeFiles/axpy.dir/main.cpp.o
/home/amklinv/gpu-programming-models/main/axpy/main.cpp: In function 'int main(int, char**)':
/home/amklinv/gpu-programming-models/main/axpy/main.cpp:8:17: error: 'cuda_exec' has not been declared in 'RAJA'
    8 |     using RAJA::cuda_exec;
      |                 ^~~~~~~~~
/home/amklinv/gpu-programming-models/main/axpy/main.cpp:22:12: error: 'cuda_exec' was not declared in this scope
   22 |     forall<cuda_exec<CUDA_BLOCK_SIZE>>(RangeSegment(0, n),
      |            ^~~~~~~~~
/home/amklinv/gpu-programming-models/main/axpy/main.cpp:22:37: error: no match for 'operator>' (operand types are '<unresolved overloaded function type>' and 'main(int, char**)::<lambda(int)>')
   22 |     forall<cuda_exec<CUDA_BLOCK_SIZE>>(RangeSegment(0, n),
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
      |     |                                                    |
      |     <unresolved overloaded function type>                main(int, char**)::<lambda(int)>
   23 |         [=] RAJA_DEVICE (int i) {
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~
   24 |         z[i] = x[i] + y[i];
      |         ~~~~~~~~~~~~~~~~~~~
   25 |     });
      |     ~~
make[2]: *** [CMakeFiles/axpy.dir/build.make:76: CMakeFiles/axpy.dir/main.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/axpy.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

My Spack spec for RAJA: raja@0.14.0+cuda~examples~exercises~ipo+openmp~rocm+shared~tests build_type=RelWithDebInfo cuda_arch=75

Am I missing something?

artv3 commented 2 years ago

For line 8, can you try RAJA::cuda_exec?

davidbeckingsale commented 2 years ago

It looks like your code is being compiled as CXX. You need to compile it as CUDA device code, e.g. with nvcc. For CMake, you can try:

set_source_files_properties(main.cpp PROPERTIES LANGUAGE CUDA)

and make sure that you have CUDA in the LANGUAGES section of the CMake project command.
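
As a rough sketch of that suggestion, a minimal CMakeLists.txt might look like the following. The target name axpy, the file main.cpp, and the architecture value 75 come from the build log and Spack spec above. The minimum CMake version, the find_package calls, and the imported target names RAJA and chai are assumptions about how the installed packages export themselves and may differ on your system.

```cmake
# Assumed minimum version; CUDA_ARCHITECTURES requires CMake 3.18 or newer.
cmake_minimum_required(VERSION 3.18)

# CUDA must be listed here so CMake enables nvcc for this project.
project(axpy LANGUAGES CXX CUDA)

# Assumption: the Spack-installed packages provide CMake config files
# that are discoverable (e.g. via CMAKE_PREFIX_PATH).
find_package(RAJA REQUIRED)
find_package(chai REQUIRED)

add_executable(axpy main.cpp)

# Treat the .cpp file as CUDA so it is compiled by nvcc rather than the host compiler.
set_source_files_properties(main.cpp PROPERTIES LANGUAGE CUDA)

# Matches cuda_arch=75 from the Spack spec.
set_property(TARGET axpy PROPERTY CUDA_ARCHITECTURES 75)

# Assumed imported target names.
target_link_libraries(axpy PRIVATE RAJA chai)
```

Renaming the file to main.cu would have the same effect as the source-file property, since CMake treats .cu files as CUDA by default.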

amklinv-nnl commented 2 years ago

Thank you for the advice. I now get an error about --extended-lambda. Have I configured RAJA incorrectly, or is this something I have to add to my flags manually?

[ 50%] Building CUDA object CMakeFiles/axpy.dir/main.cpp.o
/home/amklinv/spack/opt/spack/linux-ubuntu20.04-skylake/gcc-11.2.0/cuda-11.4.4-7vyozxcmd5tsqdtugol7sc26ydzstqop/bin/nvcc -forward-unknown-to-host-compiler -DCAMP_HAVE_CUDA -isystem=/home/amklinv/spack/opt/spack/linux-ubuntu20.04-skylake/gcc-11.2.0/umpire-6.0.0-vzkbb7g3yc57jqa5xwodjynhrx5z2azs/include -isystem=/home/amklinv/spack/opt/spack/linux-ubuntu20.04-skylake/gcc-11.2.0/chai-2.4.0-6rvkhgeq3bqku2nztmx5vt6co6wxbp5o/include -isystem=/home/amklinv/spack/opt/spack/linux-ubuntu20.04-skylake/gcc-11.2.0/raja-0.14.0-y5e33jodwbkc3xvhykcz6bwfb44q42ch/include -isystem=/home/amklinv/spack/opt/spack/linux-ubuntu20.04-skylake/gcc-11.2.0/cuda-11.4.4-7vyozxcmd5tsqdtugol7sc26ydzstqop/include -isystem=/home/amklinv/spack/opt/spack/linux-ubuntu20.04-skylake/gcc-11.2.0/camp-0.2.2-uh6zbbsk2gjaoxw4lucog7venptpovbw/include -Xcompiler=-fopenmp -std=c++17 -MD -MT CMakeFiles/axpy.dir/main.cpp.o -MF CMakeFiles/axpy.dir/main.cpp.o.d -x cu -c /home/amklinv/gpu-programming-models/main/axpy/main.cpp -o CMakeFiles/axpy.dir/main.cpp.o
/home/amklinv/gpu-programming-models/main/axpy/main.cpp(23): error: __host__ or __device__ annotation on lambda requires --extended-lambda nvcc flag

rhornung67 commented 2 years ago

You shouldn't have to add the --extended-lambda flag. We add it here: https://github.com/LLNL/RAJA/blob/develop/cmake/SetupCompilers.cmake#L46
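
The nvcc command line in the log above does not include --extended-lambda, so the flag may not be reaching the consuming target in this setup. One possible workaround, not confirmed in this thread, is to pass it explicitly for CUDA sources only. The target name axpy is taken from the build log; whether this is needed at all depends on how the installed RAJA exports its compile options.

```cmake
# Hypothetical workaround: add nvcc's extended-lambda flag to the axpy target,
# restricted to CUDA sources so the host-only compiler never sees it.
target_compile_options(axpy PRIVATE
  $<$<COMPILE_LANGUAGE:CUDA>:--extended-lambda>)
```

The generator expression keeps the flag away from any plain C++ sources in the same target, which would otherwise reject it as an unknown option.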