error: ‘CudaKernel’ is not a member of ‘RAJA::statement’

LLNL / RAJA

RAJA Performance Portability Layer (C++)

BSD 3-Clause "New" or "Revised" License


error: ‘CudaKernel’ is not a member of ‘RAJA::statement’ #1679

Closed lospampa closed 2 weeks ago

lospampa commented 2 weeks ago

Hi there, I am trying to compile the following code to run on an NVIDIA RTX 4090 GPU.

using KERNEL_POL = RAJA::KernelPolicy<
  RAJA::statement::CudaKernel<
    RAJA::statement::Tile<1, RAJA::tile_fixed, RAJA::cuda_block_y_loop,
      RAJA::statement::Tile<0, RAJA::tile_fixed, RAJA::cuda_block_x_loop,
        RAJA::statement::For<1, RAJA::cuda_thread_y_direct,
          RAJA::statement::For<0, RAJA::cuda_thread_x_direct,
            RAJA::statement::Lambda<0>
          >
        >
      >
    >
  >
>;

But I am receiving the following error:

error: ‘CudaKernel’ is not a member of ‘RAJA::statement’
   18 |     RAJA::statement::CudaKernel<
      |                      ^~~~~~

Do you know what is happening here? The application is parallelized with Kernels (it works when using HIP on the AMD platform).

Thank you for your attention.

artv3 commented 2 weeks ago

As a quick check, are you able to run the following example: https://github.com/LLNL/RAJA/blob/develop/examples/tut_matrix-multiply.cpp ? And do you see the CUDA example running? If not, RAJA may not have been configured correctly.

lospampa commented 2 weeks ago

I found that the problem was in how I was compiling. I don't know exactly why it compiles now, but I added the following line to the Makefile. It may be because of the "-forward-unknown-to-host-compiler" flag or the "-x cu".

"usr/local/cuda-12.4/bin/nvcc -forward-unknown-to-host-compiler -ccbin=/scratch/aflorenzon/llvm/build/bin/clang++ -x cu"

Thank you for your time.
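A plausible explanation for the fix, offered as an assumption rather than something confirmed in this thread: nvcc decides how to treat a file from its suffix, so a `.cpp` file is compiled as plain host C++ unless `-x cu` forces CUDA mode, and in host-only mode the CUDA-specific parts of a library can be hidden behind preprocessor guards. The minimal sketch below does not use RAJA at all. It only reports whether `__CUDACC__` (the macro nvcc defines when compiling CUDA source) is visible in the current translation unit; the function name `compile_mode` is hypothetical.

```cpp
#include <string>

// Reports how this translation unit was compiled.
// __CUDACC__ is defined by nvcc when it compiles a file as CUDA source
// (for example a .cu file, or any file when "-x cu" is passed).
// The function name is illustrative only; it is not part of RAJA or CUDA.
inline std::string compile_mode() {
#if defined(__CUDACC__)
  return "cuda";       // CUDA-guarded headers and types are visible here
#else
  return "host-only";  // CUDA-guarded declarations would be skipped
#endif
}
```

Printing `compile_mode()` from the same source file that defines the kernel policy, built with the same Makefile rule, is a quick way to check whether that file is actually being compiled as CUDA.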

artv3 commented 2 weeks ago

Gotcha, good find. I'll close this issue then if things are resolved. Please feel free to reach out with any other questions.

artv3 commented 2 weeks ago

@lospampa I do have one additional suggestion. Although I'm not familiar with the structure of your kernel, it may be more performant to skip the tiling and instead use global thread IDs, such as cuda/hip_global_x_direct. See the documentation: https://raja.readthedocs.io/en/develop/sphinx/user_guide/feature/policies.html#raja-loop-kernel-execution-policies

Also, it may be the case that we are missing an example for this...
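To make the "global thread ID" idea concrete, here is a host-side sketch in plain C++. It is neither RAJA nor CUDA code, and the names `visit_with_global_ids`, `n`, and `block_size` are hypothetical. It assumes that a direct global mapping assigns each loop index to the ID `block * block_size + thread`, with a bounds guard for the last, partially filled block.

```cpp
#include <cstddef>
#include <vector>

// Host-side emulation of a direct "global thread id" mapping.
// Every (block, thread) pair computes one global id; ids past the end of
// the range are skipped by the bounds guard.
// Assumes block_size > 0. Not RAJA or CUDA code; for illustration only.
std::vector<int> visit_with_global_ids(std::size_t n, std::size_t block_size) {
  std::vector<int> visits(n, 0);
  // Number of blocks needed to cover n indices (ceiling division).
  const std::size_t num_blocks = (n + block_size - 1) / block_size;
  for (std::size_t block = 0; block < num_blocks; ++block) {
    for (std::size_t thread = 0; thread < block_size; ++thread) {
      const std::size_t global_id = block * block_size + thread;
      if (global_id < n) {
        ++visits[global_id];  // stands in for the loop body
      }
    }
  }
  return visits;
}
```

Each index in the range is visited exactly once, with no explicit tile statements. That is the simplification the suggestion above is pointing at, although whether it is faster for a given kernel would need to be measured.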

lospampa commented 2 weeks ago

@artv3, thank you for your help. I will try it and let you know when I have the results.