alpaka-group / alpaka

Abstraction Library for Parallel Kernel Acceleration :llama:
https://alpaka.readthedocs.io
Mozilla Public License 2.0

Issues Compiling with NVCC Host Compiler #1991

Open GNiendorf opened 1 year ago

GNiendorf commented 1 year ago

I am trying to move my code to a newer version of Alpaka (from 0.7 to 0.9 or above), but I'm running into a compilation issue. My code is compiled with NVCC, but through its host compiler (so the files are .cc rather than .cu). I get the following error, which was not present with the previous version of Alpaka I was using. Edit: I should mention that this is with -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED turned on:

/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/alpaka/develop-20220621-4e96939afa0cdb62448c73ead2bb07e0/include/alpaka/kernel/Traits.hpp:228:115: error: expected primary-expression before ')' token
  228 |             std::is_trivially_copyable_v<TKernelFnObj> || __nv_is_extended_device_lambda_closure_type(TKernelFnObj)
      |                                                                                                                   ^
/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/alpaka/develop-20220621-4e96939afa0cdb62448c73ead2bb07e0/include/alpaka/kernel/Traits.hpp:228:59: error: there are no arguments to '__nv_is_extended_device_lambda_closure_type' that depend on a template parameter, so a declaration of '__nv_is_extended_device_lambda_closure_type' must be available [-fpermissive]
  228 |             std::is_trivially_copyable_v<TKernelFnObj> || __nv_is_extended_device_lambda_closure_type(TKernelFnObj)
      |                                                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/alpaka/develop-20220621-4e96939afa0cdb62448c73ead2bb07e0/include/alpaka/kernel/Traits.hpp:228:59: note: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/alpaka/develop-20220621-4e96939afa0cdb62448c73ead2bb07e0/include/alpaka/kernel/Traits.hpp:229:81: error: expected primary-expression before ')' token
  229 |                 || __nv_is_extended_host_device_lambda_closure_type(TKernelFnObj),
      |                                                                                 ^
/cvmfs/cms.cern.ch/el8_amd64_gcc10/external/alpaka/develop-20220621-4e96939afa0cdb62448c73ead2bb07e0/include/alpaka/kernel/Traits.hpp:229:20: error: there are no arguments to '__nv_is_extended_host_device_lambda_closure_type' that depend on a template parameter, so a declaration of '__nv_is_extended_host_device_lambda_closure_type' must be available [-fpermissive]
  229 |                 || __nv_is_extended_host_device_lambda_closure_type(TKernelFnObj),

It seems that the #ifdef below is being satisfied, and compilation fails because these functions are not defined when using nvcc's host compiler. Any help is appreciated!

https://github.com/alpaka-group/alpaka/blob/78e984d4633caa52658a106b0f1d974ddbfd3ed9/include/alpaka/kernel/Traits.hpp#L219-L235
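To illustrate the suspected mechanism, here is a minimal sketch of a guard that only uses the nvcc intrinsics when nvcc is actually compiling CUDA source. It assumes the documented nvcc behaviour that `__NVCC__` is defined for every file nvcc drives (including .cc files handed to the host compiler), while `__CUDACC__` is defined only for CUDA source files. The trait name `isKernelCopyable` and the type `NonTrivial` are hypothetical; this is not alpaka's actual code or its eventual fix.

```cpp
#include <type_traits>

// Hypothetical trait for illustration only, not part of alpaka.
#if defined(__NVCC__) && defined(__CUDACC__)
// nvcc is compiling this file as CUDA source, so the closure-type
// intrinsics are known to the compiler front end.
template<typename T>
inline constexpr bool isKernelCopyable = std::is_trivially_copyable_v<T>
    || __nv_is_extended_device_lambda_closure_type(T)
    || __nv_is_extended_host_device_lambda_closure_type(T);
#else
// Plain host compilation (for example a .cc file passed through nvcc):
// the intrinsics are not declared, so only the standard trait is used.
template<typename T>
inline constexpr bool isKernelCopyable = std::is_trivially_copyable_v<T>;
#endif

// Hypothetical type with a user-provided copy constructor.
struct NonTrivial
{
    NonTrivial() = default;
    NonTrivial(NonTrivial const&) {}
};

static_assert(isKernelCopyable<int>);
static_assert(!isKernelCopyable<NonTrivial>);
```

With a guard that tests only `__NVCC__`, the first branch would also be selected for .cc files, which would match the "must be available" errors above.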

psychocoderHPC commented 1 year ago

In newer alpaka versions we check the CUDA requirement that a kernel and all arguments passed to the kernel must be trivially copyable. In your case, one of the arguments you passed into the kernel is not trivially copyable. Unfortunately, it is hard to see which argument it is if you have many.
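As a small illustration of what "trivially copyable" means for kernel arguments, the sketch below uses only the standard `std::is_trivially_copyable_v` trait. The types `PlainParams` and `OwningParams` are hypothetical and stand in for user-defined argument types.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical argument type: only scalar members, so a byte-wise copy is valid.
struct PlainParams
{
    float scale;
    int count;
};

// Hypothetical argument type: std::vector has a non-trivial copy constructor,
// so the enclosing struct is not trivially copyable either.
struct OwningParams
{
    std::vector<float> data;
};

static_assert(std::is_trivially_copyable_v<PlainParams>);
static_assert(std::is_trivially_copyable_v<float*>); // raw pointers are fine
static_assert(!std::is_trivially_copyable_v<OwningParams>);
```

A type like `OwningParams` would typically be passed as a raw pointer plus a size instead.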

For PIConGPU I changed the code in the past and iterated recursively over the template parameter pack to find the non-copyable argument.
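A rough sketch of that idea, not the PIConGPU or alpaka implementation: check each type in the pack on its own, so the compiler diagnostic names the offending type. All names here (`CheckArg`, `checkAllArgs`, `firstNonTriviallyCopyable`) are hypothetical, and the sketch uses a C++17 fold expression and a loop rather than explicit recursion.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Instantiating CheckArg<T> for an offending T fails with a diagnostic
// that mentions T, which points at the argument to fix.
template<typename T>
struct CheckArg
{
    static_assert(std::is_trivially_copyable_v<T>, "kernel argument type is not trivially copyable");
    static constexpr bool value = true;
};

// Instantiates CheckArg for every type in the pack.
template<typename... Ts>
constexpr bool checkAllArgs()
{
    return (CheckArg<Ts>::value && ...);
}

// Returns the index of the first type that is not trivially copyable,
// or sizeof...(Ts) if every type passes.
template<typename... Ts>
constexpr std::size_t firstNonTriviallyCopyable()
{
    // The trailing 'true' keeps the array non-empty for an empty pack.
    constexpr bool ok[] = {std::is_trivially_copyable_v<Ts>..., true};
    std::size_t i = 0;
    while(i < sizeof...(Ts) && ok[i])
        ++i;
    return i;
}

static_assert(checkAllArgs<int, float*, double>());
static_assert(firstNonTriviallyCopyable<int, std::vector<int>, float>() == 1);
```

Calling `checkAllArgs` with the decayed argument types of a kernel launch would stop compilation at the first offending type.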

psychocoderHPC commented 1 year ago

I opened #1992 and will check whether I can provide a PR tomorrow to help you with the debugging.

psychocoderHPC commented 1 year ago

As soon as #1993 is merged, finding the argument that is not trivially copyable should be easier.

j-stephan commented 1 year ago

While the work in #1993 is certainly useful, isn't this failing for the kernel itself? So the thing that encapsulates operator()(TAcc acc, ...) isn't trivially copyable.

psychocoderHPC commented 1 year ago

> While the work in #1993 is certainly useful, isn't this failing for the kernel itself? So the thing that encapsulates operator()(TAcc acc, ...) isn't trivially copyable.

Only user arguments will be checked. The accelerator is created on the device side and is not copied to the device, so the requirement does not apply to the accelerator, which is not allowed to be copied.
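The distinction can be sketched as follows, under the assumption that the check is applied to the kernel function object and the user arguments but never to the accelerator type. `FakeAcc` and `ScaleKernel` are hypothetical stand-ins, not alpaka types.

```cpp
#include <type_traits>

// Hypothetical stand-in for an accelerator: it cannot be copied.
struct FakeAcc
{
    FakeAcc() = default;
    FakeAcc(FakeAcc const&) = delete;
    FakeAcc& operator=(FakeAcc const&) = delete;
};

// Hypothetical kernel function object. The accelerator only appears as a
// reference parameter of operator(), so it is not part of the object's state.
struct ScaleKernel
{
    float factor;

    template<typename TAcc>
    void operator()(TAcc const& /*acc*/, float* data, int n) const
    {
        for(int i = 0; i < n; ++i)
            data[i] *= factor;
    }
};

// The accelerator itself is not copyable ...
static_assert(!std::is_copy_constructible_v<FakeAcc>);
// ... but the kernel object and its user arguments still pass the check.
static_assert(std::is_trivially_copyable_v<ScaleKernel>);
static_assert(std::is_trivially_copyable_v<float*>);
```

A kernel object fails the check only when its own data members are not trivially copyable, regardless of the accelerator type it is invoked with.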