Closed juhigupta0 closed 6 months ago
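You are accessing member variables captured by value (`[=]`) in the kernel. Inside a member function, `[=]` does not copy the members themselves. C++ accesses them through the captured `this` pointer, which points to host memory, and this results in an illegal memory access on the device.
To fix this, you can either:
- Capture copies of the members explicitly (e.g. `[Ndim_ = Ndim, ...](...)`)
- Declare them `static constexpr`. This would make the optimizer propagate their values and avoid any memory accesses.

Thank you for pointing it out. Much appreciated.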
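To illustrate the capture behaviour described in the reply, here is a minimal host-only sketch in plain C++ (no SYCL involved). The struct `MatMulBench` and its method names are hypothetical stand-ins for the benchmark class. Only `Ndim` and the `[Ndim_ = Ndim]` init-capture come from the reply. It shows that a `[=]` lambda created in a member function reads the member through `this`, while an init-capture holds its own copy.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the benchmark class; Ndim is an ordinary data member.
struct MatMulBench {
  std::size_t Ndim = 4;

  // [=] inside a member function captures `this`, not a copy of Ndim.
  // The lambda body is effectively `return this->Ndim;`.
  // (C++20 compilers may warn that implicit capture of `this` is deprecated.)
  auto capture_via_this() {
    return [=]() { return Ndim; };
  }

  // The init-capture copies the current value of Ndim into the lambda object,
  // so the lambda no longer depends on `this`.
  auto capture_by_copy() {
    return [Ndim_ = Ndim]() { return Ndim_; };
  }
};

// Alternative: a static constexpr member is not part of any object,
// so using it in a lambda needs neither a capture nor `this`.
struct MatMulBenchConstexpr {
  static constexpr std::size_t Ndim = 4;

  auto make_lambda() {
    return [=]() { return Ndim; };
  }
};
```

If the member is modified after the lambdas are created, the `[=]` version observes the new value (it dereferences `this`), whereas the init-capture version keeps the value it copied. In a device kernel, that dereference of a host `this` pointer is the illegal access; the copied value travels with the kernel object instead.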
I am working on an example that I am trying to integrate into and run with the sycl-bench suite. I get a runtime error when testing it with AdaptiveCpp on both the CUDA and HIP backends. When the same kernel is run as a standalone AdaptiveCpp application, it executes without any error. I see similar behavior with one of my other applications. One thing the two applications have in common is that both use an nd_range parallel_for.
The matrices are currently initialized as identity matrices. Please let me know if you have any idea what causes the error.
Application code:
Error logs when offloading to CUDA device:
Error logs when offloading to HIP device: