zjin-lcf closed this 1 week ago
The upstream seems to be somewhat sceptical about __grid_constant__ and its applicability: https://discourse.llvm.org/t/rfc-hip-grid-constant-support/67759
I've pinged that thread. It feels like this really belongs upstream; once it's done there, we could add support in SYCL quite easily.
There seems to be no performance improvement whether or not __grid_constant__ is added.
As of https://github.com/intel/llvm/pull/14332 (just merged), we implicitly add the grid_constant annotation to all by-value structure-type kernel parameters.
This will eventually have the same effect as manually decorating these parameters with __grid_constant__ in CUDA, but we don't give the user explicit control over it.
I say "eventually" as the way we currently lower SYCL functors in the front-end means this information is ultimately lost. We have some in-progress work to address this: see https://github.com/intel/llvm/pull/14434.
Is this an optimization feature for SYCL?
If the address of a __global__ function parameter is taken, the compiler will ordinarily make a copy of the kernel parameter in thread local memory and use the address of the copy, to partially support C++ semantics, which allow each thread to modify its own local copy of function parameters. Annotating a __global__ function parameter with __grid_constant__ ensures that the compiler will not create a copy of the kernel parameter in thread local memory, but will instead use the generic address of the parameter itself. Avoiding the local copy may result in improved performance.
Reference: https://docs.nvidia.com/cuda/cuda-c-programming-guide/#grid-constant
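To make the quoted passage concrete, here is a minimal CUDA sketch of the situation it describes: two kernels that take the address of a by-value struct parameter, one without and one with __grid_constant__. The names Params, apply, kernel_plain, kernel_grid_constant, and count_mismatches are made up for illustration. It assumes a toolkit that supports the annotation and a build for compute capability 7.0 or newer (for example `nvcc -arch=sm_70`). It only checks that both kernels compute the same values. It does not measure performance, and whether the unannotated kernel really gets a local copy is up to the compiler.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical parameter bundle, passed to the kernels by value.
struct Params {
    float scale;
    float offset;
    int n;
};

// __noinline__ keeps this a real call, so each kernel below genuinely
// has to form the address of its parameter.
__device__ __noinline__ float apply(const Params* p, float x) {
    return p->scale * x + p->offset;
}

// No annotation: taking &p may make the compiler copy p into
// thread-local memory and pass the address of that copy.
__global__ void kernel_plain(const Params p, const float* in, float* out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < p.n) out[i] = apply(&p, in[i]);
}

// Annotated: p must be const-qualified and is read-only; &p refers to
// the kernel parameter itself instead of a per-thread copy.
__global__ void kernel_grid_constant(const __grid_constant__ Params p,
                                     const float* in, float* out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < p.n) out[i] = apply(&p, in[i]);
}

// Runs both kernels on the same input. Returns the number of elements
// whose results differ, or -1 if any CUDA call fails.
int count_mismatches(int n) {
    std::vector<float> h_in(n), h_plain(n), h_gc(n);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    const size_t bytes = n * sizeof(float);
    float *d_in = nullptr, *d_plain = nullptr, *d_gc = nullptr;
    bool ok = cudaMalloc(&d_in, bytes) == cudaSuccess &&
              cudaMalloc(&d_plain, bytes) == cudaSuccess &&
              cudaMalloc(&d_gc, bytes) == cudaSuccess;

    if (ok) {
        ok = cudaMemcpy(d_in, h_in.data(), bytes,
                        cudaMemcpyHostToDevice) == cudaSuccess;
    }
    if (ok) {
        const Params p{2.0f, 1.0f, n};
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        kernel_plain<<<blocks, threads>>>(p, d_in, d_plain);
        kernel_grid_constant<<<blocks, threads>>>(p, d_in, d_gc);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaDeviceSynchronize() == cudaSuccess;
    }
    if (ok) {
        ok = cudaMemcpy(h_plain.data(), d_plain, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess &&
             cudaMemcpy(h_gc.data(), d_gc, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts null pointers, so this is safe on every path.
    cudaFree(d_in);
    cudaFree(d_plain);
    cudaFree(d_gc);
    if (!ok) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_plain[i] != h_gc[i]) ++mismatches;
    }
    return mismatches;
}

int main() {
    const int mismatches = count_mismatches(1 << 20);
    if (mismatches < 0) {
        std::printf("CUDA error\n");
        return 1;
    }
    std::printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

To see whether the annotation changes anything for a given toolkit and architecture, one option is to compare the generated code for the two kernels (for instance the PTX from `nvcc -ptx`) and look for local-memory traffic in the unannotated one.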