Closed AuroraPerego closed 2 months ago
Device global variables in oneAPI versions 2023.* explicitly require the device image scope in the constructor, while from 2024.0 onwards this is no longer needed. For this reason, the required compiler version has been changed to 2024.0.
Now the only failures in the tests are some warnings (that become errors when compiling the tests) in the SYCL headers/footers.
@psychocoderHPC can we remove -Wreserved-identifier -Wold-style-cast from the flags passed to icpx?
The offending code is autogenerated and, at least for -Wreserved-identifier, is included before any user or alpaka code, so we cannot disable the warnings with a #pragma in the code.
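For context, this is roughly what the #pragma route looks like when the offending code is under our control. It is only a sketch: truncateToInt is a hypothetical function, not something from alpaka or the SYCL headers. The suppression only covers code that appears after the push, which is why it cannot reach autogenerated code that the compiler includes before any user code.

```cpp
// Suppress -Wold-style-cast only for the region between push and pop.
// icpx is clang-based, so the clang diagnostic pragmas are assumed to apply.
#if defined(__clang__)
#    pragma clang diagnostic push
#    pragma clang diagnostic ignored "-Wold-style-cast"
#endif

// Hypothetical function standing in for code that uses a C-style cast.
inline int truncateToInt(double x)
{
    return (int) x; // would trigger -Wold-style-cast without the pragma
}

#if defined(__clang__)
#    pragma clang diagnostic pop
#endif
```

Anything included ahead of the push line is compiled with the command-line warning flags unchanged, so dropping the flags for icpx is the remaining option in that case.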
Discussed offline: yes, we can remove the options if necessary.
@alpaka-group/alpaka-maintainers IMO we can merge this PR. Are there any voices against merging it?
OK for me.
Any further developments can happen in follow-up PRs.
This tries to fix #2070. This implementation changes the current API. Device memory is declared with the macro ALPAKA_STATIC_ACC_MEM_GLOBAL(type, name), which wraps the CUDA/HIP/serial variables in a struct alpaka::DevGlobal and declares a sycl::ext::oneapi::experimental::device_global<type> for SYCL. The inline attribute is used to ensure that only one instance of that variable exists across different translation units. name is used for the memcpy, while name.get() must be used in the kernel to align with the behavior of the SYCL backend. The memcpy has been specialized for the device global variables.
The test with the SYCL backend failed with the original KernelExecutionFixture because it creates a new queue instead of using the one used for the memcpy. I added a constructor that also takes the queue as input.
Another issue with the test is that, because it is compiled with the flags that enable the SYCL backend, the macro for the device global variable expands to the SYCL one (sycl::ext::oneapi::experimental::device_global<type>) and therefore it fails when running on the AccCpuSerial. I have disabled this accelerator just for this test. Note that this also happens with the CUDA backend (with the macro expanded to __device__ alpaka::DevGlobal<type> name), but for some strange reason it works on the serial backend.
Thanks to @fwyzard for the help :)
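To make the wrapper-plus-macro pattern in the description easier to picture, here is a host-only sketch in plain C++17. Every name in it (DevGlobalSketch, STATIC_MEM_GLOBAL_SKETCH, copyToGlobal, kernelBody, g_counter) is hypothetical and only mimics the shape of alpaka::DevGlobal and ALPAKA_STATIC_ACC_MEM_GLOBAL. It is not the code from this PR and does not touch any device.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for alpaka::DevGlobal: wraps the value and exposes
// get(), mirroring how a SYCL device_global is accessed.
template<typename T>
struct DevGlobalSketch
{
    T value{};

    T& get() { return value; }
    T const& get() const { return value; }
};

// Hypothetical analogue of ALPAKA_STATIC_ACC_MEM_GLOBAL(type, name).
// A C++17 inline variable yields a single instance across translation units.
#define STATIC_MEM_GLOBAL_SKETCH(type, name) inline DevGlobalSketch<type> name

STATIC_MEM_GLOBAL_SKETCH(int, g_counter);

// Copy helper overloaded on the wrapper type: the caller passes the variable
// by name, and the helper unwraps it internally.
template<typename T>
void copyToGlobal(DevGlobalSketch<T>& dst, T const& src)
{
    static_assert(std::is_trivially_copyable_v<T>, "memcpy needs a trivially copyable type");
    std::memcpy(&dst.get(), &src, sizeof(T));
}

// Stand-in for a kernel body: inside the kernel the value is read via get().
int kernelBody(int increment)
{
    return g_counter.get() + increment;
}
```

In this sketch the copy side takes g_counter itself, while the kernel side goes through g_counter.get(), which is the same split between name and name.get() that the description asks for.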