I added a queue argument for vecmem::sycl::copy. It is not done in a very clean way, but this all needs to be cleaned up anyway (as discussed with @CrossR), hopefully soon in an MR from him or me.
With this I can get the alpaka seeding example to work on the CERN GPU Flex 170:
==> Statistics ...
- read 77602 spacepoints
- created (cpu) 0 seeds
- created (alpaka) 66540 seeds
==>Elapsed times...
Hit reading (cpu) 1012 ms
Seeding (alpaka) 6472 ms
Track params (alpaka) 1847 ms
Wall time 9334 ms
It's very slow compared to the CUDA implementation, but I think this is largely because I had to enable FP64 emulation:
Some Intel GPUs do not support the double type for device code. alpaka will not check this.
You can enable software emulation for double precision types with