UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

SYCL build fails on Intel ARC GPU #147

Closed ethanjjjjjjj closed 1 year ago

ethanjjjjjjj commented 1 year ago

I'm running on an Intel Arc A770 with the latest git build of the Intel compute runtime and the 2022.2.0 oneAPI compiler. I have managed to build and run other SYCL kernels on this device with this combination of runtime and compiler.

It looks to me like the program is trying to launch a double-precision kernel on this device. Doubles are unsupported by this GPU, and I have been explicit about selecting float in the launch args.

The kernels seem to run okay when I comment out template class SYCLStream<double>; at the end of SYCLStream.cpp and explicitly call run<float>(); within main.cpp, but I can't find what exactly is causing it to try to launch double-precision kernels.
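To illustrate why selecting float at run time does not by itself remove double code from the binary, here is a minimal plain C++ sketch. It is not the BabelStream source: `Stream`, `run`, and `run_selected` are hypothetical stand-ins, and no SYCL is involved. It only shows that explicit instantiations are compiled for every listed type, whichever one is later chosen by a flag.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a templated stream class (assumption, not SYCLStream).
template <typename T>
struct Stream {
    std::size_t element_size() const { return sizeof(T); }
};

// Explicit instantiations: code for BOTH types is generated at compile time,
// regardless of which one the program ends up using at run time.
template struct Stream<float>;
template struct Stream<double>;

// Templated entry point, analogous in spirit to run<float>() / run<double>().
template <typename T>
std::size_t run() {
    return Stream<T>{}.element_size();
}

// Run-time selection, analogous to passing a --float flag.
// Both branches must exist in the binary for this to compile.
std::size_t run_selected(bool use_float) {
    return use_float ? run<float>() : run<double>();
}
```

Calling `run_selected(true)` only executes the float path, but the double instantiation is still present in the compiled program, which is the situation described above.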

cmake -DMODEL=sycl -DSYCL_COMPILER=ONEAPI-DPCPP ..

make

./sycl-stream --device 2 --float

BabelStream
Version: 4.0
Implementation: SYCL
Running kernels 100 times
Precision: float
Array size: 134.2 MB (=0.1 GB)
Total size: 402.7 MB (=0.4 GB)
Using SYCL device Intel(R) Graphics [0x56a0]
Driver: 22.43.24558
Reduction kernel config: 2048 groups of size 1024
terminate called after throwing an instance of 'cl::sycl::compile_program_error'
what(): The program was built for 1 devices
Build program log for 'Intel(R) Graphics [0x56a0]':

error: Double type is not supported on this platform.
in kernel: 'typeinfo name for cl::sycl::detail::__pf_kernel_wrapper<sycl_kernels::copy >'
error: backend compiler failed build.

ethanjjjjjjj commented 1 year ago

Exporting IGC_EnableDPEmulation=1 allows the float kernels to run.

tomdeakin commented 1 year ago

I think this is a bug in Intel's runtime: the binary should be valid with double-precision kernels as long as you don't submit them to a device which doesn't support them, and you're not doing that when you use the --float flag.

Thanks for reminding me to raise this as a bug. We asked the community forums (see https://community.intel.com/t5/Intel-DevCloud/Iris-Xe-MAX-node-is-missing-double-precision-support/td-p/1247876) but I should raise it as a proper bug directly. I think I have enough info from your issue to file it.

tomdeakin commented 1 year ago

I've submitted a priority support ticket.

AlexeySachkov commented 1 year ago

Hi folks,

tomdeakin wrote: "I think this is a bug in Intel's runtime - the binary should be valid with double precision kernels as long as you don't submit them to a device which doesn't support it - which you're not by using the --float flag."

Yeah, for a while we had no implementation of optional kernel features, but the situation improved recently with intel/llvm#7302. Starting with the 2022-11-22 DPC++ daily build, the compiler automatically splits kernels into several modules based on the optional features used in each kernel.

So, with a newer version of the toolchain the app should run without any workarounds. For older versions of the toolchain, another workaround is available: pass the -fsycl-device-code-split=per_kernel compiler option to outline every kernel into a separate device image, which avoids speculative compilation.
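The effect of splitting can be pictured with a toy model in plain C++. This is only a sketch under stated assumptions: `Kernel`, `DeviceImage`, and the helper functions are hypothetical names invented for illustration, not DPC++ or SYCL APIs, and the real runtime is far more involved. The model assumes an image is compiled as one unit, so a single double-precision kernel makes the whole image fail to build on a device without fp64 support.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these types exist in SYCL or DPC++ (assumption).
struct Kernel {
    std::string name;
    bool uses_fp64;  // true if the kernel uses double precision
};

// A "device image" here is a bundle of kernels compiled together as one unit.
using DeviceImage = std::vector<Kernel>;

// An image builds only if the device supports every feature used by any kernel in it.
bool build_image(const DeviceImage& image, bool device_has_fp64) {
    for (const Kernel& k : image) {
        if (k.uses_fp64 && !device_has_fp64) return false;
    }
    return true;
}

// Launching a kernel requires building the whole image that contains it.
bool can_launch(const std::vector<DeviceImage>& images,
                const std::string& kernel_name, bool device_has_fp64) {
    for (const DeviceImage& image : images) {
        for (const Kernel& k : image) {
            if (k.name == kernel_name) return build_image(image, device_has_fp64);
        }
    }
    return false;  // kernel not found in any image
}

// No splitting: every kernel lands in a single image.
std::vector<DeviceImage> single_image(const std::vector<Kernel>& kernels) {
    std::vector<DeviceImage> images;
    images.push_back(kernels);
    return images;
}

// Per-kernel splitting: each kernel gets its own image.
std::vector<DeviceImage> split_per_kernel(const std::vector<Kernel>& kernels) {
    std::vector<DeviceImage> images;
    for (const Kernel& k : kernels) images.push_back(DeviceImage{k});
    return images;
}

// Feature-based splitting: kernels are grouped by whether they use fp64.
std::vector<DeviceImage> split_by_feature(const std::vector<Kernel>& kernels) {
    DeviceImage without_fp64, with_fp64;
    for (const Kernel& k : kernels) (k.uses_fp64 ? with_fp64 : without_fp64).push_back(k);
    std::vector<DeviceImage> images;
    if (!without_fp64.empty()) images.push_back(without_fp64);
    if (!with_fp64.empty()) images.push_back(with_fp64);
    return images;
}
```

In this model, a float kernel bundled with a double kernel cannot be launched on a device without fp64, while either splitting strategy lets the float kernel build on its own. The double kernel still fails on such a device in every configuration, as expected.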

ethanjjjjjjj commented 1 year ago

Interesting, thanks both