UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

single and double floating-point numbers are confused #98

Closed zjin-lcf closed 3 years ago

zjin-lcf commented 3 years ago

~/BabelStream/build$ ./sycl-stream --float --device 2
BabelStream
Version: 3.4
Implementation: SYCL
Running kernels 100 times
Precision: float
Array size: 134.2 MB (=0.1 GB)
Total size: 402.7 MB (=0.4 GB)
Using SYCL device Intel(R) Iris(R) Xe Graphics
terminate called after throwing an instance of 'cl::sycl::compile_program_error'
what(): The program was built for 1 devices
Build program log for 'Intel(R) Iris(R) Xe Graphics':

error :double type is not supported on this platform
in kernel: 'typeinfo name for sycl_kernels::copy'
error: backend compiler failed build.

Thanks

tomdeakin commented 3 years ago

Thanks for reporting this. I had forgotten that support for FP64 on the device is optional in SYCL just as it is in OpenCL. In the OpenCL code, we have to check that the double version will work.

I've added a SYCL 1.2.1 solution to the current version of the code in #100 in the issue-98 branch. Can you check if this works on your device please?

SYCL 2020 makes this check a bit nicer, so I'll update #77 with a similar check ready for when we merge that in.

zjin-lcf commented 3 years ago

Sorry, the issue may not really be fixed.

./sycl-stream --float --device device_id

error :double type is not supported on this platform
in kernel: 'typeinfo name for sycl_kernels::init'
error: backend compiler failed build.

error :double type is not supported on this platform
in kernel: 'typeinfo name for sycl_kernels::init'
error: backend compiler failed build.

error :double type is not supported on this platform
in kernel: 'typeinfo name for sycl_kernels::init'
error: backend compiler failed build.
-11 (CL_BUILD_PROGRAM_FAILURE)
Aborted (core dumped)

zjin-lcf commented 3 years ago

Sorry, I didn't check it before you closed the issue.

tomdeakin commented 3 years ago

It auto-closed when I merged the PR, sorry!

tomdeakin commented 3 years ago

Thanks for testing @zjin-lcf. I think the problem now is that both float and double have template instantiations (the final two lines of SYCLStream.cpp), so both are being passed to the SYCL compiler at runtime. The compiler must be running before this code selects the device and exits if the feature is not supported.

I need to double check the SYCL specifications to find out if there is anything we can do in application code at compile time; but I suspect not.
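To illustrate the explanation above, here is a minimal sketch in plain C++ (no SYCL) of how explicit template instantiation generates code for both precisions regardless of which one is selected at run time. The names `Stream`, `dot` and `run` are hypothetical and are not taken from SYCLStream.cpp; the assumption is that the SYCL build behaves analogously, with the kernels of both instantiations ending up in the device code handed to the backend compiler.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a templated benchmark class.
template <typename T>
class Stream {
public:
    explicit Stream(std::size_t n) : a_(n, T(1)) {}

    // Stand-in for a kernel whose generated code depends on T.
    T dot() const {
        T sum = T(0);
        for (T v : a_) sum += v * v;
        return sum;
    }

    static std::string precision() {
        return sizeof(T) == sizeof(float) ? "float" : "double";
    }

private:
    std::vector<T> a_;
};

// Explicit instantiations: the compiler emits code for BOTH types here,
// whether or not the program later uses them.
template class Stream<float>;
template class Stream<double>;

// The precision is only chosen at run time, long after both
// instantiations have been generated.
std::string run(bool use_float, std::size_t n) {
    if (use_float) {
        Stream<float> s(n);
        (void)s.dot();
        return Stream<float>::precision();
    }
    Stream<double> s(n);
    (void)s.dot();
    return Stream<double>::precision();
}
```

Calling `run(true, n)` only executes the float path, but the double code still exists in the binary. If the same holds for the device code, that would explain why a runtime `--float` flag cannot prevent the double kernels from reaching the backend compiler.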

tomdeakin commented 3 years ago

@tom91136 also suggested that you could try to enable FP64 emulation on the DevCloud Xe GPUs with the following environment variables:

export OverrideDefaultFP64Settings=1 
export IGC_EnableDPEmulation=1

This should silence the compiler issues and run the float code. The double will probably run, just in emulated mode.

zjin-lcf commented 3 years ago

Thank you for your suggestion. Running the program shows the message:

Validation failed on sum. Error 3.64297e-06

It seems that the option --arraysize does not change the number of array elements.

Could you reproduce that?

Thanks

tomdeakin commented 3 years ago

Issue #20 summarises the problems with the dot product kernel and single precision. There isn't a good solution that we've found.

zjin-lcf commented 3 years ago

Okay. I assume that the error bounds are different for single precision and double precision. People may just care about bandwidth for the benchmark, though.

tomdeakin commented 3 years ago

Same error bounds (1.0E-8), but that might be a good way to account for the difference. I agree it's off-putting to have an error about correct values. I'll make a note in #20 with this suggestion. Thanks!
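The suggestion of precision-dependent error bounds could be sketched as follows. This is not BabelStream's validation code: the helper names `tolerance` and `sum_is_valid` are hypothetical, the factor of 100 is an arbitrary assumption, and the reported error is assumed to be a relative error.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: a tolerance that scales with the precision of T.
// The factor 100 is an assumed safety margin, not a value from BabelStream.
template <typename T>
constexpr double tolerance() {
    return 100.0 * static_cast<double>(std::numeric_limits<T>::epsilon());
}

// Hypothetical check: true when the relative error of the computed sum
// is within the precision-dependent tolerance.
template <typename T>
bool sum_is_valid(T computed, double expected) {
    const double rel =
        std::fabs((static_cast<double>(computed) - expected) / expected);
    return rel <= tolerance<T>();
}
```

Under these assumptions the float tolerance is roughly 1.2e-5, so an error of 3.64297e-06 would pass in single precision, while the double tolerance (roughly 2.2e-14) stays stricter than the current fixed bound of 1.0E-8.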