Open paboyle opened 6 days ago
On a hunch, I worried about the
-fsycl-device-code-split=per_kernel
flag I've been using. Removing it didn't help, but the sycl-post-link subcommand line changed from
-split=kernel
to
-split=auto
If I instead remove the -split from the sycl-post-link subcommand entirely this succeeds:
Fails:
/opt/aurora/24.086.0/CNDA/oneapi/compiler/eng-20240227/bin/compiler/sycl-post-link -split=auto -emit-only-kernels-as-entry-points -emit-param-info -symbols -emit-exported-symbols -split-esimd -lower-esimd -O3 -spec-const=native -device-globals -o Benchmark_dwf_fp32-sycl-spir64-unknown-unknown.table Benchmark_dwf_fp32-sycl-spir64-unknown-unknown-9d8e06.bc
sycl-post-link: device_global variable '__DeviceType' with property "device_image_scope" is used in more than one device image.
Succeeds:
/opt/aurora/24.086.0/CNDA/oneapi/compiler/eng-20240227/bin/compiler/sycl-post-link -emit-only-kernels-as-entry-points -emit-param-info -symbols -emit-exported-symbols -split-esimd -lower-esimd -O3 -spec-const=native -device-globals -o Benchmark_dwf_fp32-sycl-spir64-unknown-unknown.table Benchmark_dwf_fp32-sycl-spir64-unknown-unknown-9d8e06.bc
So it appears the device_global strategy used by the SYCL -fsanitize=address utility is incompatible with most of the kernel split strategies, including the default?
And specifically the problem is in variables declared in:
./libdevice/sanitizer_utils.cpp:DeviceGlobal<DeviceType> __DeviceType;
./libdevice/sanitizer_utils.cpp: if (__DeviceType == DeviceType::CPU) {
./libdevice/sanitizer_utils.cpp: } else if (__DeviceType == DeviceType::GPU_PVC) {
./libdevice/sanitizer_utils.cpp: __spirv_ocl_printf(__asan_print_unsupport_device_type, (int)__DeviceType);
When the kernel-to-device translation gets 'split', I guess it no longer appears as a global?
I can work around this in 2 ways:
-fsycl-device-code-split=off
But this forces a huge and expensive overhead on the first kernel call.
-Xarch_host -fsanitize=address
I've successfully run host address sanitization on the problem I was debugging. I had to switch off MPI because it was throwing a false positive, but a single process ran clean through ASAN with no errors, and all leaks were understood as allocate-once objects (plus many MPI leaks which I can do nothing about).
I think this issue is still a problem for device code sanitization, but it is no longer a barrier for my own needs.
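For reference, the two workarounds can be sketched as flag sets. This is only an illustration: the -fsycl flag, the variable names and the app.cpp placeholder are assumptions, not taken from my actual build, and the commands are echoed rather than run.

```shell
# Workaround 1: keep -fsanitize=address but disable device code splitting.
# Cost noted above: a large overhead on the first kernel call.
CXXFLAGS_NOSPLIT="-fsycl -fsanitize=address -fsycl-device-code-split=off"

# Workaround 2: apply the sanitizer to the host compilation only,
# so device code is not instrumented.
CXXFLAGS_HOSTASAN="-fsycl -Xarch_host -fsanitize=address"

# Hypothetical invocations (app.cpp is a placeholder); printed, not executed.
echo "icpx $CXXFLAGS_NOSPLIT app.cpp -o app"
echo "icpx $CXXFLAGS_HOSTASAN app.cpp -o app"
```

The same flags would need to go into both CXXFLAGS and LDFLAGS if the build passes them separately.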
Describe the bug
Hi,
When I compile my library with -fsanitize=address, in both CXXFLAGS and LDFLAGS, under icpx on Sunspot, I get:
sycl-post-link: device_global variable '__DeviceType' with property "device_image_scope" is used in more than one device image.
Is this an error in the compiler?
I dug in as much as I could
Added
-v -save-temps
to the link command, and found that the failing sub-command is:
Lots of .o temps were created in the current directory when I did this:
Of these (and my own application library) the only one containing __DeviceType is
libsycl-sanitizer-sycl-spir64-unknown-unknown
It feels to me that libsycl-sanitizer is associated with the failure.
I can find a version of this under the compiler tree, as follows, with the offending variable highlighted:
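A rough sketch of how the temps can be searched for the symbol, assuming the -save-temps output sits in one directory and the symbol name appears as plain bytes in the files. The helper name find_symbol_files is made up for illustration.

```shell
# Print the names of files in a directory whose contents mention a symbol.
# -l lists matching file names only; -F treats the symbol as a fixed string.
# Errors (e.g. subdirectories) are discarded, and "no match" is not an error.
find_symbol_files() {
    symbol=$1
    dir=${2:-.}
    grep -lF -- "$symbol" "$dir"/* 2>/dev/null || true
}

# Search the current directory for the device_global named in the error.
find_symbol_files '__DeviceType' .
```

Because -l only reports file names, this also works on binary temps where grep would otherwise just print "Binary file matches".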
Any ideas how to avoid this and make the address sanitizer work?
To reproduce
Environment
Linux, Intel PVC