Closed yxlao closed 1 year ago
Hi @yxlao . Could you confirm if the issue is visible on Ubuntu 20.04?
I have installed all the latest .deb packages from compute-runtime releases and from the GPGPU driver page.
What about the versions of the rest of the related packages: llvm-spirv, opencl-clang, gmmlib?
I can confirm the same issue on Intel Arc A380 and Ubuntu 22.04. I updated to the same kernel 6.0rc3 from https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.0-rc3/amd64 and use the drivers/runtime from the ghcr.io/intel/llvm/sycl_ubuntu2004_nightly container. I also downloaded /usr/lib/firmware/i915/dg2_dmc_ver2_07.bin and /usr/lib/firmware/i915/dg2_guc_70.4.1.bin from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915 but I am not sure how to get them loaded by the current i915 driver:
$ uname -a
Linux sap 6.0.0-060000rc3-generic #202208282331 SMP PREEMPT_DYNAMIC Sun Aug 28 23:34:06 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ sudo dmesg | grep -i i915
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.0.0-060000rc3-generic root=UUID=00b93c5a-1c6f-4506-835c-39661ff2b7de ro quiet splash i915.force_probe=56a5 vt.handoff=7
[ 0.325396] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.0.0-060000rc3-generic root=UUID=00b93c5a-1c6f-4506-835c-39661ff2b7de ro quiet splash i915.force_probe=56a5 vt.handoff=7
[ 6.949811] i915 0000:03:00.0: [drm] Incompatible option enable_guc=3 - HuC is not supported!
[ 6.950358] i915 0000:03:00.0: vgaarb: deactivate vga console
[ 6.950382] i915 0000:03:00.0: BAR 0: releasing [mem 0xa0000000-0xa0ffffff 64bit]
[ 6.950384] i915 0000:03:00.0: BAR 2: releasing [mem 0x4000000000-0x400fffffff 64bit pref]
[ 6.950421] i915 0000:03:00.0: BAR 2: no space for [mem size 0x200000000 64bit pref]
[ 6.950422] i915 0000:03:00.0: BAR 2: failed to assign [mem size 0x200000000 64bit pref]
[ 6.950424] i915 0000:03:00.0: BAR 0: assigned [mem 0xa0000000-0xa0ffffff 64bit]
[ 6.950475] i915 0000:03:00.0: [drm] Failed to resize BAR2 to 8192M (-ENOSPC)
[ 6.950477] i915 0000:03:00.0: BAR 2: assigned [mem 0x4000000000-0x400fffffff 64bit pref]
[ 6.950500] i915 0000:03:00.0: [drm] Local memory IO size: 0x0000000010000000
[ 6.950501] i915 0000:03:00.0: [drm] Local memory available: 0x000000017c800000
[ 6.950502] i915 0000:03:00.0: [drm] Using a reduced BAR size of 256MiB. Consider enabling 'Resizable BAR' or similar, if available in the BIOS.
[ 6.965145] i915 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 6.969685] i915 0000:03:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_06.bin (v2.6)
[ 6.972580] i915 0000:03:00.0: [drm] GuC firmware i915/dg2_guc_70.1.2.bin version 70.1
[ 6.982863] i915 0000:03:00.0: [drm] GuC submission enabled
[ 6.982864] i915 0000:03:00.0: [drm] GuC SLPC enabled
[ 6.983219] i915 0000:03:00.0: [drm] GuC RC: enabled
[ 6.998933] i915 0000:03:00.0: [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 7.001015] [drm] Initialized i915 1.6.0 20201103 for 0000:03:00.0 on minor 0
[ 7.001140] snd_hda_intel 0000:04:00.0: bound 0000:03:00.0 (ops i915_audio_component_bind_ops [i915])
[ 7.027185] fbcon: i915drmfb (fb0) is primary device
[ 7.092757] i915 0000:03:00.0: [drm] fb0: i915drmfb frame buffer device
[ 7.136250] snd_hda_codec_hdmi hdaudioC0D2: No i915 binding for Intel HDMI/DP codec
$ ls -la /usr/lib/firmware/i915/dg2_*
-rw-r--r-- 1 root root 22416 Aug 31 04:11 /usr/lib/firmware/i915/dg2_dmc_ver2_06.bin
-rw-r--r-- 1 root root 152588 Sep 19 09:46 /usr/lib/firmware/i915/dg2_dmc_ver2_07.bin
-rw-r--r-- 1 root root 365568 Aug 31 04:11 /usr/lib/firmware/i915/dg2_guc_70.1.2.bin
-rw-r--r-- 1 root root 2455519 Sep 19 09:47 /usr/lib/firmware/i915/dg2_guc_70.4.1.bin
but not sure how to make them to be loaded by current i915 driver
Specific kernel versions have always loaded a specific FW version, and possibly supported loading some older version(s) for backwards compatibility [1] if the one they want is missing. They do not load newer FW versions, as they do not know about them (what API changes they have, etc.). For that you would need a newer i915 module.
[1] Relevant bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6895
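One way to see this on a given system is to compare the firmware names the installed i915 module declares with the files present on disk. The sketch below is only an illustration: it assumes the text passed in comes from `modinfo -F firmware i915`, and the helper names `firmware_names` and `unusable_on_disk` are made up for this example.

```python
def firmware_names(modinfo_output, prefix="i915/dg2_"):
    """Return the firmware paths the module declares that start with `prefix`.

    `modinfo_output` is assumed to be the text printed by
    `modinfo -F firmware i915` (one firmware path per line).
    """
    names = [line.strip() for line in modinfo_output.splitlines()]
    return sorted(n for n in names if n.startswith(prefix))


def unusable_on_disk(modinfo_output, files_on_disk, prefix="i915/dg2_"):
    """Return firmware files on disk that the module does not declare.

    `files_on_disk` holds paths relative to the firmware directory,
    e.g. "i915/dg2_guc_70.4.1.bin".
    """
    known = set(firmware_names(modinfo_output, prefix))
    return sorted(
        f for f in files_on_disk if f.startswith(prefix) and f not in known
    )
```

Files reported by `unusable_on_disk` are candidates for "present but never requested by this kernel", which would match the dmesg output above where the older DMC and GuC files are still the ones loaded.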
Same issue with https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.0-rc6/amd64 kernel. Its i915 still uses old firmware btw:
[ 6.999994] i915 0000:03:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_06.bin (v2.6)
[ 7.003732] i915 0000:03:00.0: [drm] GuC firmware i915/dg2_guc_70.1.2.bin version 70.1
I'm not yet morally ready to build i915 myself. I still hope to get a DG2-functional version prebuilt from https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.0-rc6/amd64 in the not-so-distant future.
Same issue with the https://kernel.ubuntu.com/~kernel-ppa/mainline/v6.0-rc7/amd64 kernel, which still loads the old firmware. When I moved to the https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-next/2022-10-01/amd64 and https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/2022-10-01/amd64 kernels, I saw that the new GuC firmware is loaded, but the original issue still persists:
sap@sap:~$ sudo dmesg | grep -i i915
[sudo] password for sap:
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.0.0-060000rc2drmintelnext20221001-generic root=UUID=00b93c5a-1c6f-4506-835c-39661ff2b7de ro quiet splash i915.force_probe=56a5 vt.handoff=7
[ 0.331090] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.0.0-060000rc2drmintelnext20221001-generic root=UUID=00b93c5a-1c6f-4506-835c-39661ff2b7de ro quiet splash i915.force_probe=56a5 vt.handoff=7
[ 7.127023] i915 0000:03:00.0: [drm] Incompatible option enable_guc=3 - HuC is not supported!
[ 7.127999] i915 0000:03:00.0: vgaarb: deactivate vga console
[ 7.128043] i915 0000:03:00.0: BAR 0: releasing [mem 0xa0000000-0xa0ffffff 64bit]
[ 7.128052] i915 0000:03:00.0: BAR 2: releasing [mem 0x4000000000-0x400fffffff 64bit pref]
[ 7.128166] i915 0000:03:00.0: BAR 2: no space for [mem size 0x200000000 64bit pref]
[ 7.128172] i915 0000:03:00.0: BAR 2: failed to assign [mem size 0x200000000 64bit pref]
[ 7.128179] i915 0000:03:00.0: BAR 0: assigned [mem 0xa0000000-0xa0ffffff 64bit]
[ 7.128339] i915 0000:03:00.0: [drm] Failed to resize BAR2 to 8192M (-ENOSPC)
[ 7.128349] i915 0000:03:00.0: BAR 2: assigned [mem 0x4000000000-0x400fffffff 64bit pref]
[ 7.128417] i915 0000:03:00.0: [drm] Local memory IO size: 0x0000000010000000
[ 7.128424] i915 0000:03:00.0: [drm] Local memory available: 0x000000017c800000
[ 7.128428] i915 0000:03:00.0: [drm] Using a reduced BAR size of 256MiB. Consider enabling 'Resizable BAR' or similar, if available in the BIOS.
[ 7.148546] snd_hda_codec_hdmi hdaudioC0D2: No i915 binding for Intel HDMI/DP codec
[ 7.161530] i915 0000:03:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 7.166043] i915 0000:03:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_07.bin (v2.7)
[ 7.166706] i915 0000:03:00.0: [drm] GuC error state capture buffer maybe too small: 2097152 < 3147156 (min = 1049052)
[ 7.169335] i915 0000:03:00.0: [drm] GuC firmware i915/dg2_guc_70.4.1.bin version 70.4
[ 7.181260] i915 0000:03:00.0: [drm] GuC submission enabled
[ 7.181262] i915 0000:03:00.0: [drm] GuC SLPC enabled
[ 7.181621] i915 0000:03:00.0: [drm] GuC RC: enabled
[ 7.199319] i915 0000:03:00.0: [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
[ 7.209966] [drm] Initialized i915 1.6.0 20201103 for 0000:03:00.0 on minor 0
[ 7.210168] snd_hda_intel 0000:04:00.0: bound 0000:03:00.0 (ops i915_audio_component_bind_ops [i915])
[ 7.236070] fbcon: i915drmfb (fb0) is primary device
[ 7.313730] i915 0000:03:00.0: [drm] fb0: i915drmfb frame buffer device
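When comparing several kernels like this, the relevant facts can be pulled out of the log automatically. The following is a rough sketch that only relies on the message wording visible in the dmesg output above; other kernel versions may phrase these lines differently, and the function name `summarize_i915_log` is made up for this example.

```python
import re

# Patterns copied from the wording of the dmesg lines quoted in this thread.
DMC_RE = re.compile(r"Finished loading DMC firmware (\S+)")
GUC_RE = re.compile(r"GuC firmware (\S+) version (\S+)")


def summarize_i915_log(dmesg_text):
    """Extract loaded firmware files and the reduced-BAR hint from dmesg text."""
    summary = {"dmc": None, "guc": None, "reduced_bar": False}
    for line in dmesg_text.splitlines():
        match = DMC_RE.search(line)
        if match:
            summary["dmc"] = match.group(1)
        match = GUC_RE.search(line)
        if match:
            summary["guc"] = match.group(1)
        if "Using a reduced BAR size" in line:
            summary["reduced_bar"] = True
    return summary
```

Feeding it the output of `sudo dmesg | grep -i i915` from each kernel gives a compact record of which DMC and GuC files were actually loaded and whether Resizable BAR was unavailable.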
For reference, I use the ghcr.io/intel/llvm/sycl_ubuntu2004_nightly docker container, which currently has this runtime installed:
root@d692967a0182:~/sycl# dpkg --list | grep intel
ii intel-igc-cm 1.0.119 amd64 The Intel(R) C for Metal compiler is a open source compiler that implements C for Metal programming language. C for Metal is a new GPU kernel programming language for Intel HD Graphics.
ii intel-igc-core 1.0.12149.1 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-igc-media 1.0.12149.1 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-igc-opencl 1.0.12149.1 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-igc-opencl-devel 1.0.12149.1 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-level-zero-gpu 1.3.24278 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii intel-media-va-driver:amd64 20.1.1+dfsg1-1 amd64 VAAPI driver for the Intel GEN8+ Graphics family
ii intel-opencl-icd 22.38.24278 amd64 Intel graphics compute runtime for OpenCL
ii libdrm-intel1:amd64 2.4.107-8ubuntu1~20.04.2 amd64 Userspace interface to intel-specific kernel DRM services -- runtime
I have similar problems on Fedora 36, with the Fedora-built 6.0.0-54 kernel, but using OpenCL.
clQuery runs perfectly, but clpeak just hangs. I was not able to build igc from sources due to this issue: https://github.com/intel/intel-graphics-compiler/issues/259 . However, NEO was built from source, as you can see from version 22.41.0.
[xxx@yyy build]$ ./clpeak -p 0 -d 0
Platform: Intel(R) OpenCL HD Graphics
Device: Intel(R) Graphics [0x5690]
Driver version : 22.41.0 (Linux x64)
Compute units : 512
Clock frequency : 2050 MHz
Global memory bandwidth (GBPS)
When using the ADL-P GPU, everything works fine:
[xxx@yyy build]$ ./clpeak -p 1 -d 0
Platform: Intel(R) OpenCL HD Graphics
Device: Intel(R) Graphics [0x46a6]
Driver version : 22.41.0 (Linux x64)
Compute units : 96
Clock frequency : 1400 MHz
Global memory bandwidth (GBPS)
float : 44.05
float2 : 43.89
float4 : 45.96
float8 : 45.61
float16 : 46.42
Single-precision compute (GFLOPS)
float : 2097.01
float2 : 2072.03
float4 : 2087.24
float8 : 1325.47
float16 : 1316.19
Half-precision compute (GFLOPS)
half : 4066.52
half2 : 4029.82
half4 : 4067.22
half8 : 4042.22
half16 : 3631.05
No double precision support! Skipped
Integer compute (GIOPS)
int : 701.07
int2 : 477.28
int4 : 455.85
int8 : 433.37
int16 : 526.13
Integer compute Fast 24bit (GIOPS)
int : 696.82
int2 : 477.27
int4 : 455.84
int8 : 433.37
int16 : 526.18
Transfer bandwidth (GBPS)
enqueueWriteBuffer : 21.41
enqueueReadBuffer : 22.32
enqueueWriteBuffer non-blocking : 21.69
enqueueReadBuffer non-blocking : 21.96
enqueueMapBuffer(for read) : 1651907.50
memcpy from mapped ptr : 21.87
enqueueUnmap(after write) : 21474796.00
memcpy to mapped ptr : 21.44
Kernel launch latency : 53.55 us
I have a similar problem to @alheinecke. Getting device information (e.g. with clinfo) works perfectly. However, it hangs when an instruction is submitted to the command queue.
I am using Fedora 36 and kernel 6.0.0-0.rc6. The libraries I am using are IGC: ea9ac563 gmmlib 4552654e neo: 06817090
IGC and dependent libraries were also built from source using build_ubuntu.md as a reference.
When I run clpeak, it hangs at the point shown below.
Platform: Intel(R) OpenCL HD Graphics
Device: Intel(R) Graphics [0x56a5].
Driver version : 22.41.0 (Linux x64)
Compute units : 128
Clock frequency : 2450 MHz
Global memory bandwidth (GBPS)
The backtrace is below:
#0 0x00007fcd36c7b53b in sched_yield () at ../sysdeps/unix/syscall-template.S:120
#1 0x00007fcd36619b86 in __gthread_yield () at /usr/include/c++/12/x86_64-redhat-linux/bits/gthr-default.h:693
#2 std::this_thread::yield () at /usr/include/c++/12/bits/std_thread.h:322
#3 NEO::WaitUtils::waitFunctionWithPredicate<unsigned int>(unsigned int const volatile*, unsigned int, std::function<bool (unsigned int, unsigned int)>) (predicate=..., expectedValue=1, pollAddress=0x7fcd370c4000) at /tmp/work/neo/shared/source/utilities/wait_util.h:32
#4 NEO::WaitUtils::waitFunction (expectedValue=1, pollAddress=0x7fcd370c4000) at /tmp/work/neo/shared/source/utilities/wait_util.h:37
#5 NEO::CommandStreamReceiver::baseWaitFunction (this=0x1a9ac10, pollAddress=0x7fcd370c4000, params=..., taskCountToWait=1) at /tmp/work/neo/shared/source/command_stream/command_stream_receiver.cpp:387
#6 0x00007fcd3653b6aa in NEO::CommandStreamReceiverHw<NEO::XeHpgCoreFamily>::waitForTaskCountWithKmdNotifyFallback (this=0x1a9ac10, taskCountToWait=1, flushStampToWait=<optimized out>, useQuickKmdSleep=<optimized out>, throttle=NEO::MEDIUM)
at /tmp/work/neo/shared/source/command_stream/command_stream_receiver_hw_base.inl:905
#7 0x00007fcd360f8606 in NEO::CommandQueue::waitUntilComplete (this=this@entry=0x1c258f0, gpgpuTaskCountToWait=0, copyEnginesToWait=..., flushStampToWait=0, useQuickKmdSleep=useQuickKmdSleep@entry=false, cleanTemporaryAllocationList=true, skipWait=false)
at /tmp/work/neo/opencl/source/command_queue/command_queue.cpp:418
#8 0x00007fcd360f972b in NEO::CommandQueue::waitForAllEngines (this=this@entry=0x1c258f0, blockedQueue=<optimized out>, printfHandler=printfHandler@entry=0x0, cleanTemporaryAllocationsList=cleanTemporaryAllocationsList@entry=true) at /tmp/work/neo/opencl/source/command_queue/command_queue.cpp:1222
#9 0x00007fcd363c6076 in NEO::CommandQueue::waitForAllEngines (printfHandler=0x0, blockedQueue=<optimized out>, this=0x1c258f0) at /tmp/work/neo/opencl/source/command_queue/command_queue.h:217
#10 NEO::CommandQueueHw<NEO::XeHpgCoreFamily>::enqueueBlit<4596u> (this=this@entry=0x1c258f0, multiDispatchInfo=..., numEventsInWaitList=numEventsInWaitList@entry=0, eventWaitList=eventWaitList@entry=0x0, event=<optimized out>, blocking=<optimized out>, bcsCsr=...)
at /tmp/work/neo/opencl/source/command_queue/enqueue_common.h:1318
#11 0x00007fcd363ccbfa in NEO::CommandQueueHw<NEO::XeHpgCoreFamily>::dispatchBcsOrGpgpuEnqueue<4596u, 2ul> (this=this@entry=0x1c258f0, dispatchInfo=..., surfaces=..., builtInOperation=builtInOperation@entry=1, numEventsInWaitList=numEventsInWaitList@entry=0, eventWaitList=eventWaitList@entry=0x0, event=<optimized out>,
blocking=true, csr=...) at /tmp/work/neo/opencl/source/command_queue/enqueue_common.h:1338
#12 0x00007fcd363cd2f2 in NEO::CommandQueueHw<NEO::XeHpgCoreFamily>::enqueueWriteBuffer (this=0x1c258f0, buffer=0x1ba8be0, blockingWrite=1, offset=0, size=<optimized out>, ptr=<optimized out>, mapAllocation=<optimized out>, numEventsInWaitList=0, eventWaitList=0x0, event=0x0)
at /tmp/work/neo/opencl/source/command_queue/enqueue_write_buffer.h:104
#13 0x00007fcd360c89e7 in clEnqueueWriteBuffer (commandQueue=<optimized out>, buffer=<optimized out>, blockingWrite=<optimized out>, offset=<optimized out>, cb=<optimized out>, ptr=<optimized out>, numEventsInWaitList=<optimized out>, eventWaitList=<optimized out>, event=<optimized out>)
at /tmp/work/neo/opencl/source/api/api.cpp:2493
#14 0x00007fcd370e17e0 in clEnqueueWriteBuffer (command_queue=0x1c25900, buffer=0x1ba8bf0, blocking_write=1, offset=0, cb=1515978752, ptr=<optimized out>, num_events_in_wait_list=0, event_wait_list=0x0, event=0x0) at ocl_icd_loader_gen.c:2312
#15 0x0000000000410c47 in clPeak::runGlobalBandwidthTest(cl::CommandQueue&, cl::Program&, device_info_t&) ()
#16 0x000000000040ab25 in clPeak::runAll() ()
#17 0x0000000000407b5d in main ()
It seems there is no response from the device.
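Since the process sits in a user-space polling loop (the `sched_yield` frames above) rather than returning an error, a reproduction script needs its own watchdog. Below is a minimal sketch using only the Python standard library; the helper name `run_with_timeout` and the `SLEEPER` stand-in command are made up for this example, and it assumes the hung process can still be killed, which may not hold if a process is blocked inside the kernel.

```python
import subprocess
import sys

# Stand-in for a hanging benchmark so the sketch can be tried without a GPU.
# On an affected system this could be replaced by e.g. ["./clpeak", "-p", "0", "-d", "0"].
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def run_with_timeout(cmd, timeout_s):
    """Run `cmd`; return ("ok", returncode), or ("hung", None) on timeout."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return ("hung", None)
    return ("ok", proc.returncode)
```

For example, `run_with_timeout(SLEEPER, 1)` reports the stand-in as hung after about one second, which makes it possible to script the comparison across kernels and firmware versions without blocking forever.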
Discrete GPU support is not yet complete in upstream kernel: https://www.kernel.org/doc/html/latest/gpu/rfc/index.html
If you're using (or can use) one of the Intel GPU DKMS supported distro / kernel versions, you could try the DKMS instead:
Yeah... but I saw multiple reports that it should work, and with force_probe it at least loads. I tried that RHEL 8.5 recipe earlier this week; it works somehow, but the system starts to hang, and even clinfo shows a Floating Point Exception when exiting... so this doesn't seem to be stable either... granted, it works somewhat more :-)
Do you know if Linux 6.1.0 will have full support?
6.1 hasn't queued a commit to drop force_probe, so I don't think it'll support DG2 OOTB
I don't mind exporting force_probe for now, I mind that with even force_probe nothing works
I don't mind exporting force_probe for now, I mind that with even force_probe nothing works
Needing to use force_probe means that HW is only partially supported by given (upstream) kernel. For now, I would be more worried about things not working properly with a backport kernel module, when it exposes the device without forcing.
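To check whether a running system relies on force-probing, the kernel command line can be inspected. A small sketch, assuming the option is passed as `i915.force_probe=<id>[,<id>...]` as in the dmesg output earlier in this thread; the function name `force_probe_ids` is made up for this example, and it does not cover the build-time `CONFIG_DRM_I915_FORCE_PROBE` setting.

```python
def force_probe_ids(cmdline):
    """Return the PCI IDs given via i915.force_probe= on a kernel command line.

    `cmdline` is the text of /proc/cmdline (or the "Command line:" part
    of a dmesg line). An empty list means the option was not passed.
    """
    ids = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == "i915.force_probe":
            ids.extend(v.lower() for v in value.split(",") if v)
    return ids
```

On a live system the input could be read with `open("/proc/cmdline").read()`. A non-empty result indicates the device is only exposed because probing was forced.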
I tried that RHEL 8.5 recipe earlier this week; it works somehow, but the system starts to hang, and even clinfo shows a Floating Point Exception when exiting... so this doesn't seem to be stable either...
Did you build the kernel module from the backports git repo [1], or install the ready-built DKMS kernel package from https://dgpu-docs.intel.com/installation-guides/index.html ?
I'm not sure whether each of the backport kernel modules (for different distro kernel versions[2]) is tested as much, but at least ones for newer kernel versions would be closer to what eventually gets upstreamed (i.e. potentially something the kernel driver developers work more with), and what I would imagine compute-runtime to be tested with.
Btw. @JablonskiMateusz, which of the backport kernel module versions [2] is most tested with compute-runtime?
And is the GPU module backport enough, or would the other backport module(s) [3] also be needed for OpenCL?
(E.g. the compute-runtime Level-Zero Sysman parts also need the telemetry module to provide all GPU metrics.)
[1] https://github.com/intel-gpu/intel-gpu-i915-backports
[2] Kernel versions supported currently by the backport modules are:
[2] https://github.com/intel-gpu/intel-gpu-i915-backports#dependencies
I followed the steps here: https://dgpu-docs.intel.com/installation-guides/index.html and I'm using CentOS 8.6... However, I just checked I'm running "4.18.0-372.26.1.el8_6.x86_64" and back-ports only mentions "4.18.0-372.26.1.el8_6.x86_64"...
@eero-t Do you know if https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly kernels are supposed to work for DG2 as well?
@eero-t I followed the steps in that document and it works. Thank you very much.
@eero-t Do you know if https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly kernels are supposed to work for DG2 as well?
That's closer to what'll be in 6.2, so it's better but still needs force_probe. Note that there's no 'nightly' branch upstream anymore, this is an alias to drm-tip.
I followed the steps here: https://dgpu-docs.intel.com/installation-guides/index.html and I'm using CentOS 8.6... However, I just checked I'm running "4.18.0-372.26.1.el8_6.x86_64" and back-ports only mentions "4.18.0-372.26.1.el8_6.x86_64"...
Um. Those numbers match?
Anyway, at least as long as the kernel major & minor numbers match, it should be fine. Smaller version differences may also be fine, but that depends completely on what changes the kernel made. The DKMS package (at least on Debian) builds the kernel module from sources against the kernel headers whenever the DKMS package is installed or a new kernel is installed, so you would at least notice an internal kernel API mismatch (from the DKMS module build failing), but not semantic changes.
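The version comparison described above can be written down as a small check. This is only a sketch of the rule of thumb in this comment, not something the backports project provides; the function names and the three result labels are made up for this example.

```python
import re


def major_minor(release):
    """Return (major, minor) from a kernel release string such as `uname -r` prints."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def compare_release(running, supported):
    """Classify how closely the running kernel matches a supported one."""
    if running == supported:
        return "exact"
    if major_minor(running) == major_minor(supported):
        return "same major.minor"
    return "different"


# The two strings quoted in the thread are identical, so this is an exact match.
print(compare_release("4.18.0-372.26.1.el8_6.x86_64",
                      "4.18.0-372.26.1.el8_6.x86_64"))
```

A "same major.minor" result only means the DKMS build is likely to succeed; as noted above, it says nothing about semantic changes inside the kernel.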
@eero-t Do you know if https://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly kernels are supposed to work for DG2 as well?
I would assume backports DKMS to work better, at least until drm-tip (which regularly rebases to upstream kernel) does not need force-probing any more.
The https://dgpu-docs.intel.com/installation-guides/ubuntu/ubuntu-jammy-arc.html#step-1-add-package-repository BKM worked for me on Ubuntu 22.04. Thank you! I wonder why I did not find it earlier (maybe it was only recently updated to include Ubuntu 22.04).
Kernel packages + instructions were added to the dgpu site only a few weeks ago; before that there were only user space driver packages. The kernel driver backports Git repository has been there a bit longer.
Note: At least in case of media driver, there's a build time option to select between "production" (non-upstream) i915 GPU driver kernel uAPI, and upstream i915 uAPI, and those are not fully compatible (upstreaming often requires API changes before being accepted, and as I mentioned above, upstreaming of kernel dGPU support is not complete yet). I.e. it's better to install kernel and user-space GPU driver packages from the same repo, to make sure they are compatible.
The issue
I am getting the following error when running SYCL programs on DG2. The same program works perfectly on the 12th gen ADL-P integrated GPU.
I am wondering if some steps in my kernel config, dependency installation, or the CMake file are wrong. All the details are provided below.
System info
Hardware
I am on a laptop with an i7-12700H CPU with iGPU (ID: 0x46a6) and an Arc A370M discrete GPU (ID: 0x5693). sycl-ls works:
Software
Ubuntu 22.04 with custom compiled Linux kernel 6.0-rc3. The kernel is compiled with CONFIG_DRM_I915_FORCE_PROBE="5693". I have also tried the latest kernel from drm-intel-next, and it has the same issue.
I have installed all the latest .deb packages from compute-runtime releases and from the GPGPU driver page.
Code and CMake
Here's my demo code:
Here's the CMake file
Here's the full commands and outputs:
As you can see, the program works fine on the integrated GPU, but not on the DG2 discrete GPU.