intel / compute-runtime

Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
MIT License

Xe KMD doesn't work on Linux #642

Closed. kode54 closed this issue 4 months ago

kode54 commented 1 year ago

Using the Xe KMD with an updated xe_drm.h (synced between the latest linux-drm-xe-next kernel and this package), OpenCL tasks cause my A770 LE 16GB to time out, but only for the OpenCL app. As a result, Blender livelocks a single core until the process is terminated. The journal shows notices like this:

Apr 25 23:55:49 mrgency kernel: xe 0000:28:00.0: [drm] Ioctl argument check failed at drivers/gpu/drm/xe/xe_exec.c:178: engine->flags & ENGINE_FLAG_BANNED
Apr 25 23:55:49 mrgency kernel: xe 0000:28:00.0: [drm] Ioctl argument check failed at drivers/gpu/drm/xe/xe_exec.c:178: engine->flags & ENGINE_FLAG_BANNED

Kernel commit: 3cf57993a3f1, plus two patches: https://gist.github.com/kode54/156da2ae09c9c8a591d0cdde7f77f511

Mesa 23.2.0_devel.170325.5c287290d88 built with mesa-tkg-git, with PR 20418 and 22652 applied.

This package is at 22.43.24558.r1615.g16db7cc890, built from the AUR package intel-compute-runtime-git with makepkg -o, then swapping in the xe_drm.h from above, and then makepkg -efi.

JablonskiMateusz commented 1 year ago

What is the NEO driver version there? Support for the Xe KMD was added in 23.05.25593.11.

kode54 commented 1 year ago

Do I have to build with a special branch? I was wondering why intel-compute-runtime-git was still producing version 22 binaries. I don't even know if I need a completely different set of dependencies to get it working.

Edit: I built the master branch, why is that still version 22?

kode54 commented 1 year ago

Rebuilt against 23.09.25812.15.r0.g2e4d3998e2-1. It still hangs at 100% on one core when it attempts to actually execute anything.
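
For reference, the two package versions reported in this thread can be compared against the 23.05.25593.11 threshold quoted above. This is only a sketch: the helper names `numeric_prefix` and `has_xe_kmd_support` are hypothetical, and it assumes that the leading dot-separated numeric fields of the AUR version string are comparable to the release number. As noted earlier in the thread, a build of master reported version 22, so the version string may lag the actual source.

```python
# Minimum release with Xe KMD support, as stated in this thread.
XE_KMD_MIN = (23, 5, 25593, 11)


def numeric_prefix(version):
    """Return the leading all-digit, dot-separated fields as a tuple of ints.

    Parsing stops at the first non-numeric field, such as the 'r1615' and
    'g16db7cc890' suffixes that appear in the AUR package versions.
    """
    parts = []
    for field in version.split("."):
        if not field.isdigit():
            break
        parts.append(int(field))
    return tuple(parts)


def has_xe_kmd_support(version):
    """Compare the numeric prefix against the quoted threshold (assumption)."""
    return numeric_prefix(version) >= XE_KMD_MIN


for v in ("22.43.24558.r1615.g16db7cc890", "23.09.25812.15.r0.g2e4d3998e2-1"):
    print(v, has_xe_kmd_support(v))
```

By this comparison the first build predates the threshold and the second does not, which matches the order of events in the thread.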

JablonskiMateusz commented 1 year ago

Have you enabled the CMake flag for Xe driver support, NEO_ENABLE_XE_DRM_DETECTION=1?

kode54 commented 1 year ago

I have that flag set, otherwise there would not be a device to use. Blender lists my oneAPI devices, but hangs spinning a core when it tries to upload the kernels. It works fine on the i915 driver.

kode54 commented 1 year ago

Yeah, and dmesg emits the following every time that happens:

May 07 20:23:56 mrgency kernel: xe 0000:28:00.0: [drm] Engine reset: guc_id=59
May 07 20:23:56 mrgency kernel: xe 0000:28:00.0: [drm] Timedout job: seqno=4294967169, guc_id=59, flags=0x8

That's -127 cast to u32, which is the initial DMA fence seqno for the Xe driver right now.
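
The arithmetic behind that observation can be checked with a short sketch. The regular expression and the helper names `as_i32` and `as_u32` are illustrative assumptions, not part of the driver or of any tool. The log line is copied from the dmesg output above.

```python
import re

# One of the dmesg lines quoted above.
LINE = ("xe 0000:28:00.0: [drm] Timedout job: "
        "seqno=4294967169, guc_id=59, flags=0x8")


def as_u32(value):
    """Wrap an integer into the unsigned 32-bit range."""
    return value & 0xFFFFFFFF


def as_i32(value):
    """Reinterpret the low 32 bits of an integer as a signed value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Hypothetical pattern for this one message format.
match = re.search(r"seqno=(\d+), guc_id=(\d+), flags=(0x[0-9a-fA-F]+)", LINE)
seqno = int(match.group(1))

print(hex(seqno))     # 0xffffff81
print(as_i32(seqno))  # -127
print(as_u32(-127))   # 4294967169
```

Since 4294967169 equals 2**32 - 127, the timed-out job carries the very first seqno value, which is consistent with the first submitted job never completing.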

JablonskiMateusz commented 1 year ago

Is the issue still visible?

eero-t commented 4 months ago

@kode54 See the following for a working Xe KMD setup: https://github.com/intel/compute-runtime/issues/696

kode54 commented 4 months ago

I no longer own an Intel GPU to test with, sorry.

eero-t commented 4 months ago

@kode54 In that case, could you close this (I cannot)?