intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
Apache License 2.0

Support for Ubuntu 22.04 on Meteor Lake is broken #11605

Open tristan-k opened 3 months ago

tristan-k commented 3 months ago

I recently bought a NUC 14 with a Core Ultra 5 125H. The documented steps for getting IPEX-LLM running with Intel oneAPI/SYCL on Ubuntu 22.04 do not work on Meteor Lake. Installing the DKMS module via intel-i915-dkms is not an option, because the system freezes and locks up afterwards. Kernel 6.5, which is required for IPEX-LLM, has no official Meteor Lake support, although one can force-load the new Xe driver by adding i915.force_probe=!7d55 xe.force_probe=7d55 to the kernel parameters. Intel has to either update its documentation or extend its support to Ubuntu 24.04.
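
To double-check which driver the kernel parameters above hand the iGPU to, the command line can be parsed mechanically. The following is only a minimal Python sketch: `parse_force_probe` is a hypothetical helper name, and it assumes the usual `<driver>.force_probe=<id>[,<id>...]` syntax in which a leading `!` blocks a device ID for that driver. On a running system the input string would come from /proc/cmdline; here a sample string built from the parameters quoted above is used.

```python
def parse_force_probe(cmdline):
    """Collect <driver>.force_probe=... entries from a kernel command line.

    Returns {driver: {"claim": [ids], "block": [ids]}} with lower-case IDs.
    Hypothetical helper, not part of IPEX-LLM or any Intel tool.
    """
    result = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if not sep or not key.endswith(".force_probe"):
            continue
        driver = key[: -len(".force_probe")]
        entry = result.setdefault(driver, {"claim": [], "block": []})
        # Values are sometimes quoted, e.g. xe.force_probe='7d55'.
        for dev_id in value.strip("'\"").split(","):
            dev_id = dev_id.strip().lower()
            if not dev_id:
                continue
            if dev_id.startswith("!"):
                entry["block"].append(dev_id[1:])  # driver must not bind this ID
            else:
                entry["claim"].append(dev_id)      # driver is forced to probe this ID
    return result


if __name__ == "__main__":
    # Sample string assembled from the parameters mentioned in this comment.
    sample = "quiet splash i915.force_probe=!7d55 xe.force_probe=7d55"
    for driver, ids in parse_force_probe(sample).items():
        print(driver, ids)
```

With the parameters from this comment, the sketch reports 7d55 as blocked for i915 and claimed by xe.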

Some more related links:

JinBridger commented 3 months ago

Hi, @tristan-k,

We are now trying to reproduce this issue on a device with similar specifications. We will let you know as soon as there is any progress.

Please feel free to ask if there are any further problems : )

JinBridger commented 3 months ago

Hi, @tristan-k,

Could you please try the instructions in the following link to see if they work? https://github.com/intel-analytics/ipex-llm/issues/11568#issuecomment-2227157685

Please feel free to ask if there are any further problems : )

tristan-k commented 3 months ago

@JinBridger I already did that. It did not make any difference, as previously mentioned in another comment of mine.

JinBridger commented 3 months ago

> @JinBridger I already did that. It did not make any difference, as previously mentioned in another comment of mine.

Hi, @tristan-k,

Could you please try skipping the intel-i915-dkms installation and see whether IPEX-LLM works?

Please feel free to ask if there is any further problem : )

tristan-k commented 3 months ago

@JinBridger It does not; I already tried. The Intel(R) Arc(TM) Graphics device is not available to ollama. I also tried Docker: sycl-ls inside the container recognised the iGPU, but only by its PCI Device ID 0x7d55, and I subsequently wasn't able to use it.

qiuxin2012 commented 3 months ago

@tristan-k Can you show us the result of sudo dmesg | grep i915 and hwinfo --display? Ours:

(xin-llm) arda@xiaoxin04-ubuntu:~/xin/ipex-llm/python/llm/example/GPU/HuggingFace/LLM/glm4$ sudo dmesg | grep i915
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.0-35-generic root=UUID=84fbf89b-31fa-4683-b0ee-fcfa74deb21f ro quiet splash i915.force_probe=7d55 vt.handoff=7
[    0.048637] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.5.0-35-generic root=UUID=84fbf89b-31fa-4683-b0ee-fcfa74deb21f ro quiet splash i915.force_probe=7d55 vt.handoff=7
[    2.783516] i915 0000:00:02.0: Force probing unsupported Device ID 7d55, tainting kernel
[    2.783838] i915 0000:00:02.0: [drm] GT0: Incompatible option enable_guc=3 - HuC is not supported!
[    2.784526] i915 0000:00:02.0: [drm] No GSC FW selected, disabling GSC CS and media C6
[    2.784801] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    2.785028] i915 0000:00:02.0: vgaarb: deactivate vga console
[    2.785060] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    2.806186] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    2.815765] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/mtl_dmc.bin (v2.13)
[    3.682087] i915 0000:00:02.0: [drm] [ENCODER:244:DDI B/PHY B] failed to retrieve link info, disabling eDP
[    3.697690] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.8.0
[    3.709179] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[    3.709183] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[    3.709374] i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
[    3.713570] i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.bin version 70.8.0
[    3.713574] i915 0000:00:02.0: [drm] GT1: HuC firmware i915/mtl_huc_gsc.bin version 8.5.1
[    3.738090] i915 0000:00:02.0: [drm] GT1: HuC: authenticated for clear media
[    3.738418] i915 0000:00:02.0: [drm] GT1: GUC: submission enabled
[    3.738419] i915 0000:00:02.0: [drm] GT1: GUC: SLPC enabled
[    3.738486] i915 0000:00:02.0: [drm] GT1: GUC: RC enabled
[    4.852680] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    4.855370] sof-audio-pci-intel-mtl 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    4.864343] fbcon: i915drmfb (fb0) is primary device
[    5.984036] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device

(xin-llm) arda@xiaoxin04-ubuntu:~/xin/ipex-llm/python/llm/example/GPU/HuggingFace/LLM/glm4$ hwinfo --display
26: PCI 02.0: 0300 VGA compatible controller (VGA)              
  [Created at pci.386]
  Unique ID: _Znp.D2pOT0W7IB1
  SysFS ID: /devices/pci0000:00/0000:00:02.0
  SysFS BusID: 0000:00:02.0
  Hardware Class: graphics card
  Model: "Intel VGA compatible controller"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x7d55 
  SubVendor: pci 0x17aa "Lenovo"
  SubDevice: pci 0x3cc9 
  Revision: 0x08
  Driver: "i915"
  Driver Modules: "i915"
  Memory Range: 0x408c000000-0x408cffffff (ro,non-prefetchable)
  Memory Range: 0x4000000000-0x400fffffff (ro,non-prefetchable)
  Memory Range: 0x000c0000-0x000dffff (rw,non-prefetchable,disabled)
  IRQ: 184 (4812860 events)
  Module Alias: "pci:v00008086d00007D55sv000017AAsd00003CC9bc03sc00i00"
  Driver Info #0:
    Driver Status: i915 is active
    Driver Activation Cmd: "modprobe i915"
  Config Status: cfg=new, avail=yes, need=no, active=unknown

Primary display adapter: #26
tristan-k commented 1 month ago

Here is the output without the intel-i915-dkms module:

$ sudo dmesg | grep i915
[    4.104785] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    4.129376] i915 0000:00:02.0: vgaarb: deactivate vga console
[    4.129416] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    4.143206] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    4.152518] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/mtl_dmc.bin (v2.17)
[    4.158828] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin (70.12.1) is recommended, but only i915/mtl_guc_70.bin (70.8.0) was found
[    4.158835] i915 0000:00:02.0: [drm] GT0: Consider updating your linux-firmware pkg or downloading from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915
[    4.159698] i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.bin (70.12.1) is recommended, but only i915/mtl_guc_70.bin (70.8.0) was found
[    4.159700] i915 0000:00:02.0: [drm] GT1: Consider updating your linux-firmware pkg or downloading from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915
[    4.184781] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.8.0
[    4.200853] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[    4.200858] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[    4.201057] i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
[    4.210848] mei_gsc_proxy 0000:00:16.0-0f73db04-97ab-4125-b893-e904ad0d5464: bound 0000:00:02.0 (ops i915_gsc_proxy_component_ops [i915])
[    4.211339] i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.bin version 70.8.0
[    4.211342] i915 0000:00:02.0: [drm] GT1: HuC firmware i915/mtl_huc_gsc.bin version 8.5.4
[    4.238118] i915 0000:00:02.0: [drm] GT1: HuC: authenticated for clear media
[    4.238597] i915 0000:00:02.0: [drm] GT1: GUC: submission enabled
[    4.238602] i915 0000:00:02.0: [drm] GT1: GUC: SLPC enabled
[    4.238717] i915 0000:00:02.0: [drm] GT1: GUC: RC enabled
[    4.240514] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[    4.277538] [drm] Initialized i915 1.6.0 20230929 for 0000:00:02.0 on minor 1
[    4.280572] i915 display info: display version: 14
[    4.280575] i915 display info: cursor_needs_physical: no
[    4.280577] i915 display info: has_cdclk_crawl: yes
[    4.280578] i915 display info: has_cdclk_squash: yes
[    4.280579] i915 display info: has_ddi: yes
[    4.280581] i915 display info: has_dp_mst: yes
[    4.280582] i915 display info: has_dsb: yes
[    4.280583] i915 display info: has_fpga_dbg: yes
[    4.280584] i915 display info: has_gmch: no
[    4.280585] i915 display info: has_hotplug: yes
[    4.280586] i915 display info: has_hti: no
[    4.280587] i915 display info: has_ipc: yes
[    4.280589] i915 display info: has_overlay: no
[    4.280596] i915 display info: has_psr: yes
[    4.280597] i915 display info: has_psr_hw_tracking: no
[    4.280598] i915 display info: overlay_needs_physical: no
[    4.280599] i915 display info: supports_tv: no
[    4.280601] i915 display info: has_hdcp: yes
[    4.280602] i915 display info: has_dmc: yes
[    4.280603] i915 display info: has_dsc: yes
[    4.307449] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    4.314740] fbcon: i915drmfb (fb0) is primary device
[    4.314808] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[    4.383895] i915 0000:00:02.0: [drm] GT1: Loaded GSC firmware i915/mtl_gsc_1.bin (cv1.0, r102.0.0.1655, svn 1)
[    4.404069] i915 0000:00:02.0: [drm] GT1: HuC: authenticated for all workloads
[  673.503694] i915 0000:00:02.0: Using 41-bit DMA addresses
$ sudo hwinfo --display
26: PCI 02.0: 0300 VGA compatible controller (VGA)              
  [Created at pci.386]
  Unique ID: _Znp.oQng9K+95x3
  SysFS ID: /devices/pci0000:00/0000:00:02.0
  SysFS BusID: 0000:00:02.0
  Hardware Class: graphics card
  Device Name: "Onboard - Video"
  Model: "Intel VGA compatible controller"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x7d55 
  SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
  SubDevice: pci 0x88c8 
  Revision: 0x08
  Driver: "i915"
  Driver Modules: "i915"
  Memory Range: 0x5017000000-0x5017ffffff (ro,non-prefetchable)
  Memory Range: 0x4000000000-0x400fffffff (ro,non-prefetchable)
  Memory Range: 0x000c0000-0x000dffff (rw,non-prefetchable,disabled)
  IRQ: 208 (101475 events)
  Module Alias: "pci:v00008086d00007D55sv00001043sd000088C8bc03sc00i00"
  Driver Info #0:
    Driver Status: i915 is active
    Driver Activation Cmd: "modprobe i915"
  Driver Info #1:
    Driver Status: xe is active
    Driver Activation Cmd: "modprobe xe"
  Config Status: cfg=new, avail=yes, need=no, active=unknown

Primary display adapter: #26

$ sudo inxi -G
Graphics:
  Device-1: Intel driver: i915 v: kernel
  Display: server: X.Org v: 1.22.1.1 driver: gpu: i915 note:  X driver n/a
    resolution: 3840x2160~60Hz
  OpenGL: renderer: Mesa Intel Arc Graphics (MTL)
    v: 4.6 Mesa 23.2.1-1ubuntu3.1~22.04.2
$ source /opt/intel/oneapi/setvars.sh

:: initializing oneAPI environment ...
   bash: BASH_VERSION = 5.1.16(1)-release
   args: Using "$@" for setvars.sh arguments: 
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

$ sycl-ls
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]

As soon as I install intel-i915-dkms, the system becomes unstable.

Here is the output with intel-i915-dkms installed:

$ sudo dmesg | grep i915
               use xe.force_probe='7d55' and i915.force_probe='!7d55'
[    3.613393] i915 0000:00:02.0: Using 18 cores (0-17) for kthreads
[    3.613412] i915 0000:00:02.0: [drm] GT count: 2, enabled: 2
[    3.614033] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    3.631580] i915 0000:00:02.0: vgaarb: deactivate vga console
[    3.631609] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    3.656351] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/mtl_dmc_ver2_12.bin (v2.12)
[    3.675512] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.6.8.bin version 70.6.8
[    3.688901] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[    3.688906] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[    3.689073] i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
[    3.694386] i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.6.8.bin version 70.6.8
[    3.711745] i915 0000:00:02.0: [drm] GT1: GUC: submission enabled
[    3.711750] i915 0000:00:02.0: [drm] GT1: GUC: SLPC enabled
[    3.711811] i915 0000:00:02.0: [drm] GT1: GUC: RC enabled
[    3.744675] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 1
[    3.766379] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
$ sudo hwinfo --display
26: PCI 02.0: 0300 VGA compatible controller (VGA)
  [Created at pci.386]
  Unique ID: _Znp.oQng9K+95x3
  SysFS ID: /devices/pci0000:00/0000:00:02.0
  SysFS BusID: 0000:00:02.0
  Hardware Class: graphics card
  Device Name: "Onboard - Video"
  Model: "Intel VGA compatible controller"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x7d55
  SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
  SubDevice: pci 0x88c8
  Revision: 0x08
  Driver: "i915"
  Driver Modules: "i915"
  Memory Range: 0x5017000000-0x5017ffffff (ro,non-prefetchable)
  Memory Range: 0x4000000000-0x400fffffff (ro,non-prefetchable)
  Memory Range: 0x000c0000-0x000dffff (rw,non-prefetchable,disabled)
  IRQ: 208 (71 events)
  Module Alias: "pci:v00008086d00007D55sv00001043sd000088C8bc03sc00i00"
  Driver Info #0:
    Driver Status: xe is active
    Driver Activation Cmd: "modprobe xe"
  Driver Info #1:
    Driver Status: i915 is active
    Driver Activation Cmd: "modprobe i915"
  Config Status: cfg=new, avail=yes, need=no, active=unknown

Primary display adapter: #26

$ sudo inxi -G
Graphics:
  Device-1: Intel driver: i915
    v: backported to 6.8.0-40 from (cde40e5cc005a) using backports I915_24.3.23_PSB_240419.26
  Display: server: X.org v: 1.21.1.4 with: Xwayland v: 22.1.1 driver: gpu: i915
    note:  X driver n/a tty: 214x52
  Message: GL data unavailable in console for root.
$ source /opt/intel/oneapi/setvars.sh

:: initializing oneAPI environment ...
   -bash: BASH_VERSION = 5.1.16(1)-release
   args: Using "$@" for setvars.sh arguments:
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

$ sycl-ls
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO  [24.22.29735.27]
[level_zero:gpu][level_zero:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.29735]
$ sudo xpu-smi discovery -d 0
+-----------+--------------------------------------------------------------------------------------+
| Device ID | Device Information                                                                   |
+-----------+--------------------------------------------------------------------------------------+
| 0         | Device Type: GPU                                                                     |
|           | Device Name: Intel(R) Arc(TM) Graphics                                               |
|           | PCI Device ID: 0x7d55                                                                |
|           | Vendor Name: Intel(R) Corporation                                                    |
|           | SOC UUID: 00000000-0000-0200-0000-00087d558086                                       |
|           | Serial Number: unknown                                                               |
|           | Core Clock Rate: 2200 MHz                                                            |
|           | Stepping: C0                                                                         |
|           | SKU Type: N/A                                                                        |
|           |                                                                                      |
|           | Driver Version: I915_24.3.23_PSB_240419.26                                           |
|           | Kernel Version: 6.8.0-40-generic                                                     |
|           | GFX Firmware Name: GFX                                                               |
|           | GFX Firmware Version: unknown                                                        |
|           | GFX Firmware Status: unknown                                                         |
|           |                                                                                      |
|           | PCI BDF Address: 0000:00:02.0                                                        |
|           | PCI Slot: N/A                                                                        |
|           | PCIe Generation: -1                                                                  |
|           | PCIe Max Link Width: -1                                                              |
|           |                                                                                      |
|           | Memory Physical Size: 0.00 MiB                                                       |
|           | Max Mem Alloc Size: 4095.99 MiB                                                      |
|           | ECC State: N/A                                                                       |
|           | Number of Memory Channels: N/A                                                       |
|           | Memory Bus Width: N/A                                                                |
|           | Max Hardware Contexts: 65536                                                         |
|           | Max Command Queue Priority: 0                                                        |
|           |                                                                                      |
|           | Number of EUs: 112                                                                   |
|           | Number of Tiles: 1                                                                   |
|           | Number of Slices: 1                                                                  |
|           | Number of Sub Slices per Slice: 7                                                    |
|           | Number of Threads per EU: 8                                                          |
|           | Physical EU SIMD Width: 8                                                            |
|           | Number of Media Engines: 2                                                           |
|           | Number of Media Enhancement Engines: 1                                               |
|           |                                                                                      |
|           | Number of Xe Link ports: N/A                                                         |
|           | Max Tx/Rx Speed per Xe Link port: N/A                                                |
|           | Number of Lanes per Xe Link port: N/A                                                |
+-----------+--------------------------------------------------------------------------------------+
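
The two dmesg captures in this comment load different GuC firmware: 70.8.0 with the in-kernel i915 driver and 70.6.8 with the intel-i915-dkms backport. Whether that difference has anything to do with the freezes is not established in this thread. For anyone comparing their own logs, here is a small Python sketch that pulls the loaded GuC version per GT out of `dmesg | grep i915` output. The helper name `guc_versions` and the regular expression are assumptions based only on the line format visible in the logs above.

```python
import re

# Matches lines such as:
#   "... [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.8.0"
# The "(x.y.z) is recommended, but only ..." warning lines do not match.
GUC_LINE = re.compile(r"(GT\d+): GuC firmware (\S+) version (\d+(?:\.\d+)+)")


def guc_versions(dmesg_text):
    """Return {gt: (firmware_path, version_tuple)} for loaded GuC firmware."""
    versions = {}
    for line in dmesg_text.splitlines():
        match = GUC_LINE.search(line)
        if match:
            gt, path, version = match.groups()
            versions[gt] = (path, tuple(int(part) for part in version.split(".")))
    return versions


if __name__ == "__main__":
    # One line from each capture above (in-kernel driver vs. DKMS backport).
    in_kernel = "[    4.184781] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.8.0"
    dkms = "[    3.675512] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.6.8.bin version 70.6.8"
    print(guc_versions(in_kernel))
    print(guc_versions(dkms))
```

Because the versions are returned as integer tuples, they can be compared directly, for example `(70, 6, 8) < (70, 8, 0)`.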
tristan-k commented 1 month ago

On Ubuntu 24.04 without intel-i915-dkms, I can use the OpenCL backend but not Level Zero. It seems that the 6.8.0-45-generic kernel exposes at least one backend for the iGPU.

$ sycl-ls 
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 5 125H OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Graphics [0x7d55] OpenCL 3.0 NEO  [23.43.027642]
$ sudo dmesg | grep i915
[    3.091427] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    3.111467] i915 0000:00:02.0: vgaarb: deactivate vga console
[    3.111497] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    3.123337] i915 0000:00:02.0: vgaarb: VGA decodes changed: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    3.132024] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/mtl_dmc.bin (v2.21)
[    3.148692] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.20.0
[    3.160847] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[    3.160850] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[    3.161042] i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
[    3.169243] mei_gsc_proxy 0000:00:16.0-0f73db04-97ab-4125-b893-e904ad0d5464: bound 0000:00:02.0 (ops i915_gsc_proxy_component_ops [i915])
[    3.169583] i915 0000:00:02.0: [drm] GT1: GuC firmware i915/mtl_guc_70.bin version 70.20.0
[    3.169585] i915 0000:00:02.0: [drm] GT1: HuC firmware i915/mtl_huc_gsc.bin version 8.5.4
[    3.196242] i915 0000:00:02.0: [drm] GT1: HuC: authenticated for clear media
[    3.196676] i915 0000:00:02.0: [drm] GT1: GUC: submission enabled
[    3.196679] i915 0000:00:02.0: [drm] GT1: GUC: SLPC enabled
[    3.196747] i915 0000:00:02.0: [drm] GT1: GUC: RC enabled
[    3.200188] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[    3.225485] [drm] Initialized i915 1.6.0 20230929 for 0000:00:02.0 on minor 1
[    3.228034] i915 display info: display version: 14
[    3.228037] i915 display info: cursor_needs_physical: no
[    3.228038] i915 display info: has_cdclk_crawl: yes
[    3.228039] i915 display info: has_cdclk_squash: yes
[    3.228039] i915 display info: has_ddi: yes
[    3.228040] i915 display info: has_dp_mst: yes
[    3.228041] i915 display info: has_dsb: yes
[    3.228041] i915 display info: has_fpga_dbg: yes
[    3.228042] i915 display info: has_gmch: no
[    3.228043] i915 display info: has_hotplug: yes
[    3.228044] i915 display info: has_hti: no
[    3.228044] i915 display info: has_ipc: yes
[    3.228045] i915 display info: has_overlay: no
[    3.228046] i915 display info: has_psr: yes
[    3.228046] i915 display info: has_psr_hw_tracking: no
[    3.228047] i915 display info: overlay_needs_physical: no
[    3.228048] i915 display info: supports_tv: no
[    3.228049] i915 display info: has_hdcp: yes
[    3.228049] i915 display info: has_dmc: yes
[    3.228050] i915 display info: has_dsc: yes
[    3.254319] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    3.262304] fbcon: i915drmfb (fb0) is primary device
[    3.334957] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[    3.338572] i915 0000:00:02.0: [drm] GT1: Loaded GSC firmware i915/mtl_gsc_1.bin (cv1.0, r102.0.0.1655, svn 1)
[    3.358730] i915 0000:00:02.0: [drm] GT1: HuC: authenticated for all workloads
[  429.749853] i915 0000:00:02.0: Using 41-bit DMA addresses
$ sudo inxi -G
Graphics:
  Device-1: Intel Meteor Lake-P [Intel Arc Graphics] driver: i915 v: kernel
  Display: server: X.Org v: 23.2.6 with: Xwayland v: 23.2.6 driver:
    dri: iris gpu: i915 resolution: 3840x2160~60Hz
  API: EGL v: 1.5 drivers: iris,swrast platforms: x11,surfaceless,device
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: intel mesa v: 24.0.9-0ubuntu0.1
    renderer: Mesa Intel Arc Graphics (MTL)
  API: Vulkan v: 1.3.275 drivers: N/A surfaces: xcb,xlib
$ sudo hwinfo --display
26: PCI 02.0: 0300 VGA compatible controller (VGA)              
  [Created at pci.386]
  Unique ID: _Znp.oQng9K+95x3
  SysFS ID: /devices/pci0000:00/0000:00:02.0
  SysFS BusID: 0000:00:02.0
  Hardware Class: graphics card
  Device Name: "Onboard - Video"
  Model: "Intel VGA compatible controller"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x7d55 
  SubVendor: pci 0x1043 "ASUSTeK Computer Inc."
  SubDevice: pci 0x88c8 
  Revision: 0x08
  Driver: "i915"
  Driver Modules: "i915"
  Memory Range: 0x5017000000-0x5017ffffff (ro,non-prefetchable)
  Memory Range: 0x4000000000-0x400fffffff (ro,non-prefetchable)
  Memory Range: 0x000c0000-0x000dffff (rw,non-prefetchable,disabled)
  IRQ: 212 (1996142 events)
  Module Alias: "pci:v00008086d00007D55sv00001043sd000088C8bc03sc00i00"
  Driver Info #0:
    Driver Status: i915 is active
    Driver Activation Cmd: "modprobe i915"
  Driver Info #1:
    Driver Status: xe is active
    Driver Activation Cmd: "modprobe xe"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
qiuxin2012 commented 1 month ago

level_zero:gpu is the backend we need; opencl is not supported. From your sycl-ls output, it looks like you do get level_zero once intel-i915-dkms is installed. But you said the system becomes unstable. What do you mean by unstable: does the system crash?

tristan-k commented 1 month ago

I'm unable to run anything that uses the GPU. I can, however, run xpu-smi. For example, I tried to run vkpeak and clpeak, but the applications just freeze. If I then try to reboot, the whole system locks up and I have to do a hard reset. I'm pretty sure the issue is caused by intel-i915-dkms, because there are other reports of the same behaviour.

syafie-nzm commented 1 month ago

Hi @tristan-k, I had exactly the same experience as you and have been following your comments here. I installed the drivers according to the IPEX guideline, and whenever I ran any workload on the iGPU, it froze and the system locked up, eventually forcing a hard reset. Try reading and following the "Installing Clients GPU" guide; it worked on my side. Please SKIP the out of tree driver installation (idk why but it worked for me). After installing the GPU packages, check with sycl-ls whether level_zero:gpu shows up.
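
The sycl-ls check can also be scripted. This is only a rough Python sketch: `list_backends` and `has_level_zero_gpu` are hypothetical helper names, and the parser assumes the `[backend:type][backend:index]` line format shown in the sycl-ls outputs earlier in this thread (other oneAPI releases may print a different format).

```python
import re
import shutil
import subprocess

# First bracket of each sycl-ls line, e.g. "[level_zero:gpu]" or "[opencl:cpu]".
DEVICE_LINE = re.compile(r"^\[(\w+):(\w+)\]")


def list_backends(sycl_ls_output):
    """Return the set of (backend, device_type) pairs listed by sycl-ls."""
    found = set()
    for line in sycl_ls_output.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            found.add((match.group(1), match.group(2)))
    return found


def has_level_zero_gpu(sycl_ls_output):
    """True if a Level Zero GPU device appears in the sycl-ls output."""
    return ("level_zero", "gpu") in list_backends(sycl_ls_output)


if __name__ == "__main__":
    # Requires the oneAPI environment to be sourced so that sycl-ls is on PATH.
    if shutil.which("sycl-ls") is None:
        print("sycl-ls not found; source /opt/intel/oneapi/setvars.sh first")
    else:
        output = subprocess.run(
            ["sycl-ls"], capture_output=True, text=True, check=False
        ).stdout
        print("level_zero:gpu available:", has_level_zero_gpu(output))
```

Fed the outputs posted above, the check passes for the Ubuntu 22.04 capture with intel-i915-dkms installed and fails for the captures that list only opencl devices.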

tristan-k commented 1 month ago

> Please SKIP the out of tree driver installation (idk why but it worked for me).

This actually works, but it's embarrassing that Intel doesn't properly test its own backported drivers against the default kernels shipped in Ubuntu 22.04, even though it's a supported distro. I suspect there are other things that won't work without the drivers. For example, intel_gpu_top (apt install intel-gpu-tools) isn't able to identify the compute engine.