strongtz / i915-sriov-dkms

dkms module of Linux i915 driver with SR-IOV support

Seems like 11th Gen Rocket Lake will not be supported #27

Closed icanc0 closed 11 months ago

icanc0 commented 1 year ago

Running lspci -v -s 00:02 results in:

root@hyper:~# lspci -v -s 00:02
00:02.0 VGA compatible controller: Intel Corporation Device 4c8a (rev 04) (prog-if 00 [VGA controller])
        DeviceName: Onboard - Video
        Subsystem: Gigabyte Technology Co., Ltd Device d000
        Flags: bus master, fast devsel, latency 0, IRQ 206, IOMMU group 0
        Memory at 6012000000 (64-bit, non-prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 5000 [size=64]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [100] Process Address Space ID (PASID)
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [300] Page Request Interface (PRI)
        Kernel driver in use: i915
        Kernel modules: i915

[320] Single Root I/O Virtualization didn't show up under Capabilities. And the kernel messages (ignore the firmware warning; I was trying to see if it was a firmware issue):

root@hyper:~# dmesg | grep i915
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.1.0-1-pve root=/dev/mapper/pve-root ro intel_iommu=on i915.enable_guc=7
[    0.056710] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1.0-1-pve root=/dev/mapper/pve-root ro intel_iommu=on i915.enable_guc=7
[    7.257756] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    7.257765] i915 0000:00:02.0: vgaarb: deactivate vga console
[    7.257826] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    7.259141] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[    7.260045] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
[    7.260592] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/rkl_dmc_ver2_03.bin (v2.3)
[    7.264886] i915 0000:00:02.0: [drm] GuC firmware i915/tgl_guc_70.bin (70.5) is recommended, but only i915/tgl_guc_70.1.1.bin (70.1) was found
[    7.264891] i915 0000:00:02.0: [drm] Consider updating your linux-firmware pkg or downloading from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915
[    7.360326] i915 0000:00:02.0: [drm] *ERROR* Zero GuC log crash dump size!
[    7.360329] i915 0000:00:02.0: [drm] *ERROR* Zero GuC log debug size!
[    7.362541] i915 0000:00:02.0: [drm] GuC firmware i915/tgl_guc_70.1.1.bin version 70.1.1
[    7.362545] i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc.bin version 7.9.3
[    7.364814] i915 0000:00:02.0: [drm] HuC authenticated
[    7.365026] i915 0000:00:02.0: [drm] GuC submission enabled
[    7.365028] i915 0000:00:02.0: [drm] GuC SLPC enabled
[    7.365346] i915 0000:00:02.0: [drm] GuC RC: enabled
[    7.365728] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[    7.365814] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[    7.498455] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    7.537423] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    7.593904] fbcon: i915drmfb (fb0) is primary device
[    7.633391] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device

Searching for VFs only shows the Nvidia GPU I've passed through. (Look at the vendor; it's a GTX 745.)

root@hyper:~# dmesg | grep vf
[    6.916969] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    6.917054] vfio_pci: add [10de:1382[ffffffff:ffffffff]] class 0x000000/00000000
[    6.964346] vfio_pci: add [10de:0fbc[ffffffff:ffffffff]] class 0x000000/00000000
[    7.259152] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none

And then I checked on the Intel website: [image] [image]

Don't you love it when the intel website conflicts with itself? Tiger Lake is almost half a year older than Rocket Lake.

raldone01 commented 1 year ago

I could get my Tiger Lake to generate the PCI interface (virtual functions), but in a Windows guest the device fails with error 43.

dmesg | grep i915

```
# dmesg | grep i915
[    0.000000] Command line: root=/dev/mapper/main-root rootflags=subvol=/@ rw loglevel=3 quiet resume=/dev/mapper/main-swap initrd=intel-ucode.img initrd=initramfs-linux-zen.img intel_iommu=on i915.enable_guc=7
[    0.049432] Kernel command line: root=/dev/mapper/main-root rootflags=subvol=/@ rw loglevel=3 quiet resume=/dev/mapper/main-swap initrd=intel-ucode.img initrd=initramfs-linux-zen.img intel_iommu=on i915.enable_guc=7
[    4.552290] i915: loading out-of-tree module taints kernel.
[    4.553839] i915: module verification failed: signature and/or required key missing - tainting kernel
[    4.759977] i915 0000:00:02.0: Running in SR-IOV PF mode
[    4.760444] i915 0000:00:02.0: [drm] VT-d active for gfx access
[    4.760530] i915 0000:00:02.0: vgaarb: deactivate vga console
[    4.760599] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[    4.767210] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    4.768069] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
[    4.769773] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/tgl_dmc_ver2_12.bin (v2.12)
[    4.928388] i915 0000:00:02.0: [drm] Missing GuC-Err-Cap reglist Class(1):Compute(4)!
[    4.928394] i915 0000:00:02.0: [drm] Missing GuC-Err-Cap reglist Instance(2):Compute(4)!
[    4.930152] i915 0000:00:02.0: [drm] Missing GuC-Err-Cap reglist Class(1):Compute(4)!
[    4.930154] i915 0000:00:02.0: [drm] Missing GuC-Err-Cap reglist Instance(2):Compute(4)!
[    4.930596] i915 0000:00:02.0: [drm] GuC firmware i915/tgl_guc_70.1.1.bin version 70.1.1
[    4.930598] i915 0000:00:02.0: [drm] HuC firmware i915/tgl_huc_7.9.3.bin version 7.9.3
[    4.931348] i915 0000:00:02.0: [drm] Missing GuC-Err-Cap reglist Class(1):Compute(4)!
[    4.931349] i915 0000:00:02.0: [drm] Missing GuC-Err-Cap reglist Instance(2):Compute(4)!
[    4.934386] i915 0000:00:02.0: [drm] HuC authenticated
[    4.934723] i915 0000:00:02.0: [drm] GuC submission enabled
[    4.934724] i915 0000:00:02.0: [drm] GuC SLPC enabled
[    4.935114] i915 0000:00:02.0: [drm] GuC RC: enabled
[    4.936262] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[    4.936398] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[    6.277893] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[    6.288976] sof-audio-pci-intel-tgl 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[    6.289216] i915 0000:00:02.0: 7 VFs could be associated with this PF
[    6.291730] fbcon: i915drmfb (fb0) is primary device
[    6.291733] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[  243.058490] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[  243.058645] i915 0000:00:02.1: enabling device (0000 -> 0002)
[  243.058747] i915 0000:00:02.1: Running in SR-IOV VF mode
[  243.058962] i915 0000:00:02.1: GuC interface version 0.1.0.0
[  243.059526] i915 0000:00:02.1: [drm] VT-d active for gfx access
[  243.059581] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[  243.060442] i915 0000:00:02.1: GuC interface version 0.1.0.0
[  243.060780] i915 0000:00:02.1: GuC firmware PRELOADED version 1.0 submission:SR-IOV VF
[  243.060784] i915 0000:00:02.1: HuC firmware PRELOADED
[  243.063723] i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
[  243.063748] i915 0000:00:02.1: [drm] PMU not supported for this GPU.
[  243.063878] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.1 on minor 1
[  243.064262] i915 0000:00:02.0: Enabled 1 VFs
[  501.934934] i915 0000:00:02.0: VF1 FLR
[  502.058688] i915 0000:00:02.0: VF1 FLR
[  508.812227] i915 0000:00:02.0: VF1 FLR
[  800.086777] i915 0000:00:02.0: VF1 FLR
[  801.325477] i915 0000:00:02.1: Running in SR-IOV VF mode
[  801.325698] i915 0000:00:02.1: GuC interface version 0.1.0.0
[  801.326072] i915 0000:00:02.1: [drm] VT-d active for gfx access
[  801.326113] i915 0000:00:02.1: [drm] Using Transparent Hugepages
[  801.326824] i915 0000:00:02.1: GuC interface version 0.1.0.0
[  801.327184] i915 0000:00:02.1: GuC firmware PRELOADED version 1.0 submission:SR-IOV VF
[  801.327187] i915 0000:00:02.1: HuC firmware PRELOADED
[  801.329326] i915 0000:00:02.1: [drm] Protected Xe Path (PXP) protected content support initialized
[  801.329339] i915 0000:00:02.1: [drm] PMU not supported for this GPU.
[  801.329421] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.1 on minor 1
```
sudo lspci -v -s 00:02
00:02.0 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01) (prog-if 00 [VGA controller])
        DeviceName: Onboard - Video
        Subsystem: Samsung Electronics Co Ltd Device c1a5
        Flags: bus master, fast devsel, latency 0, IRQ 175, IOMMU group 1
        Memory at 601c000000 (64-bit, non-prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 3000 [size=64]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [100] Process Address Space ID (PASID)
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [300] Page Request Interface (PRI)
        Capabilities: [320] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: i915
        Kernel modules: i915

00:02.1 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01) (prog-if 00 [VGA controller])
        Subsystem: Samsung Electronics Co Ltd Device c1a5
        Flags: bus master, fast devsel, latency 0, IRQ 187, IOMMU group 17
        Memory at 4010000000 (64-bit, non-prefetchable) [disabled] [size=16M]
        Memory at 4020000000 (64-bit, prefetchable) [virtual] [size=512M]
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
        Kernel driver in use: i915
        Kernel modules: i915

Capabilities: [320] Single Root I/O Virtualization (SR-IOV) is listed.
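The check being done by eye here (does a device list the SR-IOV capability or not) can be scripted. This is only a rough sketch: `sriov_capable_devices` and `SAMPLE` are made-up names, the sample text is trimmed from the output above, and it assumes `lspci -v` was run as root so that the capability list is actually printed.

```python
import re


def sriov_capable_devices(lspci_text: str) -> dict[str, bool]:
    """Map each PCI address in `lspci -v` output to whether it lists SR-IOV."""
    result: dict[str, bool] = {}
    current = None
    for line in lspci_text.splitlines():
        # Device header lines start in column 0 with [domain:]bus:device.function.
        header = re.match(
            r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s", line
        )
        if header:
            current = header.group(1)
            result[current] = False
        elif current and "Single Root I/O Virtualization" in line:
            result[current] = True
    return result


# Trimmed sample, taken from the Tiger Lake output in this thread.
SAMPLE = """\
00:02.0 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01)
\tCapabilities: [300] Page Request Interface (PRI)
\tCapabilities: [320] Single Root I/O Virtualization (SR-IOV)
00:02.1 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01)
\tCapabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
"""

print(sriov_capable_devices(SAMPLE))
# {'00:02.0': True, '00:02.1': False}
```

Only the physical function (00:02.0) carries the capability; the virtual function (00:02.1) does not, which matches the listing above.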

sudo bash -c "echo 1 > /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs"
My VM XML (markup lost, only the element values survive): win11 87a95230-fd5d-4b48-9cab-60e63022a158 8388608 8388608 8 hvm destroy restart destroy /usr/bin/qemu-system-x86_64
sieskei commented 1 year ago

Did you add the Hyper-V enlightenments?

raldone01 commented 1 year ago

What do you mean? I have not done anything related to hyper-v.

sieskei commented 1 year ago

Add them in the features section of the libvirt XML.

raldone01 commented 1 year ago

Should I change:

<hyperv mode="custom">
  <relaxed state="on"/>
  <vapic state="on"/>
  <spinlocks state="on" retries="8191"/>
</hyperv>

to

<hyperv mode="">
  <relaxed state="on"/>
  <vapic state="on"/>
  <spinlocks state="on" retries="8191"/>
</hyperv>

That fails with this error:

Error changing VM configuration: XML error: Invalid value for attribute 'mode' in element 'hyperv': ''.

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/addhardware.py", line 345, in change_config_helper
    define_func(**define_args)
  File "/usr/share/virt-manager/virtManager/details/details.py", line 1353, in change_cb
    return self.vm.define_xml(newxml)
  File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line 347, in define_xml
    self._redefine_xml_internal(origxml, newxml)
  File "/usr/share/virt-manager/virtManager/object/libvirtobject.py", line 374, in _redefine_xml_internal
    self._define(newxml)
  File "/usr/share/virt-manager/virtManager/object/domain.py", line 1137, in _define
    self.conn.define_domain(xml)
  File "/usr/share/virt-manager/virtManager/connection.py", line 554, in define_domain
    return self._backend.defineXML(xml)
  File "/usr/lib/python3.10/site-packages/libvirt.py", line 4495, in defineXML
    raise libvirtError('virDomainDefineXML() failed')
libvirt.libvirtError: XML error: Invalid value for attribute 'mode' in element 'hyperv': ''.

or

<hyperv>
</hyperv>

Or should I remove it completely?

sieskei commented 1 year ago

Pff, I'm on my phone and the app cut the tags. <hyperv mode="passthrough"></hyperv>

Also, are you using the host CPU model?
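For anyone who prefers to script the change rather than edit it in virt-manager, here is a minimal sketch of swapping an existing `<hyperv>` block for the passthrough form. The function name `enable_hyperv_passthrough` and the stripped-down `EXAMPLE` domain are invented for illustration; a real domain XML has many more elements, and `mode="passthrough"` needs a libvirt version that accepts that attribute value.

```python
import xml.etree.ElementTree as ET


def enable_hyperv_passthrough(domain_xml: str) -> str:
    """Return the domain XML with <hyperv mode="passthrough"/> under <features>."""
    root = ET.fromstring(domain_xml)
    features = root.find("features")
    if features is None:
        features = ET.SubElement(root, "features")
    # Drop any existing <hyperv> block (for example mode="custom").
    for old in features.findall("hyperv"):
        features.remove(old)
    ET.SubElement(features, "hyperv", {"mode": "passthrough"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed domain definition.
EXAMPLE = """<domain type="kvm">
  <name>win11</name>
  <features>
    <acpi/>
    <hyperv mode="custom">
      <relaxed state="on"/>
    </hyperv>
  </features>
</domain>"""

print(enable_hyperv_passthrough(EXAMPLE))
```

The result would still have to be loaded back into libvirt, for example by pasting it into `virsh edit`.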

raldone01 commented 1 year ago

Thank you! Do you mean this: <cpu mode="host-passthrough" check="none" migratable="on"/>

Do you know which Intel drivers work for my Windows guest? A link would be nice 🙃.

My config (markup lost, only the element values survive): win11 87a95230-fd5d-4b48-9cab-60e63022a158 8388608 8388608 8 hvm destroy restart destroy /usr/bin/qemu-system-x86_64
icanc0 commented 1 year ago

How did you even get virtual functions to show up? Have you also tested multiple VFs?

raldone01 commented 1 year ago
sudo bash -c "echo 7 > /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs"
bash: line 1: echo: write error: Device or resource busy

Apparently Windows is still using it. I need to shut down the guest before I can change the number of VFs.
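A note on the "Device or resource busy" error: as far as I know, the mainline PCI core refuses to change `sriov_numvfs` from one non-zero value straight to another and expects the count to go back to 0 first, so with 1 VF already enabled a direct write of 7 would be rejected even with the guest off. The out-of-tree i915 module may add its own checks on top of that. Below is a rough sketch of the reset-then-set sequence; `set_num_vfs` is a made-up helper, and the demo runs against a temporary directory standing in for the sysfs device directory, so no hardware is touched.

```python
import tempfile
from pathlib import Path


def set_num_vfs(device_dir: str, wanted: int) -> None:
    """Set the VF count, dropping to 0 first if VFs are already enabled."""
    dev = Path(device_dir)
    total = int((dev / "sriov_totalvfs").read_text())
    numvfs = dev / "sriov_numvfs"
    current = int(numvfs.read_text())
    if not 0 <= wanted <= total:
        raise ValueError(f"requested {wanted} VFs, device allows 0..{total}")
    if current == wanted:
        return  # nothing to do
    if current != 0:
        numvfs.write_text("0")  # disable existing VFs before resizing
    if wanted != 0:
        numvfs.write_text(str(wanted))


# Dry run against a temporary directory standing in for
# /sys/devices/pci0000:00/0000:00:02.0 (assumed layout, nothing real is written).
with tempfile.TemporaryDirectory() as fake:
    Path(fake, "sriov_totalvfs").write_text("7\n")
    Path(fake, "sriov_numvfs").write_text("1\n")
    set_num_vfs(fake, 7)
    print(Path(fake, "sriov_numvfs").read_text())  # prints 7
```

On a real host this needs root, and any guest holding a VF should be shut down before the count is reset.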

Whoops, I have a Tiger Lake. I have read too many xxx Lake identifiers today.
CPU: 11th Gen Intel i7-1165G7 (8) @ 4.700GHz
GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics]

icanc0 commented 1 year ago

yea, as expected, the driver you loaded says tgl according to the logs.

raldone01 commented 1 year ago

Screenshot_20230302_172409

@sieskei Thank you, it's working!

raldone01 commented 1 year ago

@sieskei Should I remove the QXL Display or the spice video card? @icanotc Sorry for hijacking your issue.

sieskei commented 1 year ago

> @sieskei Thank you, it's working!

No problem, that's good news! If you remove the SPICE/virtual GPU you will not have graphical output. Now you can try Looking Glass and a virtual monitor to get better performance.

Btw, how many virtual functions did you set? Can you share your final libvirt XML?

raldone01 commented 1 year ago

I set 1 VF but I can create up to 7. I have not tried to use more than 1 though.

My 'working' libvirt XML (markup lost, only the element values survive): win11 87a95230-fd5d-4b48-9cab-60e63022a158 8388608 8388608 4 hvm destroy restart destroy /usr/bin/qemu-system-x86_64

I am currently writing a guide. It will eventually be available here.

@sieskei do you have a working setup? Does hibernation/suspension work for you?

sieskei commented 1 year ago

No. When I'm running with the iGPU, host and guest are super, super laggy. Mouse and keyboard freeze completely.

This is my ticket. https://github.com/strongtz/i915-sriov-dkms/issues/42

What is your distro?

raldone01 commented 1 year ago

Ahh I already 👍 your issue. 🙃 Actually my host and guest are super laggy too but only when starting and when shutting down otherwise performance is OK.

I am using arch btw.

What distro/CPU do you use?

sieskei commented 1 year ago

I’m on Fedora 37 with kernel 6.1.14 and mesa 23.01.

Laptop is an ASUS ROG Zephyrus M16 (2021): 11800H, 40 GB RAM, GT1.

raldone01 commented 1 year ago

Did you use their intel-lts kernel or the dkms module?

sieskei commented 1 year ago

dkms module only, you?

sieskei commented 1 year ago

@raldone01 are you sure it is working properly? I noticed that when the iGPU is not in use (per Task Manager), it does not lag. When 3D rendering is needed, it freezes.

Usage is either 0 or 100. Can you check?

raldone01 commented 1 year ago

The dkms module stopped working after a kernel update (the sound driver and i915 did not agree on something), so I have now built intel-linux-lts myself. I have the same issue as you: either 0 or 100 percent usage. Sometimes I get 20%. What program do you test it with?

jubjubrsx commented 11 months ago

I can verify that Rocket Lake doesn't work. I get pretty much the same results on my MSI B560 with a 12 x 11th Gen Intel(R) Core(TM) i5-11600.

The 10Gb fiber NIC I have in the PCI slot shows SR-IOV, but the onboard video does not.

01:00.0 0200: 8086:10fb (rev 01)
        Subsystem: 8086:7a12
        Flags: bus master, fast devsel, latency 0, IRQ 16, IOMMU group 13
        Memory at 80900000 (64-bit, non-prefetchable) [size=1M]
        I/O ports at 4020 [disabled] [size=32]
        Memory at 80b04000 (64-bit, non-prefetchable) [size=16K]
        Expansion ROM at 80a80000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 90-e2-ba-ff-ff-e8-30-10
        Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
        Capabilities: [160] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe
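A quicker way to see which devices on a box expose SR-IOV at all is to look for the `sriov_totalvfs` attribute in sysfs, which to my knowledge the kernel only creates for SR-IOV-capable physical functions. This is a sketch under that assumption; `find_sriov_devices` is a made-up helper name, and the default path is the usual sysfs location.

```python
from pathlib import Path


def find_sriov_devices(sysfs_root: str = "/sys/bus/pci/devices") -> dict[str, int]:
    """Map PCI address -> maximum VF count for every SR-IOV-capable device."""
    root = Path(sysfs_root)
    if not root.is_dir():
        return {}  # not a Linux host, or sysfs is not mounted
    found: dict[str, int] = {}
    for dev in sorted(root.iterdir()):
        attr = dev / "sriov_totalvfs"
        if attr.is_file():
            found[dev.name] = int(attr.read_text())
    return found


print(find_sriov_devices())
```

On a machine like the one described here, the NIC would be expected to show up in the result while 0000:00:02.0 would be missing.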

The most infuriating part of this is that I tried an Intel NUC at work that had an 11th gen in it (Tiger Lake) and it worked just fine.

icanc0 commented 11 months ago

I actually asked on the Intel forum, and the response is that 11th Gen desktop (Rocket Lake) is the only platform not supporting graphics splitting since 5th Gen Broadwell. I asked why, but the response was, "This is a top-level secret and we can't tell ya". They really want me to upgrade to a 12th gen, huh?

I'm going to close this issue now. If the Arc GPUs didn't disappoint me, this definitely did.