strongtz / i915-sriov-dkms

dkms module of Linux i915 driver with SR-IOV support

[Issue]: PVE 8.2.2 with Kernel 6.5.13-5-pve compiling without issues, still no VFs visible #172

Open barrio5 opened 1 month ago

barrio5 commented 1 month ago

Hi,

I'm running PVE 8.2.2 with Kernel 6.5.13-5-pve and DKMS is compiling without any issues but no VFs are visible. What am I doing wrong? See logs:

root@proxmox:~# sudo dmesg | grep i915

[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.5.13-5-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7
[ 0.116784] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.5.13-5-pve root=/dev/mapper/pve-root ro quiet intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7
[ 7.855694] i915: unknown parameter 'max_vfs' ignored
[ 7.856977] i915 0000:00:02.0: [drm] VT-d active for gfx access
[ 7.857143] i915 0000:00:02.0: vgaarb: deactivate vga console
[ 7.857362] i915 0000:00:02.0: [drm] Using Transparent Hugepages
[ 7.858035] i915 0000:00:02.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[ 7.859211] i915 0000:00:02.0: [drm] Finished loading DMC firmware i915/adls_dmc_ver2_01.bin (v2.1)
[ 7.859445] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
[ 7.869486] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/tgl_guc_70.bin version 70.20.0
[ 7.869492] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[ 7.872579] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
[ 7.872921] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[ 7.872923] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[ 7.873301] i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
[ 7.873833] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[ 7.874006] i915 0000:00:02.0: [drm] Protected Xe Path (PXP) protected content support initialized
[ 7.986054] [drm] Initialized i915 1.6.0 20201103 for 0000:00:02.0 on minor 0
[ 7.987255] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[ 7.988647] i915 0000:03:00.0: [drm] VT-d active for gfx access
[ 7.988667] i915 0000:03:00.0: [drm] Local memory IO size: 0x000000017c800000
[ 7.988668] i915 0000:03:00.0: [drm] Local memory available: 0x000000017c800000
[ 8.002533] fbcon: i915drmfb (fb0) is primary device
[ 8.007656] i915 0000:03:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_08.bin (v2.8)
[ 8.014341] i915 0000:03:00.0: [drm] GT0: GuC firmware i915/dg2_guc_70.bin version 70.20.0
[ 8.014343] i915 0000:03:00.0: [drm] GT0: HuC firmware i915/dg2_huc_gsc.bin version 7.10.15
[ 8.037165] i915 0000:03:00.0: [drm] GT0: GUC: submission enabled
[ 8.037165] i915 0000:03:00.0: [drm] GT0: GUC: SLPC enabled
[ 8.037391] i915 0000:03:00.0: [drm] GT0: GUC: RC enabled
[ 8.049746] i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
[ 8.077867] [drm] Initialized i915 1.6.0 20201103 for 0000:03:00.0 on minor 1
[ 8.078697] snd_hda_intel 0000:04:00.0: bound 0000:03:00.0 (ops i915_audio_component_bind_ops [i915])
[ 8.079243] i915 0000:03:00.0: [drm] Cannot find any crtc or sizes
[ 8.101129] i915 0000:03:00.0: [drm] Cannot find any crtc or sizes
[ 8.116620] mei_gsc i915.mei-gscfi.768: FW not ready: resetting: dev_state = 2 pxp = 0
[ 8.116641] mei_gsc i915.mei-gscfi.768: unexpected reset: dev_state = ENABLED fw status = 00000345 84670000 00000000 00000000 E0020002 00000000
[ 8.117407] mei_gsc i915.mei-gsc.768: FW not ready: resetting: dev_state = 2 pxp = 2
[ 8.117433] mei_gsc i915.mei-gsc.768: unexpected reset: dev_state = ENABLED fw status = 00000345 84670000 00000000 00000000 E0020002 00000000
[ 8.525178] i915 0000:03:00.0: [drm] GT0: HuC: authenticated for all workloads
[ 8.525183] mei_pxp i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:03:00.0 (ops i915_pxp_tee_component_ops [i915])
[ 12.297403] i915 0000:00:02.0: driver does not support SR-IOV configuration via sysfs
[ 12.297420] i915 0000:00:02.0: driver does not support SR-IOV configuration via sysfs

johntdavis84 commented 1 month ago

What's the output of lspci -nn?

You've got devices using the i915 driver at PCIe addresses 00:02.0 and 03:00.0, and it's trying to set up SR-IOV on device 00:02.0 and running into an issue with the driver for that device.

So, we need to know what those devices are. :)

barrio5 commented 1 month ago

00:00.0 Host bridge [0600]: Intel Corporation Device [8086:4630] (rev 05)
00:01.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x16 Controller #1 [8086:460d] (rev 05)
00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-S GT1 [UHD Graphics 730] [8086:4692] (rev 0c)
00:06.0 PCI bridge [0604]: Intel Corporation 12th Gen Core Processor PCI Express x4 Controller #0 [8086:464d] (rev 05)
00:0a.0 Signal processing controller [1180]: Intel Corporation Platform Monitoring Technology [8086:467d] (rev 01)
00:0e.0 RAID bus controller [0104]: Intel Corporation Volume Management Device NVMe RAID Controller [8086:467f]
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:7a60] (rev 11)
00:14.2 RAM memory [0500]: Intel Corporation Device [8086:7a27] (rev 11)
00:14.3 Network controller [0280]: Intel Corporation Device [8086:7a70] (rev 11)
00:15.0 Serial bus controller [0c80]: Intel Corporation Device [8086:7a4c] (rev 11)
00:15.1 Serial bus controller [0c80]: Intel Corporation Device [8086:7a4d] (rev 11)
00:15.2 Serial bus controller [0c80]: Intel Corporation Device [8086:7a4e] (rev 11)
00:16.0 Communication controller [0780]: Intel Corporation Device [8086:7a68] (rev 11)
00:17.0 SATA controller [0106]: Intel Corporation Device [8086:7a62] (rev 11)
00:1a.0 PCI bridge [0604]: Intel Corporation Device [8086:7a48] (rev 11)
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:7a38] (rev 11)
00:1c.2 PCI bridge [0604]: Intel Corporation Device [8086:7a3a] (rev 11)
00:1d.0 PCI bridge [0604]: Intel Corporation Device [8086:7a36] (rev 11)
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:7a06] (rev 11)
00:1f.3 Audio device [0403]: Intel Corporation Device [8086:7a50] (rev 11)
00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:7a23] (rev 11)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Device [8086:7a24] (rev 11)
01:00.0 PCI bridge [0604]: Intel Corporation Device [8086:4fa1] (rev 01)
02:01.0 PCI bridge [0604]: Intel Corporation Device [8086:4fa4]
02:04.0 PCI bridge [0604]: Intel Corporation Device [8086:4fa4]
03:00.0 VGA compatible controller [0300]: Intel Corporation DG2 [Arc A380] [8086:56a5] (rev 05)
04:00.0 Audio device [0403]: Intel Corporation DG2 Audio Controller [8086:4f92]
05:00.0 SATA controller [0106]: JMicron Technology Corp. JMB58x AHCI SATA controller [197b:0585]
06:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Black 2018/SN750 / PC SN720 NVMe SSD [15b7:5002]
08:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I226-V [8086:125c] (rev 06)

johntdavis84 commented 1 month ago

Great! :)

00:02.0 VGA compatible controller [0300]: Intel Corporation Alder Lake-S GT1 [UHD Graphics 730] [8086:4692] (rev 0c)
03:00.0 VGA compatible controller [0300]: Intel Corporation DG2 [Arc A380] [8086:56a5] (rev 05)

Just to confirm, you're trying to use the 00:02.0 iGPU device?

Also, I apologize, I should have asked this before. Could you please paste the output of the following commands?

uname -a
sudo dkms status

This looks very similar to issue #96. That one was never resolved, but the problem there may have been a kernel update that happened after the driver was built. The output of the commands above should help us figure out whether that's the case here.
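
One line in the dmesg output posted earlier is worth checking for directly: `i915: unknown parameter 'max_vfs' ignored`. The kernel prints this when the loaded module does not define that parameter, which would be consistent with the stock i915 being loaded rather than the DKMS build. A minimal sketch of that check follows; the function name `i915_rejected_max_vfs` is made up for illustration.

```shell
# Hypothetical helper (not part of the repo): reads kernel log text on stdin
# and succeeds (exit 0) if the loaded i915 module reported that it does not
# know the 'max_vfs' parameter.
i915_rejected_max_vfs() {
    grep -q "i915: unknown parameter 'max_vfs' ignored"
}

# Usage on a live host (needs permission to read the kernel log):
#   if dmesg | i915_rejected_max_vfs; then
#       echo "loaded i915 has no max_vfs parameter"
#   fi
```

If the check succeeds, the next thing to look at is probably whether the DKMS module was actually built and installed for the running kernel.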

barrio5 commented 1 month ago

no worries :)

yes I'm trying to use the 00:02.0 iGPU device

uname -a: Linux proxmox 6.5.13-5-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.13-5 (2024-04-05T11:03Z) x86_64 GNU/Linux

sudo dkms status: i915-sriov-dkms/6.5.13-5: added

johntdavis84 commented 1 month ago

Thanks! That looks good so far, actually. Take a look at the Host (Proxmox) configuration section of this tutorial. Unless you have secure boot enabled, you're good to go with 6.5.13-5. :) https://www.derekseaman.com/2023/11/proxmox-ve-8-1-windows-11-vgpu-vt-d-passthrough-with-intel-alder-lake.html

I'd suggest reading through the entire host config section before trying anything else, but it looks like you need to finish Step 4: "Let’s now build the new kernel and check the status. Validate that it shows installed."

EDIT: I wanted to add that I've successfully used this tutorial to install the DKMS driver on Proxmox with kernel 6.5.13-5.
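
For reference, the `dkms status` output posted above ends in `added`, which means the source is registered with DKMS but has not been built or installed for any kernel yet. A small sketch for pulling the state out of that output is below; the helper name `dkms_state` is hypothetical, and the parsing assumes the state is whatever follows the last colon on the line, as in the output quoted in this thread.

```shell
# Hypothetical helper: reads `dkms status` text on stdin and prints the state
# (the text after the last ": ") from the first line for the given module.
# Prints nothing if the module is not listed.
dkms_state() {
    module="$1"
    grep "^${module}[/,]" | head -n 1 | sed 's/.*: *//'
}

# Usage on a live host:
#   state="$(sudo dkms status | dkms_state i915-sriov-dkms)"
#   [ "$state" = "installed" ] || echo "module state is '$state', not installed"
```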

moelllerniklas commented 1 month ago

I had the same issue. For me putting intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7 in /etc/kernel/cmdline instead of /etc/default/grub and running proxmox-boot-tool refresh && reboot fixed it.
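
Background on why the file matters: Proxmox hosts that boot through systemd-boot (managed by proxmox-boot-tool) take their kernel command line from /etc/kernel/cmdline, while GRUB-booted hosts take it from /etc/default/grub. Either way, /proc/cmdline shows what the running kernel actually received. The following is a rough sketch for checking that; the function name `missing_cmdline_params` is an assumption for illustration, not something from the repo.

```shell
# Hypothetical helper: prints each required parameter that is absent from
# the given kernel command line string, one per line. Matching is on whole
# space-separated words, so "iommu=pt" does not match "intel_iommu=pt".
missing_cmdline_params() {
    cmdline="$1"
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;                  # present, nothing to report
            *) printf '%s\n' "$param" ;;      # absent
        esac
    done
}

# Usage on a live host:
#   missing_cmdline_params "$(cat /proc/cmdline)" \
#       intel_iommu=on iommu=pt i915.enable_guc=3 i915.max_vfs=7
```

Empty output means every parameter reached the kernel. In the original poster's log the parameters do appear on the command line, so this check is mainly useful for ruling that cause in or out.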

johntdavis84 commented 1 month ago

Were you able to get this going with the tutorial I linked?

xxxsen commented 1 month ago

I ran into the same issue. After reinstalling this module and running the following commands, it works. Hope this helps.

sudo update-grub
sudo update-initramfs -u
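
After a reboot, one way to see whether VFs are available is to read the standard PCI sysfs attributes `sriov_numvfs` and `sriov_totalvfs` for the device. The sketch below assumes the iGPU sits at 0000:00:02.0 as in this thread; the helper name `sriov_vf_counts` is made up for illustration.

```shell
# Hypothetical helper: prints "numvfs/totalvfs" for a PCI device directory.
# Returns non-zero if the device exposes no readable SR-IOV attributes.
sriov_vf_counts() {
    dev_dir="$1"
    [ -r "$dev_dir/sriov_totalvfs" ] && [ -r "$dev_dir/sriov_numvfs" ] || return 1
    printf '%s/%s\n' "$(cat "$dev_dir/sriov_numvfs")" "$(cat "$dev_dir/sriov_totalvfs")"
}

# Usage on a live host (device address is an assumption taken from the thread):
#   sriov_vf_counts /sys/bus/pci/devices/0000:00:02.0
```

The first number is how many VFs are currently enabled and the second is the maximum the device reports.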