strongtz / i915-sriov-dkms

dkms module of Linux i915 driver with SR-IOV support

Boot problem on Debian guest (on proxmox host), works after reloading i915.ko #133

Open kugel- opened 7 months ago

kugel- commented 7 months ago

Hello,

I installed this in a Debian VM (I understand that I need this driver on the host as well as in the VM, right?).

I experience the following problem: on boot, the i915 driver fails to initialize the VF properly. However, if I reload i915.ko after boot, it works fine. The guest is booted with i915.enable_guc=3 and runs Debian's 6.5 kernel (from backports).

dmesg (reloading i915.ko after ~40s)

[    0.000000] Command line: initrd=\529e6f9543d943f9ac13a7647402bb07\6.5.0-0.deb12.4-amd64\initrd.img-6.5.0-0.deb12.4-amd64 root=UUID=110984f8-050b-4901-8423-e91d1ab28452 ro rootflags=subvol=rootfs quiet systemd.machine_id=529e6f9543d943f9ac13a7647402bb07 i915.enable_guc=3
[    0.018983] Kernel command line: initrd=\529e6f9543d943f9ac13a7647402bb07\6.5.0-0.deb12.4-amd64\initrd.img-6.5.0-0.deb12.4-amd64 root=UUID=110984f8-050b-4901-8423-e91d1ab28452 ro rootflags=subvol=rootfs quiet systemd.machine_id=529e6f9543d943f9ac13a7647402bb07 i915.enable_guc=3
[    3.646912] i915 0000:01:00.0: [drm] *ERROR* Device is non-operational; MMIO access returns 0xFFFFFFFF!
[    3.650742] i915 0000:01:00.0: Device initialization failed (-5)
[    3.650790] i915: probe of 0000:01:00.0 failed with error -5
[   41.900817] i915: loading out-of-tree module taints kernel.
[   41.900874] i915: module verification failed: signature and/or required key missing - tainting kernel
[   42.229506] i915 0000:01:00.0: Running in SR-IOV VF mode
[   42.230221] i915 0000:01:00.0: [drm] GT0: GUC: interface version 0.1.4.1
[   42.230823] i915 0000:01:00.0: [drm] VT-d active for gfx access
[   42.230847] i915 0000:01:00.0: [drm] Using Transparent Hugepages
[   42.232788] i915 0000:01:00.0: [drm] GT0: GUC: interface version 0.1.4.1
[   42.233836] i915 0000:01:00.0: GuC firmware PRELOADED version 1.4 submission:SR-IOV VF
[   42.233838] i915 0000:01:00.0: HuC firmware PRELOADED
[   42.236859] i915 0000:01:00.0: [drm] Protected Xe Path (PXP) protected content support initialized
[   42.236864] i915 0000:01:00.0: [drm] PMU not supported for this GPU.
[   42.237008] [drm] Initialized i915 1.6.0 20201103 for 0000:01:00.0 on minor 1

I built the driver today from the latest commit, that is:

commit cdb1399821e942db6fcc2b8322da72b517a9bc0d (HEAD -> master, origin/master, origin/HEAD)
Merge: 1d0bb2d ba79550
Author: Sophon <wuxilin123@gmail.com>
Date:   Sat Nov 25 16:08:39 2023 +0800

    Merge pull request #126 from labdiynez/master

    Accept GuC version 1.0 or 1.4

kugel- commented 7 months ago

I wonder if something is missing from my initrd that's later available on the rootfs? Any module dependency perhaps?
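One way to check this, as a rough sketch: list the initrd contents and look for an i915 module. The helper name `list_i915_modules` is made up for illustration, and the initrd path in the usage comment assumes Debian's default `/boot/initrd.img-<version>` location, which may not match a systemd-boot layout like the one in the command line above.

```shell
# Filter a file listing (one path per line, e.g. from lsinitramfs)
# down to i915 kernel modules, compressed or not.
# Hypothetical helper, not part of i915-sriov-dkms.
list_i915_modules() {
    grep -E '(^|/)i915\.ko(\.(gz|xz|zst))?$'
}

# Usage (path is an assumption, adjust to where your initrd lives):
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | list_i915_modules
```

If the only match sits under `kernel/drivers/gpu/drm/i915/` rather than `updates/dkms/`, the initrd probably carries the stock driver. Regenerating it with `update-initramfs -u` after the DKMS build may pull in the new module, though nobody in this thread has confirmed that.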

mio-19 commented 7 months ago

Maybe the system is loading the original i915 driver of Debian
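The first dmesg in this thread would be consistent with that: the "loading out-of-tree module taints kernel" message only appears at 41.9 s, after the failed probe at 3.6 s. A small sketch for checking which i915.ko `modprobe` would pick follows. `is_dkms_module` is a hypothetical helper, and it assumes DKMS installs its modules under `updates/dkms/`, as it usually does on Debian.

```shell
# Succeed if the given module path looks like a DKMS-installed module.
# Assumption: DKMS places modules under .../updates/dkms/.
is_dkms_module() {
    case "$1" in
        */updates/dkms/*) return 0 ;;
        *)                return 1 ;;
    esac
}

# Usage: modinfo -n prints the file that modprobe would load.
#   if is_dkms_module "$(modinfo -n i915)"; then echo "DKMS build"; else echo "stock build"; fi
```

Note that this only describes the module on the root filesystem; the copy inside the initrd can differ.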

1402366912 commented 1 month ago

I have this problem too. With the machine model set to q35 I see the same output as yours; I am also using 6.5.0-0.deb12.4-amd64. When I change the model to i440fx, this is all it shows:

sudo dmesg|grep i915
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.5.0-0.deb12.4-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3
[ 0.022749] Kernel command line: BOOT_IMAGE=/vmlinuz-6.5.0-0.deb12.4-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3

However, the GPU was not loaded correctly. CPU: Intel N100, PVE kernel: 6.5.13-3-pve. I have also tested on 6.1.

q35:

[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.1.0-22-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3
[ 0.018968] Kernel command line: BOOT_IMAGE=/vmlinuz-6.1.0-22-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3
[ 2.740477] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 2.740594] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 3.341323] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 3.341427] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 3.341583] i915 0000:01:00.0: drm_WARN_ON(s_en != 0x1)
[ 3.341610] WARNING: CPU: 1 PID: 138 at drivers/gpu/drm/i915/gt/intel_sseu.c:278 intel_sseu_info_init+0xc17/0xda0 [i915]
[ 3.341698] Modules linked in: i915(+) hid_generic usbhid hid drm_buddy video wmi i2c_algo_bit sr_mod(+) cdrom drm_display_helper sd_mod t10_pi cec ahci xhci_pci crc64_rocksoft crc64 rc_core libahci crc_t10dif crct10dif_generic ttm xhci_hcd virtio_gpu virtio_dma_buf drm_shmem_helper virtio_scsi drm_kms_helper ehci_pci virtio_net uhci_hcd net_failover failover ehci_hcd crct10dif_pclmul crct10dif_common psmouse crc32_pclmul libata drm scsi_mod virtio_pci virtio_pci_legacy_dev crc32c_intel virtio_pci_modern_dev i2c_i801 i2c_smbus scsi_common usbcore lpc_ich virtio usb_common virtio_ring floppy(+) button
[ 3.341739] RIP: 0010:intel_sseu_info_init+0xc17/0xda0 [i915]
[ 3.341845] ? intel_sseu_info_init+0xc17/0xda0 [i915]
[ 3.341927] ? intel_sseu_info_init+0xc17/0xda0 [i915]
[ 3.341995] ? intel_sseu_info_init+0xc17/0xda0 [i915]
[ 3.342064] ? fwtable_read32+0x96/0x220 [i915]
[ 3.342129] ? intel_uncore_forcewake_for_reg+0x45/0xf0 [i915]
[ 3.342191] intel_gt_init_mmio+0x1f/0x30 [i915]
[ 3.342264] i915_driver_probe+0x46b/0xe20 [i915]
[ 3.342371] i915_init+0x1f/0x7f [i915]
[ 3.943074] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 3.943180] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 4.543876] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 4.543968] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 5.144657] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 5.144751] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 5.144884] i915 0000:01:00.0: [drm] L3 bank mask is all zero!
[ 5.745450] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 5.745541] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 6.346297] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 6.346417] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 6.947378] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 6.947489] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 7.548212] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 7.548327] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 8.149056] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 8.149178] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 8.750033] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 8.750146] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 9.350856] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 9.350972] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 9.951680] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 9.951777] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 10.552488] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 10.552591] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 11.153300] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 11.153395] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 11.754085] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 11.754214] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 12.354960] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 12.355069] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 12.955769] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 12.955868] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 13.556559] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 13.556657] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 14.157344] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 14.157435] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 14.758124] [drm:fw_domains_get_with_fallback [i915]] ERROR render: timed out waiting for forcewake ack to clear.
[ 14.758218] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x20c/0x230 [i915]
[ 15.358893] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 15.358993] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 15.359835] i915 0000:01:00.0: [drm] ERROR rcs'0 reset request timed out: {request: 00000004, RESET_CTL: ffffffff}
[ 15.360553] i915 0000:01:00.0: [drm] ERROR rcs'0 reset request timed out: {request: 00000004, RESET_CTL: ffffffff}
[ 15.361270] i915 0000:01:00.0: [drm] ERROR bcs'0 reset request timed out: {request: 00000004, RESET_CTL: ffffffff}
[ 15.364043] i915 0000:01:00.0: [drm] ERROR rcs'0 reset request timed out: {request: 00000004, RESET_CTL: ffffffff}
[ 15.364760] i915 0000:01:00.0: [drm] ERROR bcs'0 reset request timed out: {request: 00000004, RESET_CTL: ffffffff}
[ 15.967482] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 15.967591] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 16.568307] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 16.568407] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 17.169120] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 17.169212] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 17.769900] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 17.770005] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 18.370694] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 18.370787] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 18.971482] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 18.971580] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 19.572272] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 19.572370] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 20.173071] [drm:fw_domains_get_with_fallback [i915]] ERROR gt: timed out waiting for forcewake ack request.
[ 20.173164] i915 0000:01:00.0: [drm:add_taint_for_CI [i915]] CI tainted:0x9 by fw_domains_get_with_fallback+0x1d8/0x230 [i915]
[ 20.173293] i915 0000:01:00.0: [drm] ERROR Failed to map the ggtt page table
[ 20.223208] i915 0000:01:00.0: Device initialization failed (-12)
[ 20.223238] i915: probe of 0000:01:00.0 failed with error -12

and i440fx:

sudo dmesg|grep i915
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.5.0-0.deb12.4-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3
[ 0.022749] Kernel command line: BOOT_IMAGE=/vmlinuz-6.5.0-0.deb12.4-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3

Does anyone have any ideas?

1402366912 commented 1 month ago

Maybe as an alternative we can automatically run a script at startup:

#!/bin/bash

# Unload i915, wait a few seconds, then load it again.
modprobe -r i915
sleep 5
modprobe i915
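A slightly more defensive variant, as a sketch only: reload the module only when the VF has no driver bound, so a boot where the first probe succeeded is left alone. The function names are hypothetical, and the default PCI address `0000:01:00.0` is taken from the dmesg output in this thread and may differ on other guests.

```shell
# Succeed if the PCI device directory ($1) has a driver bound,
# i.e. it contains a "driver" entry (a symlink in real sysfs).
pci_driver_bound() {
    [ -e "$1/driver" ] || [ -L "$1/driver" ]
}

# Reload i915 only if the VF is unbound (needs root).
# $1: sysfs device directory; the default address is an assumption
# based on the logs above.
reload_i915_if_unbound() {
    dev=${1:-/sys/bus/pci/devices/0000:01:00.0}
    if ! pci_driver_bound "$dev"; then
        modprobe -r i915
        sleep 5
        modprobe i915
    fi
}

# Usage (as root):
#   reload_i915_if_unbound
```

Checking the sysfs `driver` link rather than `/dev/dri` avoids being confused by other DRM devices in the guest, such as virtio_gpu.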

1402366912 commented 1 month ago

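To run the script 30 seconds after boot, you can use a systemd timer managed with systemctl. The steps are as follows:

  1. Create the timer file: Create a file named reload-i915.timer that contains the timer configuration. Open a terminal and run:

    sudo nano /etc/systemd/system/reload-i915.timer

    In the editor, paste the following:

    [Unit]
    Description=Reload i915 module 30 seconds after boot
    
    [Timer]
    OnBootSec=30s
    Unit=reload-i915.service
    
    [Install]
    WantedBy=multi-user.target

    This unit defines a timer that fires 30 seconds after boot and starts the reload-i915.service unit.

  2. Create the service file: Create a file named reload-i915.service that defines what to run when the timer fires. In the terminal, run:

    sudo nano /etc/systemd/system/reload-i915.service

    In the editor, paste the following:

    [Unit]
    Description=Reload i915 module
    
    [Service]
    Type=oneshot
    ExecStart=/bin/bash -c '/sbin/modprobe -r i915 && sleep 5 && /sbin/modprobe i915'
    
    [Install]
    WantedBy=multi-user.target

    This unit defines a oneshot service that unloads and then reloads the i915 module when it runs.

  3. Enable the timer (which starts the service): After editing both files, save them and close the editor. Then run:

    sudo systemctl daemon-reload
    sudo systemctl enable reload-i915.timer

    These commands make systemd reload its unit files and enable the timer so that it fires 30 seconds after boot.

Now, 30 seconds after every boot, the system will try to reload the i915 module to work around the initialization problem. Please test it and see whether it solves your problem.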
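The two editing steps can also be scripted. The sketch below writes the same unit contents shown above into a directory. The function name `write_reload_units` is made up for illustration, and the target directory is a parameter so the result can be inspected somewhere harmless before writing to `/etc/systemd/system`.

```shell
# Write reload-i915.timer and reload-i915.service into a directory.
# $1: target directory (defaults to /etc/systemd/system, which needs root).
# Hypothetical helper; unit contents are copied from the steps above.
write_reload_units() {
    dir=${1:-/etc/systemd/system}

    cat > "$dir/reload-i915.timer" <<'EOF'
[Unit]
Description=Reload i915 module 30 seconds after boot

[Timer]
OnBootSec=30s
Unit=reload-i915.service

[Install]
WantedBy=multi-user.target
EOF

    cat > "$dir/reload-i915.service" <<'EOF'
[Unit]
Description=Reload i915 module

[Service]
Type=oneshot
ExecStart=/bin/bash -c '/sbin/modprobe -r i915 && sleep 5 && /sbin/modprobe i915'

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root), followed by the enable step from above:
#   write_reload_units
#   systemctl daemon-reload
#   systemctl enable reload-i915.timer
```

After the next boot, `systemctl list-timers reload-i915.timer` and `journalctl -b -u reload-i915.service` should show whether the timer fired and what the reload printed.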

1402366912 commented 1 month ago

this is working well

1402366912 commented 1 month ago

root@debian:~# sudo dmesg|grep i915
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.5.0-0.deb12.4-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3
[ 0.025350] Kernel command line: BOOT_IMAGE=/vmlinuz-6.5.0-0.deb12.4-amd64 root=/dev/mapper/debian--vg-root ro quiet splash i915.enable_guc=3
[ 4.301536] i915 0000:01:00.0: [drm] ERROR Device is non-operational; MMIO access returns 0xFFFFFFFF!
[ 4.307042] i915 0000:01:00.0: Device initialization failed (-5)
[ 4.307163] i915: probe of 0000:01:00.0 failed with error -5
[ 35.431814] i915: loading out-of-tree module taints kernel.
[ 35.431880] i915: module verification failed: signature and/or required key missing - tainting kernel
[ 35.980978] i915 0000:01:00.0: Running in SR-IOV VF mode
[ 35.981735] i915 0000:01:00.0: [drm] GT0: GUC: interface version 0.1.9.0
[ 35.982505] i915 0000:01:00.0: [drm] VT-d active for gfx access
[ 35.982535] i915 0000:01:00.0: [drm] Using Transparent Hugepages
[ 35.984844] i915 0000:01:00.0: [drm] GT0: GUC: interface version 0.1.9.0
[ 35.985794] i915 0000:01:00.0: GuC firmware PRELOADED version 1.9 submission:SR-IOV VF
[ 35.985798] i915 0000:01:00.0: HuC firmware PRELOADED
[ 35.988686] i915 0000:01:00.0: [drm] Protected Xe Path (PXP) protected content support initialized
[ 35.988694] i915 0000:01:00.0: [drm] PMU not supported for this GPU.
[ 35.988861] [drm] Initialized i915 1.6.0 20201103 for 0000:01:00.0 on minor 1
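Both working logs in this thread show the same pattern: a failed probe early in boot, then a successful VF initialization after the reload. As a rough sketch, that pattern can be detected from dmesg text with a small filter. `i915_probe_state` is a hypothetical helper, and the two message patterns it matches are simply the ones that appear in the logs above.

```shell
# Classify i915 probe results from dmesg text on stdin.
# Prints one of: ok, failed, recovered, none.
# "recovered" means a probe failure was followed by a later
# successful initialization, as in the logs in this thread.
i915_probe_state() {
    awk '
        /i915: probe of .* failed with error/ { failed = 1; ok = 0 }
        /Initialized i915 /                   { ok = 1 }
        END {
            if (failed && ok) print "recovered"
            else if (failed)  print "failed"
            else if (ok)      print "ok"
            else              print "none"
        }'
}

# Usage:
#   sudo dmesg | i915_probe_state
```

A result of `failed` after the timer has run would suggest the reload workaround did not help on that boot.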