strongtz / i915-sriov-dkms

dkms module of Linux i915 driver with SR-IOV support

Host hangs on suspend #206

Open lucker999 opened 3 weeks ago

lucker999 commented 3 weeks ago

On 6.6.53-1-arch-lts with intel_iommu=on i915.enable_guc=3 i915.max_vfs=7, and with cat /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs returning 2, the host hangs as soon as a suspend is issued (via the power button, closing the lid, or systemctl suspend). Stranger still, nothing appears in the logs after that, not even the suspend command. I tried the module after issues #175 and #204 were fixed. The memory leak is gone, but even when no VM is running (virt-manager has not even been launched) the system hangs with a black screen, and the only way to revive the box is a hard reset.

bbaa-bbaa commented 3 weeks ago

Can you try to suspend normally with all VFs turned off? (with echo 0 > /sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs )

lucker999 commented 2 weeks ago

Yes, with the VFs turned off the PC suspends and resumes normally, as expected.
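
Since suspend works once the VFs are off, one possible stopgap is a systemd sleep hook that sets sriov_numvfs to 0 before suspend and restores the previous count after resume. The following is only a sketch and has not been tested against this driver. It assumes the sysfs path quoted in this thread. The function name sriov_sleep_hook and the state file /run/sriov_numvfs.saved are made up for illustration. It also assumes no VM is holding a VF when the hook runs.

```shell
#!/bin/sh
# Sketch of a systemd sleep hook: systemd-suspend.service runs executables in
# /usr/lib/systemd/system-sleep/ with "pre" before sleeping and "post" after resume.

# Assumption: sysfs path taken from this thread; override for other devices.
NUMVFS_PATH="${NUMVFS_PATH:-/sys/devices/pci0000:00/0000:00:02.0/sriov_numvfs}"
# Hypothetical location for remembering the VF count across the suspend.
STATE_FILE="${STATE_FILE:-/run/sriov_numvfs.saved}"

sriov_sleep_hook() {
    case "${1:-}" in
        pre)
            # Remember the current VF count, then turn all VFs off.
            cat "$NUMVFS_PATH" > "$STATE_FILE"
            echo 0 > "$NUMVFS_PATH"
            ;;
        post)
            # Restore the saved VF count, if there was one.
            if [ -f "$STATE_FILE" ]; then
                saved=$(cat "$STATE_FILE")
                rm -f "$STATE_FILE"
                if [ "$saved" -gt 0 ]; then
                    echo "$saved" > "$NUMVFS_PATH"
                fi
            fi
            ;;
    esac
}

# systemd passes "pre" or "post" as the first argument; anything else is a no-op.
sriov_sleep_hook "$@"
```

To try it, the script would be saved as an executable file under /usr/lib/systemd/system-sleep/ (the file name is up to you). This only hides the hang and does not explain why suspend fails while VFs are enabled.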

bbaa-bbaa commented 2 weeks ago

I have no idea what causes that. Can you try PR #207 based on the 6.6 branch?

lucker999 commented 2 weeks ago

Just tried with arch-6.11.2-5 on two boxes (i7-1360P Iris Xe and i5-12450H UHD Graphics (48EU)) with the same outcome: once a VF is enabled, any suspend attempt results in a black screen and an unresponsive box. The only way out is a forced reboot. I believe this is a new issue, because a month or two ago I gave the module a try and the only problem I noticed was the memory leak. But I am not sure; I may simply not have suspended either machine while testing.

bbaa-bbaa commented 2 weeks ago

Did you notice a kernel panic when the system hung? If your computer has a PS/2 keyboard (the built-in keyboard of a laptop usually is one), the Caps Lock indicator will blink.

lucker999 commented 2 weeks ago

I don't think so, because both boxes went rather quiet after the black screen: fans almost stopped, no CPU load. But I'm not sure whether that counts as evidence; I have never witnessed a kernel panic up close, so I am only assuming. Alas, there is no PS/2 port, as both boxes are quite new, and I noticed no blinking on the laptop either, that much I'm sure of.

sheaahhoi1 commented 1 day ago

@lucker999

  1. I encountered a similar situation on PVE, and I know where the problem is: the machine was not connected to a display device in the place where it was originally kept. I couldn't access the PVE web interface, although ping worked.
  2. When I moved it to another place, connected a display device and restarted the machine, everything was fine: ping worked, I could reach the PVE web interface, and the display (showing the PVE screen) had no error messages.
  3. Maybe i915-sriov-dkms assumes a connected display in its default configuration.
  4. Solutions: (1) keep a display device or an HDMI dummy plug connected; (2) ask for guidance on changing this default dependence on a connected display.