QaidVoid / Complete-Single-GPU-Passthrough

Single GPU VFIO Passthrough Guide

black screen when launching the vm #3

Closed SneekeeStache closed 3 years ago

SneekeeStache commented 3 years ago

When I launch the Windows 10 VM, I get a black screen. Specs: Intel Core i7-4770, 20 GB DDR3 RAM, RX 570 8 GB, Z97 PC Mate.

QaidVoid commented 3 years ago

Please post the log file, which you can find at /var/log/libvirt/qemu/{vm_name}.log

SneekeeStache commented 3 years ago

win10.log

QaidVoid commented 3 years ago

You forgot to change the CPU model to host-passthrough. Untick "Copy host CPU configuration" in the CPUs section, and enter host-passthrough as the model.
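
One way to confirm the change took effect is to look at the cpu element in the domain XML. This is a minimal sketch, assuming the VM is named win10 as in the log file name above; the helper name has_host_passthrough is made up for illustration.

```shell
# Hypothetical helper: succeeds if the domain XML read from stdin
# declares a <cpu> element with mode='host-passthrough'.
has_host_passthrough() {
    grep -Eq "<cpu[^>]*mode=['\"]host-passthrough['\"]"
}

# Typical use (root is needed to reach the system libvirt instance):
#   sudo virsh dumpxml win10 | has_host_passthrough && echo "host-passthrough is set"
```

If the check fails, the CPU model was probably not saved, or a different domain was edited.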

SneekeeStache commented 3 years ago

I did that, but I still get a black screen with no signal. win10.log

QaidVoid commented 3 years ago

Please post your configuration file. You can get it with sudo virsh dumpxml win10 > win10.xml

SneekeeStache commented 3 years ago

win10.zip

QaidVoid commented 3 years ago

Copying the host CPU configuration or using any other CPU model seemed to work. You might not have enabled VT-d. I forgot to mention that in the guide, my bad. Run

ls /sys/kernel/iommu_groups

If you don't see any output, then you need to enable VT-d in your BIOS settings.
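
The same check can be wrapped in a small function for use in scripts. This is only a sketch: the function name iommu_active is hypothetical, and the optional argument exists just so the check can be pointed at a different directory.

```shell
# Hypothetical helper: succeeds if the IOMMU groups directory has entries.
# $1: directory to inspect (defaults to the real sysfs location).
iommu_active() {
    # ls -A prints nothing for an empty or missing directory.
    [ -n "$(ls -A "${1:-/sys/kernel/iommu_groups}" 2>/dev/null)" ]
}

# Typical use:
#   iommu_active && echo "IOMMU groups present" || echo "no IOMMU groups found"
```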

SneekeeStache commented 3 years ago

photo_2020-12-27_12-57-43 2020-12-27_13-00

Everything was already enabled.

QaidVoid commented 3 years ago

I'm not really sure what might be the issue. You can try a few more options. Check PCI devices mapped in IOMMU groups: https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Ensuring_that_the_groups_are_valid

If the group containing the GPU devices contains anything else (ignoring the PCI bridge), you need the ACS override patch: https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Bypassing_the_IOMMU_groups_(ACS_override_patch)
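
For reference, the group check can be done with a short loop over sysfs, in the spirit of the script on the wiki page linked above. This is a rough sketch rather than that exact script; the function name list_iommu_groups and its optional root argument are assumptions made for illustration.

```shell
# Hypothetical helper: print each IOMMU group and the PCI devices in it.
# $1: sysfs root to scan (defaults to the real location).
list_iommu_groups() (
    root="${1:-/sys/kernel/iommu_groups}"
    for group in "$root"/*; do
        [ -d "$group/devices" ] || continue
        echo "IOMMU Group ${group##*/}:"
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue
            addr="${dev##*/}"
            desc=""
            # Use lspci for a readable description when it is installed.
            if command -v lspci >/dev/null 2>&1; then
                desc="$(lspci -nns "$addr" 2>/dev/null)"
            fi
            # Fall back to the bare address if lspci gave nothing.
            printf '\t%s\n' "${desc:-$addr}"
        done
    done
)

# Typical use:
#   list_iommu_groups
```

Groups are printed in glob order (0, 1, 10, 11, ...), not numeric order. The group holding the GPU should list only the GPU's own functions plus, at most, a PCI bridge.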

If the IOMMU groups look good, try updating qemu and libvirt to the latest versions. If that doesn't work, post the lspci -nnk output, along with your libvirt start and stop scripts.

SneekeeStache commented 3 years ago

lspci -nnk.txt start.txt stop.txt

qemu    | 5.2.0-2
libvirt | 1:6.5.0-3

IOMMU Group 1:
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor PCI Express x16 Controller [8086:0c01] (rev 06)
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] [1002:67df] (rev ef)
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590] [1002:aaf0]

QaidVoid commented 3 years ago

Everything looks fine. I'm not really sure what might be the issue.

JulianBonk commented 3 years ago

I am having similar issues.

Specs: Ryzen 5 3600, 16 GB RAM, RX 5700 XT (PowerColor Red Devil), MSI B450-A PRO MAX

When I try to launch my VM, it just goes to a black screen, and this line is appended to the log file: 2021-01-20 14:37:09.207+0000: shutting down, reason=failed
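
To pull that reason out of a long log quickly, a small filter along these lines may help. It is a sketch only: the function name last_shutdown_reason is hypothetical, and it assumes log lines of the form shown above.

```shell
# Hypothetical helper: print the reason from the last
# "shutting down, reason=..." line of a libvirt QEMU log read on stdin.
last_shutdown_reason() {
    sed -n 's/.*shutting down, reason=\([a-z]*\).*/\1/p' | tail -n 1
}

# Typical use, with the log path mentioned earlier in this thread:
#   sudo cat /var/log/libvirt/qemu/win10.log | last_shutdown_reason
```

The lines just before the shutdown entry are often where the actual QEMU error appears, so they are worth reading as well.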

win10log.txt win10xml.txt startsh.txt stopsh.txt IOMMU.txt

[julian@julianPC ~]$ ls /sys/kernel/iommu_groups
0 1 10 11 12 13 14 15 16 17 2 3 4 5 6 7 8 9
[julian@julianPC ~]$ qemu-system-x86_64 --version
QEMU emulator version 5.2.0
Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
[julian@julianPC ~]$ libvirtd --version
libvirtd (libvirt) 6.5.0

Oh, and by the way, in your wiki you wrote:

Remove Channel Spice, Display Spice, Video XQL ...

when it should be

Remove Channel Spice, Display Spice, Video QXL

QaidVoid commented 3 years ago

In the start script, try unloading all the modules used by amdgpu (except maybe the amd_iommu_v2 module).

lsmod
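
In lsmod output, the modules amdgpu depends on are the ones that list amdgpu in their "Used by" column. A minimal sketch for picking them out follows; the function name modules_used_by is made up for illustration.

```shell
# Hypothetical helper: print every module whose "Used by" column
# names the given module. Reads lsmod-formatted text on stdin.
# $1: module name, e.g. amdgpu
modules_used_by() {
    awk -v m="$1" 'NR > 1 && $4 != "" {
        n = split($4, users, ",")
        for (i = 1; i <= n; i++)
            if (users[i] == m) { print $1; break }
    }'
}

# Typical use:
#   lsmod | modules_used_by amdgpu
```

A module that is still in use by something else will refuse to unload, so the order of the modprobe -r calls matters.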

If that doesn't work, try running the start script manually from an SSH client, and if it doesn't produce any errors, try booting the VM from there. Also try unloading all the modules used by amdgpu there. The commands only work as superuser, so:

doas sh /etc/libvirt/hooks/qemu.d/win10/prepare/begin/start.sh
doas virsh start win10

JulianBonk commented 3 years ago

I added the following lines to start.sh:

modprobe -r gpu_sched
modprobe -r i2c_algo_bit
modprobe -r ttm
modprobe -r drm_kms_helper
modprobe -r drm

lsmod.txt

If I SSH into my system while it is showing the black screen and run virsh list, no ID is listed. If I run the command as superuser (sudo virsh list), the SSH terminal freezes for some reason.

for your second suggestion:


[julian@julianPC ~]$ sudo virsh net-start default
Network default started

[julian@julianPC ~]$ sudo virsh start win10
error: Failed to start domain win10
error: Device 0000:01:00.1 not found: could not access /sys/bus/pci/devices/0000:01:00.1/config: No such file or directory

I know that "usually" the GPU should have the address 0000:01:00.1, but in my case it is 0000:2b:00.1. Is there a way to change this? My PCIe devices are the GPU itself and two 4-port PCIe USB cards (that I actually don't need anymore). I will see if the address changes with the USB cards removed.

UPDATE: With the USB cards removed, the address changed to 0000:28:00. After making the necessary changes in the start and stop scripts, the terminal still freezes on the sudo virsh start win10 command.
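
Since the PCI address depends on the board and on which slots are populated, it is safer to look it up than to assume 0000:01:00.x. The sketch below is one way to do that. Both function names are hypothetical. The pci_DDDD_BB_SS_F form is the node device naming that libvirt uses (as shown by virsh nodedev-list); whether the start and stop scripts expect that form is an assumption here.

```shell
# Hypothetical helper: print the full PCI address of each display
# controller (class 03xx). Reads `lspci -D -nn` output on stdin.
display_controller_addrs() {
    awk '$0 ~ /\[03[0-9a-f][0-9a-f]\]:/ { print $1 }'
}

# Hypothetical helper: convert a PCI address such as 0000:2b:00.1
# into libvirt's node device name, pci_0000_2b_00_1.
pci_to_nodedev() {
    printf 'pci_%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}

# Typical use:
#   lspci -D -nn | display_controller_addrs
#   pci_to_nodedev 0000:2b:00.1
```

The GPU's HDMI audio function normally sits at the same bus and slot with function number 1, so both addresses need to match wherever they are referenced.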

QaidVoid commented 3 years ago

after making the necessary changes in the start and stop script the terminal still freezes upon the sudo virsh start win10 command.

Did you try running the start script manually? I'm guessing something in the start script is failing or taking a long time.

QaidVoid commented 3 years ago

Closing due to inactivity.