bryansteiner / gpu-passthrough-tutorial

GNU General Public License v3.0

i can't get this working on ubuntu 20.04 stock kernel #8

Closed c00kie55 closed 4 years ago

c00kie55 commented 4 years ago

I am trying to pass through my Quadro K5200, but creating the VM hangs. Also, if I try this:

$ sudo virsh nodedev-detach pci_0000_81_00_0

nothing happens and the system becomes unstable.

$ sudo lspci | grep -i NVIDIA
03:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
03:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
81:00.0 VGA compatible controller: NVIDIA Corporation GK110GL [Quadro K5200] (rev a1)
81:00.1 Audio device: NVIDIA Corporation GK110 High Definition Audio Controller (rev a1)

$ dmesg | grep IOMMU
[ 0.166485] DMAR: IOMMU enabled
[ 0.322075] DMAR-IR: IOAPIC id 3 under DRHD base 0xfbffe000 IOMMU 0
[ 0.322077] DMAR-IR: IOAPIC id 0 under DRHD base 0xc3ffc000 IOMMU 1
[ 0.322079] DMAR-IR: IOAPIC id 2 under DRHD base 0xc3ffc000 IOMMU 1

uname -r = 5.4.0-37-generic

grub settings = GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on"

$ ./iommu.sh
..
IOMMU Group 31 80:05.4 PIC [0800]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 IOAPIC [8086:0e2c] (rev 04)
IOMMU Group 32 81:00.0 VGA compatible controller [0300]: NVIDIA Corporation GK110GL [Quadro K5200] [10de:103c] (rev a1)
IOMMU Group 32 81:00.1 Audio device [0403]: NVIDIA Corporation GK110 High Definition Audio Controller [10de:0e1a] (rev a1)
IOMMU Group 33 ff:08.0 System peripheral [0880]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 QPI Link 0 [8086:0e80] (rev 04)
IOMMU Group 34 ff:09.0 System peripheral [0880]: Intel Corporation Xeon E7 v2/Xeon E5 v2/Core i7 QPI Link 1 [8086:0e90] (rev 04)
..
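The iommu.sh script itself is not shown in this thread. As a rough sketch of that kind of check, assuming the standard sysfs layout under /sys/kernel/iommu_groups, the function below prints one line per device with its group number. The function name and the optional root argument are made up for illustration; the argument exists only so the function can be pointed at a scratch directory.

```shell
# Sketch: print "IOMMU Group <n> <pci-address>" for every device.
# $1 (optional) overrides the sysfs root; this is an assumption added
# for dry runs, the real data lives in /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # nothing matched: no groups exposed
        group="${dev%/devices/*}"      # strip "/devices/<address>"
        group="${group##*/}"           # keep only the group number
        echo "IOMMU Group $group ${dev##*/}"
    done
}
```

In the output quoted above, Group 32 lists only the two Quadro functions (81:00.0 and 81:00.1), which is the kind of grouping a passthrough setup relies on.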

$ tree /etc/libvirt/hooks/
/etc/libvirt/hooks/
├── kvm.conf
├── qemu
└── qemu.d
    └── WIN10HD
        ├── prepare
        │   └── begin
        │       └── bind_vfio.sh
        └── release
            └── end
                └── unbind_vfio.sh

Maybe this doesn't work because vfio is now in the kernel?

$ cat /etc/libvirt/hooks/qemu.d/WIN10HD/prepare/begin/bind_vfio.sh

#!/bin/bash

# Load the config file
source "/etc/libvirt/hooks/kvm.conf"

# Load vfio
modprobe vfio
modprobe vfio_iommu_type1
modprobe vfio_pci

# Unbind gpu from nvidia and bind to vfio
virsh nodedev-detach $VIRSH_GPU_VIDEO
virsh nodedev-detach $VIRSH_GPU_AUDIO

# Unbind ssd from nvme and bind to vfio
virsh nodedev-detach $VIRSH_NVME_SSD
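The script reads $VIRSH_GPU_VIDEO and the other variables from kvm.conf, which is not shown in this thread. Judging by the virsh command in the first post, the values are libvirt node device names such as pci_0000_81_00_0. A small sketch of turning an lspci-style address into that form (the function name is hypothetical):

```shell
# Sketch: convert a full PCI address (domain:bus:slot.function) into
# the name format that `virsh nodedev-detach` was given in the first post.
pci_to_nodedev() {
    # Replace ':' and '.' with '_' and add the "pci_" prefix.
    echo "pci_$(echo "$1" | tr ':.' '__')"
}
```

For the Quadro in this thread, pci_to_nodedev 0000:81:00.0 and pci_to_nodedev 0000:81:00.1 would give the two names to put in kvm.conf, assuming the file uses the same variable names as the script above.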

c00kie55 commented 4 years ago

This script seems to do the job. Now I just need to adapt it to your guide so that my Quadro is also available to the host when not in use.

#!/bin/sh

DEVS="0000:81:00.0 0000:81:00.1"

if [ ! -z "$(ls -A /sys/class/iommu)" ]; then
    for DEV in $DEVS; do
        echo "Removing pci slot: $DEV"
        echo $DEV > /sys/bus/pci/devices/$DEV/driver/unbind
        echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
    done
fi
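To hand the card back to the host later, the same sysfs files can be used in reverse. The following is only a sketch under assumptions: the function names and the optional sysfs-root argument are invented for illustration, and whether the nvidia driver rebinds cleanly on this machine is not confirmed anywhere in the thread. It relies on the kernel's driver_override and drivers_probe files: writing an empty line clears the override, and writing the address to drivers_probe asks the kernel to pick a driver again.

```shell
# Sketch: bind a PCI device to vfio-pci and release it again via sysfs.
# $1 = full PCI address, $2 = sysfs root (optional, for dry runs only).
bind_vfio() {
    dev="$1"; sysfs="${2:-/sys}"
    devdir="$sysfs/bus/pci/devices/$dev"
    # Unbind only if a driver is currently attached.
    if [ -e "$devdir/driver/unbind" ]; then
        echo "$dev" > "$devdir/driver/unbind"
    fi
    echo "vfio-pci" > "$devdir/driver_override"
    # Re-probe so the override takes effect.
    echo "$dev" > "$sysfs/bus/pci/drivers_probe"
}

release_vfio() {
    dev="$1"; sysfs="${2:-/sys}"
    devdir="$sysfs/bus/pci/devices/$dev"
    if [ -e "$devdir/driver/unbind" ]; then
        echo "$dev" > "$devdir/driver/unbind"
    fi
    # Writing an empty line clears the override.
    echo > "$devdir/driver_override"
    # Re-probe so the host's normal driver can claim the device.
    echo "$dev" > "$sysfs/bus/pci/drivers_probe"
}
```

In a hooks layout like the one in this thread, bind_vfio would be called from the prepare/begin script and release_vfio from the release/end script, once for each address in DEVS.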

bryansteiner commented 4 years ago

This is what your hooks directory structure looks like according to your post:

tree /etc/libvirt/hooks/ /etc/libvirt/hooks/ ├── kvm.conf ├── qemu └── qemu.d └── WIN10HD ├── prepare │ └── begin │ └── bind_vfio.sh └── release └── end └── unbind_vfio.sh

Now take a look at mine:

$ tree /etc/libvirt/hooks/
/etc/libvirt/hooks/
├── kvm.conf
├── qemu
└── qemu.d
    └── win10
        ├── prepare
        │   └── begin
        │       └── bind_vfio.sh
        │       └── ...
        └── release
            └── end
                └── unbind_vfio.sh
                └── ...

Notice how my hook scripts are located inside the VM lifecycle directory (not alongside it). Also, your end directory should be located inside release.
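For reference, libvirt calls /etc/libvirt/hooks/qemu with the guest name, the operation (for example prepare or release) and the sub-operation (begin or end) as its first three arguments. The dispatcher script from the guide is not shown in this thread, so the following is only a sketch of the path it presumably builds from those arguments; the function name is hypothetical.

```shell
# Sketch: directory a dispatcher would search for hook scripts.
# $1 = guest name, $2 = operation, $3 = sub-operation,
# $4 = hooks root (optional, assumed default /etc/libvirt/hooks).
hook_dir() {
    echo "${4:-/etc/libvirt/hooks}/qemu.d/$1/$2/$3"
}
```

With this layout, hook_dir WIN10HD prepare begin resolves to /etc/libvirt/hooks/qemu.d/WIN10HD/prepare/begin, so the directory under qemu.d has to match the VM name exactly.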

c00kie55 commented 4 years ago

They look the same, don't they? I think the tree view is confusing one of us.

$ ls /etc/libvirt/hooks/qemu.d/WIN10HD/prepare/begin
bind_vfio.sh

Also, this is not working:

modprobe vfio
modprobe vfio_iommu_type1
modprobe vfio_pci
virsh nodedev-detach pci_0000_81_00_0

bryansteiner commented 4 years ago

To me they don't look the same. Perhaps you should write out what you think the parent and child directories are to see if we agree or not.

Also make sure that all of your hook scripts are executable: chmod +x script-name
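One way to check this for the whole hooks directory at once is to list every .sh file that lacks the owner execute bit. This is a sketch; the function name is hypothetical and the default path is the one used in this thread.

```shell
# Sketch: print hook scripts that are missing the owner execute bit.
# $1 = hooks root (optional).
non_executable_hooks() {
    find "${1:-/etc/libvirt/hooks}" -type f -name '*.sh' ! -perm -u+x
}
```

If the function prints nothing, every script it found is already executable.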

c00kie55 commented 4 years ago

thanks i already made the scripts executable

i think the directory structure should be clear with this ls command:

$ ls /etc/libvirt/hooks/qemu.d/WIN10HD/prepare/begin
bind_vfio.sh

$ ls /etc/libvirt/hooks/qemu.d/WIN10HD/release/end/
unbind_vfio.sh

[Image: tree]

c00kie55 commented 4 years ago

I am certainly not an expert... but I think this might explain something: https://forum.level1techs.com/t/ubuntu-20-04-missing-kernel-modules-for-vfio-pci-and-vfio-iommu-type1/156327

I am guessing that I should add some GRUB parameters and drop the modprobe vfio, modprobe vfio_iommu_type1 and modprobe vfio_pci lines from the script?

bryansteiner commented 4 years ago

Ok now that we're on the same page...

> I am guessing that I should add some GRUB parameters and drop the modprobe vfio, modprobe vfio_iommu_type1 and modprobe vfio_pci lines from the script?

Dropping the modprobe statements from the script wouldn't really make a difference even if you used grub parameters to load these modules during boot-time... so this most likely isn't the source of your issues. Just make sure that the output of lsmod | grep 'vfio' has these modules.
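One caveat: lsmod only lists loadable modules, so if the kernel has VFIO compiled in (which the Level1Techs thread linked above suggests may be the case on Ubuntu 20.04), lsmod | grep 'vfio' can print nothing even though the drivers are available. The sketch below distinguishes the cases. It assumes the usual modules.builtin file under /lib/modules and the /sys/module directory; the function name and the two optional path arguments are hypothetical and exist only for dry runs.

```shell
# Sketch: report whether a module is built in, loaded, or missing.
# $1 = module name with underscores (e.g. vfio_pci)
# $2 = /sys/module override, $3 = modules.builtin override (both optional).
vfio_module_state() {
    mod="$1"
    sysmod="${2:-/sys/module}"
    builtin="${3:-/lib/modules/$(uname -r)/modules.builtin}"
    # Module file names may use '-' where the module name uses '_'.
    file="$(echo "$mod" | tr '_' '-')"
    if [ -f "$builtin" ] && grep -Eq "/(${mod}|${file})\.ko" "$builtin"; then
        echo builtin
    elif [ -d "$sysmod/$mod" ]; then
        echo loaded
    else
        echo missing
    fi
}
```

Running it for vfio, vfio_iommu_type1 and vfio_pci shows at a glance whether the modprobe lines in the hook script have anything to load.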

If I were to guess, I'd assume your script is hanging because you're trying to detach your Quadro gpu from the nvidia drivers while it's still in-use. You can confirm this by running the command nvidia-smi and including the output here.
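Alongside nvidia-smi, sysfs shows which kernel driver currently owns the card. A minimal sketch, assuming the standard /sys/bus/pci layout (the function name and the optional sysfs-root argument are hypothetical):

```shell
# Sketch: print the driver bound to a PCI device, or "none".
# $1 = full PCI address, $2 = sysfs root (optional, for dry runs only).
current_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

If current_driver 0000:81:00.0 reports nvidia, something like sudo fuser -v /dev/nvidia* can then list the processes that still hold the device nodes open before a detach is attempted.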

AlexTo commented 4 years ago

Hi, I can't get this working on 20.04 either. Is there an updated guide for 20.04? :( I can confirm that I followed the instructions exactly, but I think it has something to do with breaking changes in 20.04 regarding VFIO. Thanks