bryansteiner / gpu-passthrough-tutorial

GNU General Public License v3.0

virt-manager cannot find PCIe devices from name #17

Closed ayan-iiitd closed 3 years ago

ayan-iiitd commented 3 years ago

My file structure -

.
├── kvm.conf
├── qemu
└── qemu.d
    └── win10
        ├── prepare
        │   └── begin
        │       ├── alloc_hugepages.sh
        │       ├── bind_vfio.sh
        │       └── cpu_mode_performance.sh
        └── release
            └── end
                ├── cpu_mode_ondemand.sh
                ├── dealloc_hugepages.sh
                └── unbind_vfio.sh

All files can be executed -

root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x kvm.conf
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/prepare/begin/*
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/prepare/begin/alloc_hugepages.sh 
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/prepare/begin/bind_vfio.sh
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/prepare/begin/cpu_mode_performance.sh
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/release/end/*
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/release/end/cpu_mode_ondemand.sh
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/release/end/dealloc_hugepages.sh
root@test_multi_gpu_system:/etc/libvirt/hooks# chmod +x /etc/libvirt/hooks/qemu.d/win10/release/end/unbind_vfio.sh
root@test_multi_gpu_system:/etc/libvirt/hooks# 

My IOMMU groups -

IOMMU Group 0 00:00.0 Host bridge [0600]: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers [8086:3ec2] (rev 07)
IOMMU Group 10 00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-V [8086:15b8]
IOMMU Group 11 04:00.0 USB controller [0c03]: ASMedia Technology Inc. ASM2142 USB 3.1 Host Controller [1b21:2142]
IOMMU Group 1 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
IOMMU Group 1 00:01.1 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) [8086:1905] (rev 07)
IOMMU Group 1 01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
IOMMU Group 1 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] [10de:1b06] (rev a1)
IOMMU Group 2 00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics 630 (Desktop) [8086:3e92]
IOMMU Group 3 00:08.0 System peripheral [0880]: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model [8086:1911]
IOMMU Group 4 00:14.0 USB controller [0c03]: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller [8086:a2af]
IOMMU Group 4 00:14.2 Signal processing controller [1180]: Intel Corporation 200 Series PCH Thermal Subsystem [8086:a2b1]
IOMMU Group 5 00:16.0 Communication controller [0780]: Intel Corporation 200 Series PCH CSME HECI #1 [8086:a2ba]
IOMMU Group 6 00:17.0 SATA controller [0106]: Intel Corporation 200 Series PCH SATA controller [AHCI mode] [8086:a282]
IOMMU Group 7 00:1c.0 PCI bridge [0604]: Intel Corporation 200 Series PCH PCI Express Root Port #1 [8086:a290] (rev f0)
IOMMU Group 8 00:1c.4 PCI bridge [0604]: Intel Corporation 200 Series PCH PCI Express Root Port #5 [8086:a294] (rev f0)
IOMMU Group 9 00:1f.0 ISA bridge [0601]: Intel Corporation Z370 Chipset LPC/eSPI Controller [8086:a2c9]
IOMMU Group 9 00:1f.2 Memory controller [0580]: Intel Corporation 200 Series/Z370 Chipset Family Power Management Controller [8086:a2a1]
IOMMU Group 9 00:1f.3 Audio device [0403]: Intel Corporation 200 Series PCH HD Audio [8086:a2f0]
IOMMU Group 9 00:1f.4 SMBus [0c05]: Intel Corporation 200 Series/Z370 Chipset Family SMBus Controller [8086:a2a3]

I am trying to pass both 1080 Tis to the VM, so my kvm.conf, bind_vfio.sh and unbind_vfio.sh are a little different. I also do not know whether the VM will automatically detect the added keyboard and mouse -

(base) root@test_multi_gpu_system:/etc/libvirt/hooks# cat kvm.conf
VIRSH_GPU_VIDEO_A=pci_0000_01_00_0
VIRSH_GPU_VIDEO_B=pci_0000_02_00_0
VIRSH_GPU_AUDIO_A=pci_0000_01_00_1 
VIRSH_GPU_AUDIO_B=pci_0000_02_00_2
VIRSH_DEFAULT_AUDIO=pci_0000_00_1f_3
VIRSH_GPU_USB_A=pci_0000_00_14_0
VIRSH_GPU_USB_B=pci_0000_04_00_0
VIRSH_GPU_SERIAL_A=pci_0000_01_00_3
VIRSH_GPU_SERIAL_B=pci_0000_02_00_3
(base) root@test_multi_gpu_system:/etc/libvirt/hooks# cat qemu.d/win10/prepare/begin/bind_vfio.sh
#!/bin/bash

## Load the config file
source "/etc/libvirt/hooks/kvm.conf"

## Load vfio
modprobe vfio
modprobe vfio_iommu_type1
modprobe vfio_pci

## Unbind gpu from nvidia and bind to vfio
virsh nodedev-detach VIRSH_GPU_VIDEO_A
virsh nodedev-detach VIRSH_GPU_VIDEO_B
virsh nodedev-detach VIRSH_GPU_AUDIO_A
virsh nodedev-detach VIRSH_GPU_AUDIO_B
virsh nodedev-detach VIRSH_DEFAULT_AUDIO
virsh nodedev-detach VIRSH_GPU_USB
virsh nodedev-detach VIRSH_GPU_SERIAL_A
virsh nodedev-detach VIRSH_GPU_SERIAL_B
(base) root@test_multi_gpu_system:/etc/libvirt/hooks# cat qemu.d/win10/release/end/unbind_vfio.sh
#!/bin/bash

## Load the config file
source "/etc/libvirt/hooks/kvm.conf"

## Unbind gpu from vfio and bind to nvidia
virsh nodedev-reattach VIRSH_GPU_VIDEO_A
virsh nodedev-reattach VIRSH_GPU_VIDEO_B
virsh nodedev-reattach VIRSH_GPU_AUDIO_A
virsh nodedev-reattach VIRSH_GPU_AUDIO_B
virsh nodedev-reattach VIRSH_DEFAULT_AUDIO
virsh nodedev-reattach VIRSH_GPU_USB_A
virsh nodedev-reattach VIRSH_GPU_USB_B
virsh nodedev-reattach VIRSH_GPU_SERIAL_A
virsh nodedev-reattach VIRSH_GPU_SERIAL_B

## Unload vfio
modprobe -r vfio_pci
modprobe -r vfio_iommu_type1
modprobe -r vfio
(base) root@test_multi_gpu_system:/etc/libvirt/hooks# 

After beginning the installation, virt-manager says that the names do not exist -

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 75, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/createvm.py", line 2089, in _do_async_install
    guest.installer_instance.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 542, in start_install
    domain = self._create_guest(
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 491, in _create_guest
    domain = self.conn.createXML(install_xml or final_xml, 0)
  File "/usr/lib/python3/dist-packages/libvirt.py", line 4034, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirt.libvirtError: Hook script execution failed: internal error: Child process (LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin /etc/libvirt/hooks/qemu win10 prepare begin -) unexpected exit status 1: error: Could not find matching device 'VIRSH_GPU_VIDEO_A'
error: Node device not found: no node device with matching name 'VIRSH_GPU_VIDEO_A'
error: Could not find matching device 'VIRSH_GPU_VIDEO_B'
error: Node device not found: no node device with matching name 'VIRSH_GPU_VIDEO_B'
error: Could not find matching device 'VIRSH_GPU_AUDIO_A'
error: Node device not found: no node device with matching name 'VIRSH_GPU_AUDIO_A'
error: Could not find matching device 'VIRSH_GPU_AUDIO_B'
error: Node device not found: no node device with matching name 'VIRSH_GPU_AUDIO_B'
error: Could not find matching device 'VIRSH_DEFAULT_AUDIO'
error: Node device not found: no node device with matching name 'VIRSH_DEFAULT_AUDIO'
error: Could not find matching device 'VIRSH_GPU_USB'
error: Node device not found: no node device with matching name ''

Any help is appreciated.

bryansteiner commented 3 years ago

You need to use $ whenever you reference a variable in bash.

Declaration:

VIRSH_GPU_VIDEO_A=pci_0000_01_00_0
VIRSH_GPU_VIDEO_B=pci_0000_02_00_0
...

Reference:

virsh nodedev-reattach $VIRSH_GPU_VIDEO_A
virsh nodedev-reattach $VIRSH_GPU_VIDEO_B
...
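
To see the difference without touching any devices, here is a minimal dry-run sketch. It assumes the values from the kvm.conf posted above; `detach` is a hypothetical helper that only prints the command it would run and is not part of the tutorial's scripts. It also checks `VIRSH_GPU_USB`, which bind_vfio.sh references but the posted kvm.conf does not define (only `VIRSH_GPU_USB_A` and `VIRSH_GPU_USB_B` are declared there).

```shell
#!/bin/bash

# Values copied from the kvm.conf shown in the question.
VIRSH_GPU_VIDEO_A=pci_0000_01_00_0
VIRSH_GPU_USB_A=pci_0000_00_14_0

# Hypothetical dry-run helper: prints the command instead of calling virsh.
detach() {
    echo "virsh nodedev-detach $1"
}

# Without $, bash passes the literal word, so virsh would look for a
# device that is actually named "VIRSH_GPU_VIDEO_A".
literal=$(detach VIRSH_GPU_VIDEO_A)

# With $, bash substitutes the value assigned in kvm.conf.
expanded=$(detach "$VIRSH_GPU_VIDEO_A")

echo "$literal"    # virsh nodedev-detach VIRSH_GPU_VIDEO_A
echo "$expanded"   # virsh nodedev-detach pci_0000_01_00_0

# VIRSH_GPU_USB (no _A/_B suffix) is never assigned, so it expands to
# an empty string once the $ is added.
if [ -z "${VIRSH_GPU_USB:-}" ]; then
    usb_status="VIRSH_GPU_USB is not set"
else
    usb_status="VIRSH_GPU_USB=$VIRSH_GPU_USB"
fi
echo "$usb_status"
```

Quoting each expansion (`"$VIRSH_GPU_VIDEO_A"`) is optional for these values, but it guards against stray whitespace in kvm.conf.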