Open megheaiulian opened 12 months ago
Hi @megheaiulian, have you tried using the pci device type in LXD? https://documentation.ubuntu.com/lxd/to/latest/reference/devices_pci/
I guess it would be something like: lxc config device add {vm} audio pci address=0000:c3:00.1
Yes that would obviously work.
In https://github.com/canonical/lxd/blob/e90ae16cc7c573d5a6ad37783e888190afda9ffd/lxd/instance/drivers/driver_qemu.go#L4487 there is code that tries to grab other devices from the same iommu group.
For some reason this is not working in this case.
Could it be because the entry is prefixed with consumer, while the code at https://github.com/canonical/lxd/blob/e90ae16cc7c573d5a6ad37783e888190afda9ffd/lxd/instance/drivers/driver_qemu.go#L4495C15-L4495C24 checks for a prefix matching the pciSlotName?
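To illustrate the hypothesis above, here is a minimal Python sketch of a prefix filter over directory entry names. It is not LXD's actual Go code: the function name matching_functions and the sample entries are hypothetical, and it only assumes that each entry name is compared against the slot name as a prefix, as described above.

```python
def matching_functions(entries, pci_slot_name):
    """Return the entries whose name starts with the slot prefix (domain:bus:device)."""
    # Drop the function suffix: "0000:c3:00.0" -> "0000:c3:00"
    slot_prefix = pci_slot_name.rsplit(".", 1)[0]
    return [name for name in entries if name.startswith(slot_prefix)]


# Hypothetical entry names, modelled on the ones reported in this thread.
entries = ["0000:c3:00.0", "consumer:pci:0000:c3:00.1"]
print(matching_functions(entries, "0000:c3:00.0"))
```

Under that assumption, an entry named consumer:pci:0000:c3:00.1 never matches the 0000:c3:00 prefix, so only the GPU function itself would be picked up.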
This is how devices in that iommu group look for me:
Can you please send the output of ls /sys/bus/pci/devices/0000:c3:00.0/iommu_group/devices, since that is the directory whose contents get iterated over?
It shows only the gpu device and not the audio component:
@megheaiulian what is the output of cat /proc/cmdline? Check that you have iommu=pt amd_iommu=on in the output. Also, can you show me the output of uname -r?
Also, the way PCIe devices are set up on the motherboard can affect IOMMU grouping. The IOMMU groups are essentially how the system's hardware is compartmentalized for DMA (Direct Memory Access) protection. The layout and distribution can vary based on the motherboard's firmware or the physical configuration of the PCIe slots. If possible, try changing the slot in which the GPU is installed, then run ls /sys/bus/pci/devices/<pci_base_addr>/iommu_group/devices and see if you have two PCI addresses.
Also, what are your motherboard firmware and GPU driver versions? Sometimes, firmware updates can resolve hardware compatibility issues or improve IOMMU groupings. Ideally, check whether these are the latest versions and, if not, update them and run ls /sys/bus/pci/devices/<pci_base_addr>/iommu_group/devices again.
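The kernel command line check can be sketched in Python as follows. The helper name missing_iommu_params is hypothetical, and the default list of required parameters is simply the pair mentioned in this thread (iommu=pt and amd_iommu=on); other setups may need different ones.

```python
def missing_iommu_params(cmdline, required=("iommu=pt", "amd_iommu=on")):
    """Return the required kernel parameters that are absent from cmdline."""
    # The kernel command line is a whitespace-separated list of parameters.
    params = cmdline.split()
    return [param for param in required if param not in params]


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the host, missing_iommu_params(read_cmdline()) returns an empty list when both parameters are present.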
As an example, with GPU, I have:
$ lspci
...
42:00.0 VGA compatible controller: NVIDIA Corporation GA106 [GeForce RTX 3060 Lite Hash Rate] (rev a1)
42:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
So just like you, I have an audio component in it. For the rest of the information:
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.2.0-39-generic root=UUID=c37564b1-a9d6-464c-b413-0d895acf7c9f ro quiet splash iommu=pt amd_iommu=on pci=assign-busses kvm_amd.npt=1 kvm_amd.avic=1 kvm.ignore_msrs=1 vt.handoff=7
$ uname -r
6.2.0-39-generic
Unfortunately, I don't have an AMD card to test the driver version with. Lastly, on my side, I have:
$ ls /sys/bus/pci/devices/0000:42:00.0/iommu_group/devices
0000:42:00.0 0000:42:00.1
With these grouped under iommu_group/devices, I have no problem passing through multiple components.
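As a rough diagnostic for the comparison above, the following Python sketch lists the members of a device's IOMMU group and reports which functions of the same PCI slot are not in it. The helper names iommu_group_devices and missing_siblings are hypothetical; the sysfs path is the one used in this thread.

```python
import os


def iommu_group_devices(pci_addr, sysfs_root="/sys"):
    """List the entries of iommu_group/devices for the given PCI address."""
    group_dir = os.path.join(
        sysfs_root, "bus", "pci", "devices", pci_addr, "iommu_group", "devices"
    )
    return sorted(os.listdir(group_dir))


def missing_siblings(all_devices, group_devices, pci_addr):
    """Return functions of pci_addr's slot that are absent from its IOMMU group."""
    # Same slot means same domain:bus:device, e.g. "0000:c3:00."
    slot_prefix = pci_addr.rsplit(".", 1)[0] + "."
    siblings = [dev for dev in all_devices if dev.startswith(slot_prefix)]
    return [dev for dev in siblings if dev not in group_devices]
```

For the NVIDIA example above, both 0000:42:00.0 and 0000:42:00.1 are in the group, so missing_siblings returns an empty list. With the layout reported for the AMD card, it would return 0000:c3:00.1.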
@gabrielmougard I have a correct IOMMU setup. I am able to pass the card through correctly to Windows or Linux VMs directly with QEMU.
The issue is only that LXD is not able to pick up the audio component of that card and add it as a QEMU device, because it is not listed by ls /sys/bus/pci/devices/0000:c3:00.0/iommu_group/devices.
Instead it seems to show up as /sys/bus/pci/devices/0000:c3:00.0/iommu_group/consumer:pci:0000:c3:00.1.
Could be something in the kernel specific to amd cards ...
Possibly. This is hard to know. @mihalicyn did you experience such an issue with an AMD card?
When using a gpu device of type physical for a VM, LXD does not pass through all the component devices of the GPU.
For example, using this device configuration
produces this qemu config:
The GPU used here (an RX66000) has an audio component at 0000:c3:00.1 that is not passed through. Without it the AMD drivers will not initialize correctly.
Adding raw.qemu: -device vfio-pci,host=c3:00.1,bus=qemu_pcie5 makes this work, but it is not very intuitive.