Closed: Moonlight63 closed this issue 3 years ago
Hey! I'm facing the same issue. I have a R9 270x for the host and a 1070ti for the guest. When I try to:
virsh nodedev-detach $VIRSH_GPU_VIDEO
It just hangs. My nvidia-smi is as follows:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.31 Driver Version: 465.31 CUDA Version: 11.3 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:06:00.0 Off | N/A |
| 43% 47C P8 12W / 180W | 6MiB / 8119MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 611 G /usr/lib/Xorg 4MiB |
+-----------------------------------------------------------------------------+
The only thing I can think of that might be creating an issue is Xorg locking the GPU somehow, but I haven't seen it mentioned anywhere else. I'll try to poke around, and if I get around to solving it I'll post here. If you have any suggestions, they are highly appreciated.
So I've tried some things and tried to understand what's happening in each step. I noticed something. When I boot my PC I have the following:
╰─$ lspci -nnk | grep -A3 -e VGA
03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Curacao XT / Trinidad XT [Radeon R7 370 / R9 270X/370X] [1002:6810]
Subsystem: PC Partner Limited / Sapphire Technology Device [174b:e270]
Kernel driver in use: radeon
Kernel modules: radeon, amdgpu
--
06:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] [10de:1b82] (rev a1)
Subsystem: ZOTAC International (MCO) Ltd. Device [19da:2445]
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
After I ran the command
╰─$ sudo virsh nodedev-detach pci_0000_06_00_0
I opened another terminal and looked at the kernel drivers for the GPUs. The output was:
╰─$ lspci -k | grep -A3 -e VGA
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Curacao XT / Trinidad XT [Radeon R7 370 / R9 270X/370X]
Subsystem: PC Partner Limited / Sapphire Technology Device e270
Kernel driver in use: radeon
Kernel modules: radeon, amdgpu
--
06:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070 Ti] (rev a1)
Subsystem: ZOTAC International (MCO) Ltd. Device 2445
Kernel modules: nouveau, nvidia_drm, nvidia
06:00.1 Audio device: NVIDIA Corporation GP104 High Definition Audio Controller (rev a1)
As you can see, before the command the card was using the nvidia driver; after running it, there is no driver in use at all. I guess that means there is a problem loading the vfio-pci driver, but I'm not sure. I'll try to look into it.
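If it helps, you can also read the bound driver straight out of sysfs instead of grepping lspci. A rough sketch (the 06:00.0 address is from my machine, substitute yours):

```shell
# print the kernel driver currently bound to a PCI device, or "none"
driver_of() {
    link="/sys/bus/pci/devices/$1/driver"
    if [ -e "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}
driver_of 0000:06:00.0   # "nvidia" before the detach, "none" while it's stuck
```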
Are you sure that after the detach command is run, the nvidia driver is actually detached? Can you run that by hand and make sure the driver is really detached? If not, then I had a similar issue.
Ok, so I ended up getting everything to work. A friend of mine on Discord and I sat down to figure it out, and we made some very interesting discoveries. Bear in mind, this is what the solution was in my case, but you will have to do your own testing to see if this works for you.
Problem number one: when gnome boots with the nvidia GPU attached, it likes to grab onto the GPU and never let go, regardless of what you put in your xorg.conf file. The way I figured this out was by using the command lsof, which prints a list of every single open file. So, by using grep I found this little gem:
... lots of stuff up here ...
gnome-she 15558 user mem REG 195,0 762 /dev/nvidia1
... more stuff down here ...
where /dev/nvidia1 in my case was a reference to my second GPU. Since you only have one nvidia card, yours would likely be nvidia0.
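The grep I used can be wrapped up so it's easy to rerun at each step (just a sketch; it filters `lsof` output for anything touching an NVIDIA device node):

```shell
# keep only lsof lines that reference an NVIDIA device node
gpu_holders() {
    grep '/dev/nvidia[0-9]'
}
# usage (root sees other users' open files too):
#   sudo lsof | gpu_holders
```

If gnome-shell shows up in the output, the compositor is holding the GPU regardless of what your xorg.conf says.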
Even after running the detach command, gnome-shell is still using the GPU, i.e. the file /dev/nvidia1 is still 'open', despite my xorg.conf telling it not to. For reference, here is my xorg.conf:
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 460.73.01
Section "ServerFlags"
    Option "AutoAddGPU" "off"
EndSection

Section "ServerLayout"
    Identifier "Layout0"
    Screen 0 "Screen0" 0 0
    InputDevice "Keyboard0" "CoreKeyboard"
    InputDevice "Mouse0" "CorePointer"
    Option "Xinerama" "0"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier "Mouse0"
    Driver "mouse"
    Option "Protocol" "auto"
    Option "Device" "/dev/psaux"
    Option "Emulate3Buttons" "no"
    Option "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier "Keyboard0"
    Driver "kbd"
EndSection

Section "Monitor"
    Identifier "Monitor0"
    VendorName "Unknown"
    ModelName "SAC DP"
    HorizSync 30.0 - 222.0
    VertRefresh 30.0 - 144.0
    Option "DPMS"
EndSection

Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    BoardName "GeForce GTX 1080"
    BusID "PCI:1:0:0"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    Option "Stereo" "0"
    Option "nvidiaXineramaInfoOrder" "DFP-6"
    Option "metamodes" "DP-4: 2560x1440_144 +2560+0, DP-0: 2560x1440_144 +0+0, DP-2: 2560x1440_144 +5120+0"
    Option "SLI" "Off"
    Option "MultiGPU" "On"
    Option "BaseMosaic" "off"
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection
The relevant parts there being the Device section, where I set my 1080 as the only GPU and never make a second device for the 1070, and the "AutoAddGPU" "off" option. But gnome still uses it for some reason.
The way I solved this was by setting up my system like any other GPU passthrough system before it: load the vfio drivers at boot and bind them to the second GPU, then rebind the nvidia driver after boot. Here is a quick refresher, but I am doing this from memory, so please correct me if I miss something:
Step 1: Edit /etc/initramfs-tools/modules and add the lines
vfio
vfio_pci
vfio_iommu_type1
vfio_virqfd
This will load the drivers on boot.
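After rebooting, you can sanity-check that the modules actually loaded (a small sketch that filters `lsmod` output):

```shell
# print the vfio-related modules from `lsmod` output on stdin
vfio_loaded() {
    awk '$1 ~ /^vfio/ {print $1}'
}
# usage after reboot:
#   lsmod | vfio_loaded
```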
Step 2: Create script in /etc/initramfs-tools/scripts/init-top called bind_vfio.sh ( or whatever you want ) with this content:
#!/bin/sh
DEVS="0000:02:00.0 0000:02:00.1" # <--- Change these to your specific device ids
for DEV in $DEVS; do
    echo "vfio-pci" > /sys/bus/pci/devices/$DEV/driver_override
done
modprobe -i vfio-pci # <--- Not actually sure if you need this, but it works
Step 3:
sudo chmod 755 /etc/initramfs-tools/scripts/init-top/bind_vfio.sh
sudo chown root:root /etc/initramfs-tools/scripts/init-top/bind_vfio.sh
sudo update-initramfs -u
I had to reboot my system before running the update command, otherwise it got hung up on something, not really sure why.
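Once it boots, you can confirm the early bind actually took. A sketch (02:00.x are my addresses; the helper just pulls the driver line out of `lspci -k` output):

```shell
# extract the "Kernel driver in use" value from `lspci -k` output on stdin
driver_line() {
    awk -F': ' '/Kernel driver in use/ {print $2}'
}
# usage, one call per function of the card:
#   lspci -ks 0000:02:00.0 | driver_line   # should print vfio-pci
#   lspci -ks 0000:02:00.1 | driver_line
```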
More details on regular vfio setups can be found here: https://forum.level1techs.com/t/vfio-in-2019-pop-os-how-to-general-guide-though-draft/142287
Once that is all done you would basically be ready to do GPU passthrough as normal..... BUT WE WANT TO USE THE GPU ON THE HOST NOW, RIGHT?! Yes, we do, so now what?
If you run the same lsof command as before, you will not see any reference to the second GPU, because it was never attached to gnome, and that's good. So now you can run virsh nodedev-reattach pci_0000_02_00_0
and the nvidia driver should re-bind. To help visualize what's going on, you can open another terminal and run watch -n 0.1 lspci -s 0000:02:00.* -nk
to see which drivers the card is using in real time.
But hold on, running the reattach command isn't enough. If you try to open something like Blender and go to Edit > Preferences > System > CUDA, your GPU won't be listed. Similarly, if you check nvidia-smi
it won't be there either. That is because, on the first reattach, you need to run nvidia-xconfig --query-gpu-info
to force the driver to recognize the 'new' GPU.
At this point you can use the GPU on the host. When you're done, run virsh nodedev-detach pci_0000_02_00_0
to detach the driver; it should then re-bind to vfio and you can use it in the guest.
BUT WAIT! THERE'S MORE!
Unfortunately, our testing uncovered another problem... the audio driver, snd_hda_intel.
If you ever reattach the audio device to the host, you will never be able to unbind it again... at least, I couldn't. This is a problem because, by default, when you close the VM that is using the device, it releases its resources, including the audio device, and the audio device re-binds to snd_hda_intel. At that point you have to actually reboot so that the vfio driver can re-bind to the audio device. I suspect that this is what causes the majority of the detach-script hangs.
I haven't found a very good solution to this other than to just blacklist that driver. Unfortunately for me, that also prevents my primary GPU from outputting audio over HDMI. It isn't a huge deal for me because I use headphones anyway, but it still isn't a very clean solution. A better solution would be to blacklist the driver on just this one device at this specific PCI address, but I haven't figured out how to do that.
So in the end, what I finished with was the following:
Step 1:
Create the file /etc/modprobe.d/blacklist-intel-snd.conf
with:
blacklist snd_hda_intel
install snd_hda_intel /bin/false
This will prevent the intel audio driver from ever loading. Run the update-initramfs command again.
Step 2:
Create a new systemd service in /etc/systemd/system/bind-second-gpu.service
with the following:
[Unit]
Description=Bind my second GPU to Nvidia only after gnome has started
PartOf=graphical-session.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'sleep 10; virsh nodedev-reattach pci_0000_02_00_0; sleep 2; nvidia-xconfig --query-gpu-info'

[Install]
WantedBy=graphical-session.target
After that, run sudo systemctl enable bind-second-gpu.service
This will auto bind the second GPU back to the host on a fresh boot.
I then proceeded with the regular qemu hooks with the bind script being:
#!/bin/bash
## Load the config file
source "/etc/libvirt/hooks/kvm.conf"
## Unbind gpu from nvidia and bind to vfio
virsh nodedev-detach $VIRSH_GPU_VIDEO
virsh nodedev-detach $VIRSH_GPU_AUDIO
And unbind:
#!/bin/bash
## Load the config file
source "/etc/libvirt/hooks/kvm.conf"
## Unbind gpu from vfio and bind to nvidia
virsh nodedev-reattach $VIRSH_GPU_VIDEO
virsh nodedev-reattach $VIRSH_GPU_AUDIO
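For completeness, the kvm.conf that both hook scripts source is just two shell variables. Example contents (these are my addresses; take yours from virsh nodedev-list --cap pci):

```shell
# /etc/libvirt/hooks/kvm.conf  (example values; replace with your own)
VIRSH_GPU_VIDEO=pci_0000_02_00_0
VIRSH_GPU_AUDIO=pci_0000_02_00_1
```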
Reattaching the audio device really does nothing, because there is no driver left for it to attach to.
THIS WILL ONLY LET YOU USE THE SECOND GPU FOR CUDA ON THE HOST! THERE IS NO VIDEO OUTPUT WHEN REATTACHED! It would be cool if we could get that working, but I don't think you could without starting a second X session that could be killed when you want to switch. Also, if the second GPU is being used and you try to start a VM, the VM won't start until the GPU is freed. For example, if I am running a render in Blender using that GPU, I have to close Blender before the VM will start.
OK, I think that is everything. Please remember to replace my PCI ID pci_0000_02_00_0
with whatever you are using. Thank you @NineBallAYAYA for staying up til 5am 3 nights in a row to help me diagnose this, lol
Came across Moonlight63's comment here; it ended up being very helpful. I figured out that you can block snd_hda_intel
from loading on a single device using a udev rule: just create a file at /etc/udev/rules.d/90-vfio-gpu-audio.rules
and drop this in there. Make sure to replace both occurrences of the ID with your GPU's audio device ID:
SUBSYSTEM=="pci", KERNEL=="0000:0b:00.1", PROGRAM="/bin/sh -c 'echo -n 0000:0b:00.1 > /sys/bus/pci/drivers/snd_hda_intel/unbind'"
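To keep the two occurrences of the ID in sync, you can generate the rule instead of typing it twice (just a sketch):

```shell
# print a udev rule that unbinds snd_hda_intel from one PCI device
audio_unbind_rule() {
    printf 'SUBSYSTEM=="pci", KERNEL=="%s", PROGRAM="/bin/sh -c '\''echo -n %s > /sys/bus/pci/drivers/snd_hda_intel/unbind'\''"\n' "$1" "$1"
}
audio_unbind_rule 0000:0b:00.1
```

After writing the rule file, sudo udevadm control --reload-rules picks it up; the unbind itself only runs the next time the device is processed (e.g. sudo udevadm trigger --subsystem-match=pci, or a reboot).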
In addition, to get this working with Dracut instead of initramfs-tools, you can create a directory at /usr/lib/dracut/modules.d/64bind-vfio
and put the initramfs script in there. Then, you'll need to create another script file in that same directory named module-setup.sh
and paste this in there:
#!/bin/sh

check() {
    return 0
}

depends() {
    return 0
}

install() {
    # Replace the filename here after $moddir/ with the name you gave the other script!
    inst_hook pre-trigger 64 "$moddir/force-vfio-pci.sh"
}
After that, you should be able to add it just like a standard Dracut module.
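Enabling the module is the standard dracut drop-in; the module name is the directory name minus the two-digit prefix, so for /usr/lib/dracut/modules.d/64bind-vfio it would be:

```shell
# /etc/dracut.conf.d/bind-vfio.conf
add_dracutmodules+=" bind-vfio "
```

Then rebuild the initramfs with sudo dracut -f.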
(creating a new issue with the same body as my reply to the old one) Originally posted by @Moonlight63 in https://github.com/bryansteiner/gpu-passthrough-tutorial/issues/16#issuecomment-843735151
Hello, Thank you for the guide. I have been running passthrough for a while, but just did a fresh install of Pop and thought it might be nice to be able to use my second gpu when not running VMs. I am having the same issues as others here.
Running
virsh nodedev-detach $VIRSH_GPU_VIDEO
causes a hang.
My GPUs are in their own IOMMU groups.
I have modified my xorg.conf so that it only uses the 1080 for the host, I have disabled AutoAddGPU, and I have a 3-monitor setup with all 3 plugged into the three DisplayPorts on the 1080. My full config is this:
And finally I have verified that the 1070 is not being used by anything with nvidia-smi:
As a side note, my CPU doesn't list the virtualization option as VT-d in dmesg, but rather calls it by its full name.
Possibly because it's a Xeon? Just thought I would mention it for others who come here.
Anyway, as far as I can tell, I've done everything mentioned and I can't find anything else that would be stopping the unload. Any ideas? Yes, my scripts are executable, and I've been trying to run the commands one by one in a terminal to catch an error exit, but since virsh nodedev-detach never completes and just hangs, no error is reported. Any help is greatly appreciated.