pop-os / system76-power

Power profile management for Linux
GNU General Public License v3.0

Desktop AMD/Nvidia Dual GPU breaks SteamVR #156

Open robobenklein opened 4 years ago

robobenklein commented 4 years ago

Distribution (run cat /etc/os-release):

Pop!_OS 20.04

Related Application and/or Package Version (run apt policy $PACKAGE_NAME):

SteamVR beta 1.12.4

Issue/Bug Description:

I can no longer run SteamVR and make use of a secondary Nvidia GPU at the same time. (Not even for CUDA or headless workloads.)

When the graphics mode is set to hybrid (so that the desktop still runs on my primary AMD GPU), SteamVR fails to launch. If I set it to nvidia, all desktop rendering goes to the Nvidia GPU, overall performance suffers, and I can't make use of my primary GPU.

When I set the mode to integrated, the primary AMD card is used for everything and SteamVR works. However, I can no longer make use of the Nvidia card because the driver is unloaded.

Steps to reproduce (if you know):

Attempt to launch SteamVR while under different GPU switching modes with two different GPUs.

Expected behavior:

Either hybrid mode should not cause SteamVR to fail, or integrated mode should be changed (or a new mode added) to allow use of the secondary card as a headless compute unit.

Other Notes:

System: AMD Ryzen 2700
Primary GPU: Vega 56
Secondary GPU: GTX980
VR Headset: HTC Vive

Running unmodified Pop Shell, no 3rd party (non-System76) drivers installed.

I realize the intent here is mainly for laptop GPU switching, but having system76-power installed should not have caused a regression in desktop behaviour. Previously I was able to make use of the AMD card as a primary GPU normally and only use the Nvidia GPU for CUDA.

crawfxrd commented 4 years ago

"SteamVR fails to launch"

Do you have logs?

robobenklein commented 4 years ago

Switch the mode to hybrid and reboot, start Steam, then launch SteamVR:

image

The error code is 307. I can't immediately point to any obvious error in the logs. (Log files: https://gist.github.com/robobenklein/f03af509608552f70c828c84fee94070 )

Perhaps I need to debug the vrcompositor binary? My suspicion is that it's getting some kind of information about available display units and incorrectly choosing the "dedicated" graphics.

crawfxrd commented 4 years ago

Can you also provide the contents of /var/log/gpu-manager.log while in hybrid mode?

A workaround should be to switch to integrated mode and comment out the alias nvidia off entry in /etc/modprobe.d/system76-power.conf. This will allow you to load the modules (after enabling power to the NVIDIA GPU).
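As a rough sketch of that edit (not an official procedure), the snippet below comments out any line beginning with alias nvidia off in a modprobe config file. The helper name comment_out_nvidia_alias is made up for illustration, and the exact layout of system76-power.conf is assumed from the description above. The path is passed as an argument so the function can be tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: comment out "alias nvidia off" lines in a modprobe config.
# Assumption: the entry starts a line (optionally indented), as described above.
comment_out_nvidia_alias() {
    conf="$1"
    # Keep a backup next to the original, then prefix matching lines with "# ".
    cp -- "$conf" "$conf.bak" &&
    sed -i 's/^[[:space:]]*alias[[:space:]][[:space:]]*nvidia[[:space:]][[:space:]]*off/# &/' "$conf"
}
```

Called as root with /etc/modprobe.d/system76-power.conf as the argument, this leaves other alias lines untouched and writes a .bak copy alongside the file. It relies on GNU sed's -i option.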

robobenklein commented 4 years ago

GPU manager log: https://gist.github.com/robobenklein/6a28eceddf6cb05c00181850cc5f56d9

crawfxrd commented 4 years ago

That looks correct. The only differences from before the update should be writing /usr/share/X11/xorg.conf.d/11-nvidia-offload.conf and setting the power control to auto.
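To see what the power control is currently set to, something like the following sketch might help. It assumes "power control" refers to the sysfs power/control attribute of the PCI device, which is a guess on my part; the function name show_nvidia_power_control is hypothetical. It matches devices by NVIDIA's PCI vendor ID (0x10de) and takes the sysfs root as an optional argument.

```shell
#!/bin/sh
# Hypothetical helper: print the runtime power-management setting of each
# NVIDIA PCI device. Assumption: "power control" means sysfs power/control.
show_nvidia_power_control() {
    sysfs_root="${1:-/sys}"
    for dev in "$sysfs_root"/bus/pci/devices/*; do
        [ -r "$dev/vendor" ] || continue
        # 0x10de is NVIDIA's PCI vendor ID.
        [ "$(cat "$dev/vendor")" = "0x10de" ] || continue
        if [ -r "$dev/power/control" ]; then
            # Prints e.g. the PCI address followed by "auto" or "on".
            printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/power/control")"
        fi
    done
}
```

Running show_nvidia_power_control with no arguments reads from /sys and prints one line per NVIDIA device.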

In hybrid mode, do

sudo bash -c 'echo off > /etc/prime-discrete'

and reboot. This will remove the offload configuration file and attempt to unload the drivers.

If you don't intend to use the NVIDIA GPU for PRIME render offload, this is the simplest workaround.
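To check whether the workaround took effect after the reboot, a small sketch like this could report the two pieces of state mentioned above. The function name report_offload_state is hypothetical, and the paths default to the ones named in this thread. The expectation that /etc/prime-discrete reads off and the offload file is gone is inferred from the description above, not verified.

```shell
#!/bin/sh
# Hypothetical helper: report the prime-discrete setting and whether the
# offload Xorg config exists. Paths can be overridden to inspect copies.
report_offload_state() {
    prime_file="${1:-/etc/prime-discrete}"
    offload_conf="${2:-/usr/share/X11/xorg.conf.d/11-nvidia-offload.conf}"

    if [ -r "$prime_file" ]; then
        printf 'prime-discrete: %s\n' "$(cat "$prime_file")"
    else
        printf 'prime-discrete: (missing)\n'
    fi

    if [ -e "$offload_conf" ]; then
        printf 'offload config: present\n'
    else
        printf 'offload config: absent\n'
    fi
}
```

Calling report_offload_state with no arguments inspects the real paths and needs no root access, since it only reads.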