Open jhnphm opened 1 year ago
Hello @jhnphm, please copy your system information from Steam (Steam -> Help -> System Information) and put it in a gist, then include a link to the gist in this issue report.
I've copied it into the updated post above in sysinfo.log but also here: https://gist.github.com/jhnphm/f9e45d04d374cb9613386ac094b5e50a
Thanks, AMDVLK has a history of breaking other Vulkan driver implementations. If you remove / disable AMDVLK and use mesa/RADV instead, are you able to reproduce this scenario?
12:42:52.860029: pressure-vessel-wrap[27962]: I: Vulkan ICD #0 at /usr/share/vulkan/icd.d/amd_icd32.json: /usr/lib32/amdvlk32.so
AMDVLK is still in the mix in your test.
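One way to double-check which Vulkan ICD manifests are still installed (32-bit and 64-bit alike) is to print each manifest's `library_path`. This is only a rough sketch: the function name `list_vulkan_icds` is made up for illustration, and `/usr/share/vulkan/icd.d` is simply the directory seen in the log line above; the Vulkan loader can also pick up manifests from other directories such as `/etc/vulkan/icd.d`.

```sh
# Hypothetical helper: print "manifest: library_path" for every ICD manifest
# in the given directory (default: the directory seen in the log above).
list_vulkan_icds() {
    dir="${1:-/usr/share/vulkan/icd.d}"
    for manifest in "$dir"/*.json; do
        [ -e "$manifest" ] || continue   # directory empty or missing
        # Assumes the "library_path" key and its value sit on a single line.
        lib=$(sed -n 's/.*"library_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$manifest" | head -n 1)
        printf '%s: %s\n' "$manifest" "$lib"
    done
}
```

Running `list_vulkan_icds | grep -i amdvlk` should print nothing once both the 64-bit and 32-bit AMDVLK packages are gone.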
Ah, left the 32-bit amdvlk in the mix. New test:
For reference, this is a working run w/ the NVIDIA GPU unbound, run w/o prime-run: slr-app837470-t20221007T134748.log steam-837470.log
For apples to apples, nonworking run, NVIDIA GPU bound, w/o prime-run: slr-app837470-t20221007T135131.log steam-837470.log
A working NVIDIA GPU bound, w/o prime-run, on Proton 5.0:
steam-837470.log (couldn't find the steam runtime logfiles for some reason)
A working NVIDIA GPU bound, w/ prime-run, on Proton 5.0:
slr-app1420170-t20221007T135842.log
Basically, the combination of Proton 5.13+ AND the NVIDIA GPU being bound to the host (but not necessarily active; it makes no difference whether prime-run is used or not) is what breaks.
Actually, I'm not even able to launch winecfg in the prefix w/ the NVIDIA GPU bound:
john@thor [02:27:47 PM] [~]
-> % export GAMEID=837470
john@thor [02:28:11 PM] [~]
-> % WINEPREFIX=~/.steam/steam/steamapps/compatdata/$GAMEID/pfx/ WINEARCH=win64 .steam/steam/steamapps/common/Proton\ 7.0/dist/bin/wine64 'winecfg.exe'
wineserver: using server-side synchronization.
wine: RLIMIT_NICE is <= 20, unable to use setpriority safely
wine: Unhandled page fault on execute access to 00007F2D614EF3D0 at address 00007F2D614EF3D0 (thread 00cc), starting debugger...
00c4:err:winediag:nodrv_CreateWindow Application tried to create a window, but no driver could be loaded.
00c4:err:winediag:nodrv_CreateWindow The explorer process failed to start.
john@thor [02:28:15 PM] [~]
Installing vulkan-mesa-layers/lib32-vulkan-mesa-layers (https://bbs.archlinux.org/viewtopic.php?id=279672) makes it possible to run winecfg and Untitled Goose Game directly w/ Proton, but it still breaks if prime-run is enabled or if it's run through Steam, w/ the common error signature:
00c4:err:winediag:nodrv_CreateWindow Application tried to create a window, but no driver could be loaded.
00c4:err:winediag:nodrv_CreateWindow The explorer process failed to start.
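To tell quickly which runs hit this signature, the logs can be grepped for it. A minimal sketch, assuming the Wine error output ends up in the log files being checked; the function names `has_nodrv_error` and `report_nodrv_errors` are invented here for illustration.

```sh
# Hypothetical helper: succeed if the log contains the "no driver could be
# loaded" signature quoted above.
has_nodrv_error() {
    grep -q 'nodrv_CreateWindow Application tried to create a window, but no driver could be loaded' "$1"
}

# Print one line per log that shows the failure; logs without the
# signature produce no output.
report_nodrv_errors() {
    for log in "$@"; do
        if has_nodrv_error "$log"; then
            printf '%s: no driver could be loaded\n' "$log"
        fi
    done
}
```

For example, `report_nodrv_errors steam-837470.log` could be run after each test to note whether that run failed the same way.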
Potentially related: https://www.reddit.com/r/linux_gaming/comments/rvzu5p/cant_run_winelutrisproton_apps_on_a_gpu_thats_not/ . It looks like I can get this to work, at least to start winecfg from the command line, if I bind the GPU before starting X, but that means I can no longer unbind it for passing it through to a VM w/o restarting X. Running untitled goose game from steam still doesn't work though.
Most other native applications like vkcube and Proton <= 5.0 work fine on the nVidia dGPU w/o Xorg started after binding to the GPU, so it does seem like a Proton/Wine regression.
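Since the results depend on whether the dGPU is bound at the time of each test, it may help to record the bound kernel driver alongside each log. The sketch below reads the `driver` symlink that sysfs exposes for a PCI device; the function name `pci_bound_driver` and the PCI address in the usage note are placeholders, not taken from this system.

```sh
# Hypothetical helper: print the kernel driver bound to a PCI device, or
# "(none)" if no driver is bound. The second argument exists only so the
# sysfs root can be overridden for testing.
pci_bound_driver() {
    addr="$1"
    sysfs="${2:-/sys}"
    link="$sysfs/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; its last component
        # is the driver name.
        basename "$(readlink "$link")"
    else
        echo "(none)"
    fi
}
```

Something like `pci_bound_driver 0000:01:00.0` (substitute the address reported by `lspci -D`) would be expected to print `nvidia` while the card is bound to the host, and a different driver or `(none)` once it has been unbound.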
I can get prime-run to work w/ the scripts generated by using PROTON_DUMP_DEBUG_COMMANDS, if I switch to wayland, but I still can't get it to run via the steam GUI. Looks like bypassing the steam runtime with the arch steam-native script works too.
This might be a Proton regression, but you said that Proton <= 5.0 is good and 5.13+ is bad, which suggests that one important factor might be whether you're using the SteamLinuxRuntime_soldier container runtime (which is used by Proton 5.13+, and optionally for native Linux games) or not (Proton <= 5.0 and most native Linux games).
However, there were also a lot of non-container-runtime-related changes between Proton 5.0 and 5.13, so it's also possible that this is genuinely a Proton problem and nothing to do with the container runtime.
Multi-GPU is complicated, Proton is complicated, and SteamLinuxRuntime is complicated, so the combination of the three gets very confusing. Please try to narrow down where the problem is, with as few complicated things involved as possible:
[...] Help -> System Information while in that state (this runs some simple diagnostic tools). The Gist you provided was before removing AMDVLK, so its results are not necessarily the same as what you're seeing now. If you alter the system state during testing (binding/unbinding the GPU, etc.), please get a new System Information dump matching each log, so that we can compare them.
[...] %command% -vulkan, which makes it useful for apples-to-apples comparisons between OpenGL and Vulkan.
[...] Compatibility tab, check Force the use of a specific Steam Play compatibility tool, and choose Steam Linux Runtime from the list. This will result in those games running in a SteamLinuxRuntime_soldier container (the same as Proton 5.13+) with some compatibility glue to provide the same libraries as the traditional scout Steam Runtime.
[...] Steam Linux Runtime as the Proton games did, then this is probably a Steam Linux Runtime problem. To confirm, uncheck Force the use of a specific Steam Play compatibility tool for each game and try again.
[...] %command% -vulkan, then this is a Vulkan-specific problem. Recent versions of Proton also use Vulkan when emulating most DirectX versions.
A working NVIDIA GPU bound, w/o prime-run, on Proton 5.0: (couldn't find the steam runtime logfiles for some reason)
The SteamLinuxRuntime_soldier container runtime is not used for Proton 5.0, so it is correct and expected that you will not get a SteamLinuxRuntime_soldier/var/slr-*.log for Proton 5.0 games.
A working NVIDIA GPU bound, w/ prime-run, on Proton 5.0: steam-837470.log slr-app1420170-t20221007T135842.log
These logs don't match: if it was using Proton 5.0, then you wouldn't get a slr-*.log for that run. slr-app1420170-t20221007T135842.log seems to be an unrelated log from running Proton\ 5.13/proton run /home/john/.local/share/Steam/ubuntu12_32/../bin/d3ddriverquery64.exe (see the first line).
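A quick sanity check before attaching a pair of logs is to compare the app IDs embedded in their file names. The sketch below assumes the naming pattern seen in this thread (`steam-<appid>.log` and `slr-app<appid>-t<timestamp>.log`); the function names `log_appid` and `logs_match` are made up for illustration.

```sh
# Hypothetical helper: extract the app ID from a log name such as
# slr-app837470-t20221007T134748.log or steam-837470.log.
log_appid() {
    name=$(basename "$1")
    case "$name" in
        slr-app*-t*.log) name=${name#slr-app}; echo "${name%%-t*}" ;;
        steam-*.log)     name=${name#steam-};  echo "${name%.log}" ;;
        *) return 1 ;;   # unrecognised file name
    esac
}

# Succeed only if both logs refer to the same app ID.
logs_match() {
    a=$(log_appid "$1") && b=$(log_appid "$2") && [ "$a" = "$b" ]
}
```

With the files mentioned above, `logs_match steam-837470.log slr-app1420170-t20221007T135842.log` fails because the app IDs differ. Running `head -n 1` on the slr log additionally shows which command that log was recorded for.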
-> % WINEPREFIX=~/.steam/steam/steamapps/compatdata/$GAMEID/pfx/ WINEARCH=win64 .steam/steam/steamapps/common/Proton\ 7.0/dist/bin/wine64 'winecfg.exe'
This is unsupported: Proton 5.13+ is intended to always be run in the SteamLinuxRuntime_soldier container environment, not on the host system. However, if this is also failing with the same symptoms as in the container runtime, then that suggests that the problem might be with Proton and not the container runtime.
Looks like bypassing the steam runtime with the arch steam-native script works too
This is also unsupported: the steam-for-linux binaries are intended to always be run with the (older, LD_LIBRARY_PATH-based) Steam Runtime, which is what steam-native disables. Scripts in the Steam Runtime are responsible for choosing whether to take each library from your host system or from the runtime (in most cases whichever one is newer must be used).
I'm surprised that steam-native has any effect on the container runtime - it only disables the older, LD_LIBRARY_PATH-based runtime mechanism (used by Steam itself, Proton <= 5.0 and most native Linux games) and shouldn't do anything to the container runtime. If steam-native vs. steam-runtime makes a difference, then there must be some relatively subtle interaction going on.
Are you sure you are running steam-native in exactly the same way that you were running Steam with the normal Steam Runtime enabled, so that the only difference is -native or not?
One thing that might be significant here is that if you run Steam from a desktop environment shortcut, most desktop environments will try to launch it on a discrete or non-default GPU using PRIME or similar (via PrefersNonDefaultGPU=true and X-KDE-RunOnDiscreteGpu=true), but if you run it from a command-line prompt, that will not take effect. So I wonder whether the difference might really be that you are running steam-native from a terminal (therefore on your default GPU), but running Steam in its normal supported mode from a desktop shortcut (therefore on your discrete GPU)?
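To see whether a given desktop shortcut carries these hints, the two keys can be grepped out of its .desktop file. A small sketch; the function name `desktop_gpu_hints` is invented here, and the path in the usage note is an assumption about where the distribution installs Steam's desktop entry.

```sh
# Hypothetical helper: print any discrete-GPU hints found in a desktop entry.
# Exits non-zero (grep's status) when neither key is present.
desktop_gpu_hints() {
    grep -E '^(PrefersNonDefaultGPU|X-KDE-RunOnDiscreteGpu)=' "$1"
}
```

For example, `desktop_gpu_hints /usr/share/applications/steam.desktop` would list whichever of the two keys that entry sets, which helps explain why a launch from the desktop shortcut can behave differently from a launch in a terminal.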
More recent sysinfo w/ amdvlk disabled: https://gist.github.com/jhnphm/535dc9ee4154fee34648c712fc357eab
CS:GO works natively both w/ OpenGL and w/ Vulkan, and also w/ the runtime set to Steam Linux Runtime, so it seems to really be a Proton issue as opposed to a runtime issue.
The steam-native thing seems to be a red herring; I probably messed up some testing w/ the GPU in a bad state, or hit some other weird transient problem. I can get Steam running Proton games w/ the latest Proton normally w/ the dGPU bound under Wayland though.
It might have something to do w/ binding the GPU after Xorg is started to keep Xorg from binding to it and making it un-unbindable for VMs w/o restarting the DE. [EDIT Nope, makes no difference].
Multi-GPU used to work on Xorg when I was using an AMD dGPU w/ an AMD iGPU, but the AMD card (Vega64) had other issues w/ VFIO that necessitated running Xorg instead of Wayland. Since it all works under Wayland now, I guess I can just use that, but if it's useful to chase this down I can provide more information.
Wayland sysinfo: https://gist.github.com/jhnphm/d378f7601301736401c72c684f6c6e3d
I'm using VFIO for the occasional incompatible Windows game. All games seem to not complete startup w/ Proton 5.13+ (tried 7.x, experimental, etc) whenever the NVIDIA card is bound to the host. My main display is being run off of the AMD iGPU and I'm launching w/ prime-run steam. This problem manifests with or without prime-run though. If I unbind the NVIDIA card, Proton runs fine. Versions of Proton < 5.13 also run fine.
Issue seems similar to https://github.com/ValveSoftware/Proton/issues/6180
I'm using Arch Linux, Ryzen 5700G, nVidia 3070
Logs attached:
slr-app837470-t20221007T121808.log steam-837470.log sysinfo.log
Console log: