Open esullivan-nvidia opened 6 months ago
When running the native Linux build of Metro Exodus
Just to check, you mean game ID 412020, right?
in SLR ... 1.) Start the native Linux version of Metro Exodus from Steam, ensuring no compatibility tools are enabled
If this part of the steps-to-reproduce is accurate, then you are specifically not running Metro Exodus in the container runtime, which is what we refer to when we say "Steam Linux Runtime" or "SLR". Instead, when you reproduce this bug, you are running it in the older LD_LIBRARY_PATH runtime...
The only way I have found to avoid this problem is by forcing the game to run in the scout SLR. I did this by entering the game's compatibility settings in Steam, checking the "Force the use of a specific Steam Play compatibility tool", and then selecting "Steam Linux Runtime 1.0 (scout)"
... and running it in SLR actually avoids this bug.
(I am 90% sure that this is because SLR is less dependent on the LD_LIBRARY_PATH than the legacy runtime environment was.)
When running with newer versions of SLR
There is no "newer versions of SLR" available to this particular game. There are only two environments that are valid to run a typical native Linux Steam game like 412020, with the potential to maybe add a third environment eventually:
1. The old LD_LIBRARY_PATH runtime, which is implemented by extending the LD_LIBRARY_PATH to include libraries from Steam Runtime 1.0 'scout'.
2. Steam Linux Runtime 1.0 (scout), which is implemented by entering a Steam Runtime 2.0 'soldier' container, and then extending the LD_LIBRARY_PATH to add a subset of Steam Runtime 1.0 'scout' libraries.

(1.) is the default for nearly all native Linux games on desktop, and is what you use in your steps-to-reproduce.
(2.) is the default for some games on Steam Deck (e.g. Floating Point is a convenient example that is free-to-play), but according to https://steamdb.info/app/412020/info/ the default for Metro Exodus on Deck is to ignore the existence of a native Linux build, and instead run the Windows build via the default/stable branch of Proton. I assume this was done because QA testers encountered bugs when using the native Linux build.
(2.) is also what you used in your workaround.
Not directly applicable to Metro Exodus, but for background information:
A minority of native Linux games, some Valve (e.g. Counter-Strike 2) and some third-party (e.g. Retroarch), require a newer branch of SLR (specifically, "Steam Linux Runtime 3.0 (sniper)"), and run in that environment without any Steam Runtime 1.0 'scout' compatibility libraries. However, this is only done if the game developer has specifically told Valve "I am targeting sniper for this game", which the developers of Metro Exodus have not done. The affected games cannot be played in the old LD_LIBRARY_PATH runtime or in "Steam Linux Runtime 1.0 (scout)": the only way Steam will let you launch games in this category (unless you use special developer options that allow otherwise-invalid things) is in the sniper container.
Games that run under sniper in this way either have an entry in https://steamdb.info/app/891390/info/ forcing the game to use SteamLinuxRuntime_sniper, or more commonly have an entry like the one in https://steamdb.info/app/1118310/config/ with app_mappings mapping the game to SteamLinuxRuntime_sniper, either for specific branches or by default.
The old LD_LIBRARY_PATH runtime, which is implemented by extending the LD_LIBRARY_PATH to include libraries from Steam Runtime 1.0 'scout'
You can tell when you are using this runtime because the game process's /proc/*/environ (read it with e.g. perl -pe 's/\0/\n/g' /proc/$game_pid/environ) will mention STEAM_RUNTIME=/path/to/steam-runtime.
Steam Linux Runtime 1.0 (scout), which is implemented by entering a Steam Runtime 2.0 'soldier' container, and then extending the LD_LIBRARY_PATH to add a subset of Steam Runtime 1.0 'scout' libraries
In this case the game process's /proc/*/environ will generally mention PRESSURE_VESSEL_RUNTIME (this is not an API guarantee and should not be relied on, but is true in practice).
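The two environment checks described above can be sketched as a small shell function. This is only an illustration: the function name classify_runtime is hypothetical, and checking for PRESSURE_VESSEL_RUNTIME before STEAM_RUNTIME is an assumption made in case both variables are present inside the container. As noted above, the PRESSURE_VESSEL_RUNTIME check is true in practice but is not an API guarantee.

```shell
# Hypothetical helper: guess which runtime a game process is using.
# Usage: classify_runtime /proc/$game_pid/environ
classify_runtime() {
    # environ is NUL-separated; convert to one variable per line.
    env_lines=$(tr '\0' '\n' < "$1")
    # Assumption: test for the container first, in case STEAM_RUNTIME
    # is also set inside it.
    if printf '%s\n' "$env_lines" | grep -q '^PRESSURE_VESSEL_RUNTIME'; then
        echo "container runtime (pressure-vessel)"
    elif printf '%s\n' "$env_lines" | grep -q '^STEAM_RUNTIME='; then
        echo "LD_LIBRARY_PATH runtime"
    else
        echo "unknown"
    fi
}
```

Run it against the main game process, for example classify_runtime /proc/$game_pid/environ.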
the scout SLR seems to allow the game to load the build of SDL2 that it distributes as part of the game
If this is the case, then the game's /proc/*/maps will say something like steamapps/common/Metro Exodus/libSDL2.so.
Looking at what I found out in #411, I think this is happening "by mistake" because the scout SLR environment does provide its own SDL2, but that SDL2 is installed with its canonical upstream SONAME, libSDL2-2.0.so.0, and we do not install the development symlink libSDL2.so.
When running with [the LD_LIBRARY_PATH-based scout runtime] the game ends up loading the system copy of SDL2
If this is the case, then the game's /proc/*/maps will say something like /usr/lib64/libSDL2-2.0.so.0.3000.3.
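To see which copy of SDL2 actually got loaded, the maps file can be filtered for libSDL2. A minimal sketch follows; the function name loaded_sdl2 is hypothetical. It strips the five leading fields of each line rather than splitting on whitespace, because the game's install path contains a space ("Metro Exodus").

```shell
# Hypothetical helper: print each distinct libSDL2 file mapped into a process.
# Usage: loaded_sdl2 /proc/$game_pid/maps
loaded_sdl2() {
    # Each maps line is: address perms offset dev inode pathname.
    # Remove the first five fields so that the full pathname survives,
    # then keep only SDL2 libraries and drop duplicate mappings.
    sed -n 's/^[^ ]* [^ ]* [^ ]* [^ ]* [^ ]* *//p' "$1" | grep 'libSDL2' | sort -u
}
```

If the output shows a path under steamapps/common/Metro Exodus/, the bundled copy is in use; a path under the system library directory means the host copy was loaded instead.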
I believe this is affecting you because your host system ships the development symlink ${libdir}/libSDL2.so, which has the same name as the copy of SDL2 that was bundled by the developers of Metro Exodus. In your case, it's because Arch doesn't separate development libraries from runtime libraries.
In distributions that do separate runtime and development libraries, like Debian and Fedora, I expect that this bug would not occur for users who only have the runtime library ${libdir}/libSDL2-2.0.so.0 installed system-wide (Debian: libsdl2-2.0-0, Fedora: SDL2) - but it would occur (the same as for Arch users) if the user has installed SDL2 development files (Debian: libsdl2-dev, Fedora: SDL2-devel). Other related distributions often do the same, e.g. Ubuntu behaves like Debian.
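A quick way to check whether a given system is exposed is to look for the development symlink in the host's library directories. The sketch below is illustrative only: the function name find_sdl2_dev_symlink is hypothetical, and the caller has to supply the directories to check, because ${libdir} varies between distributions.

```shell
# Hypothetical helper: print every libSDL2.so found in the given directories.
# Usage: find_sdl2_dev_symlink DIR...
find_sdl2_dev_symlink() {
    for dir in "$@"; do
        # -e follows symlinks, so a dangling libSDL2.so is not reported.
        if [ -e "$dir/libSDL2.so" ]; then
            echo "$dir/libSDL2.so"
        fi
    done
}
```

For example, find_sdl2_dev_symlink /usr/lib /usr/lib64 /usr/lib/x86_64-linux-gnu covers some common locations (these paths are assumptions, not an exhaustive list). No output suggests that only the runtime library is installed.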
The system requirements listed on Steam for this game say that it needs "Ubuntu 20" (presumably meaning 20.04), so probably the game's developer/porter/publisher QA team only tested it on Ubuntu 20.04 systems that did not have a system-wide copy of libsdl2-dev installed, accidentally working around this issue.
Based on my investigation of #411 back in 2021, setting the game's launch options to
LD_LIBRARY_PATH="$(pwd)${LD_LIBRARY_PATH+":$LD_LIBRARY_PATH"}" %command%
as per https://github.com/ValveSoftware/steam-runtime/issues/411#issuecomment-860605574 would probably work around this on systems that have SDL2 development libraries.
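The ${LD_LIBRARY_PATH+...} part of that launch option appends the existing search path only when the variable is already set, which avoids a trailing colon otherwise. The sketch below reproduces the same expansion in isolation; the function name launch_ld_library_path is hypothetical, and it runs in a subshell so that the caller's environment is not modified.

```shell
# Hypothetical helper: show the value the launch option would produce.
# With no argument, LD_LIBRARY_PATH is treated as unset;
# with one argument, that argument is used as its previous value.
launch_ld_library_path() (
    if [ "$#" -ge 1 ]; then
        LD_LIBRARY_PATH=$1
    else
        unset LD_LIBRARY_PATH
    fi
    # Same expansion as in the launch option: the current directory first,
    # then ":<old value>" only if the variable was set.
    echo "$(pwd)${LD_LIBRARY_PATH+":$LD_LIBRARY_PATH"}"
)
```

This matters for the workaround only if the launch option is evaluated with the game's install directory as the current directory, which is what placing $(pwd) first relies on.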
In https://github.com/ValveSoftware/steam-runtime/issues/411#issuecomment-860642250 I suggested several ways that the Metro Exodus developers could have avoided #411, and those same suggestions would likely avoid #674. My understanding of Valve's policy is that the contents of the game depot are under the game developer's control and will not generally be modified by Valve, so it is up to the game developer (and not Valve) if they want to fix this.
If someone wants to bisect this and try to find out which SDL2 commit triggers the infinite loop (this issue) and/or which SDL2 commit made the game slower (#411), the easiest way to do it would be:
- ... cp --dereference build/.libs/libSDL2-2.0.so.0 ~/tmp/libSDL2-2.0.so.0, or use it directly from the build directory
- ... SDL_DYNAMIC_API=/path/to/libSDL2-2.0.so.0 %command%, setting the path as necessary
- ... /path/to/libSDL2-2.0.so.0 appears in the main game process's /proc/*/maps
- ... /proc/*/maps as well, as long as your version under test does get loaded

See https://github.com/libsdl-org/SDL/blob/SDL2/docs/README-dynapi.md for more details of the SDL2 library feature that we're using here.
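For repeated bisection steps, the copy and launch-option parts can be wrapped in one helper. This is a sketch under assumptions: the function name stage_sdl2_build is hypothetical, it expects an SDL2 source tree that has already been built so that build/.libs/libSDL2-2.0.so.0 exists, and it relies on GNU cp for --dereference.

```shell
# Hypothetical helper: stage a freshly built SDL2 and print the launch option.
# Usage: stage_sdl2_build /path/to/SDL-source-tree /path/to/staging-dir
stage_sdl2_build() {
    src_tree=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    # --dereference copies the real library rather than the SONAME symlink.
    cp --dereference "$src_tree/build/.libs/libSDL2-2.0.so.0" \
        "$dest_dir/libSDL2-2.0.so.0" || return 1
    # Paste this line into the game's launch options in Steam.
    echo "SDL_DYNAMIC_API=$dest_dir/libSDL2-2.0.so.0 %command%"
}
```

After launching the game with the printed option, confirm in /proc/*/maps that the staged library was loaded before recording the commit as good or bad.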
Another possible route to figuring this out would be to contact the Metro Exodus developer, Linux porter or publisher to find out what version of SDL2 they bundled with the game, and what changes (if any) they made to it.
cc @slouken
Tracked as steamrt/tasks#456 internally, but not necessarily actually actionable for the Steam Runtime - this seems like it might be more of a game-specific issue.
Your system information
steamapps/common/SteamLinuxRuntime/VERSIONS.txt?
Name Version Runtime Runtime_Version Comment
depot 0.20240415.0 # Overall version number
LD_LIBRARY_PATH - scout - # see ~/.steam/root/ubuntu12_32/steam-runtime/version.txt
scripts 0.20240415.0 # from steam-runtime-tools
steamapps/common/SteamLinuxRuntime_soldier/VERSIONS.txt?
Name Version Runtime Runtime_Version Comment
depot 0.20240415.84602 # Overall version number
pressure-vessel 0.20240415.0 scout # pressure-vessel-bin.tar.gz
scripts 0.20240415.0 # from steam-runtime-tools
soldier 0.20240415.84602 soldier 0.20240415.84602 # soldier_platform_0.20240415.84602/
steamapps/common/SteamLinuxRuntime_sniper/VERSIONS.txt?
Name Version Runtime Runtime_Version Comment
depot 0.20240415.84603 # Overall version number
pressure-vessel 0.20240415.0 scout # pressure-vessel-bin.tar.gz
scripts 0.20240415.0 # from steam-runtime-tools
sniper 0.20240415.84603 sniper 0.20240415.84603 # sniper_platform_0.20240415.84603/
Please describe your issue in as much detail as possible:
When running the native Linux build of Metro Exodus in SLR, the game will always enter an infinite loop when the in-game resolution is changed. I have been able to reproduce this with the NVIDIA proprietary driver, as well as RADV. I have included steps to reproduce this below.
It appears that the game locks up because it enters an infinite loop where it repeatedly calls vkGetPhysicalDeviceSurfaceCapabilitiesKHR. From looking at the assembly code in gdb, the game is polling for the current surface extent to be updated to something other than the previous display resolution, but this never occurs.
The only way I have found to avoid this problem is by forcing the game to run in the scout SLR. I did this by entering the game's compatibility settings in Steam, checking the "Force the use of a specific Steam Play compatibility tool", and then selecting "Steam Linux Runtime 1.0 (scout)". I suspect this works around the issue because the scout SLR seems to allow the game to load the build of SDL2 that it distributes as part of the game. When running with newer versions of SLR the game ends up loading the system copy of SDL2.
I think there are two plausible explanations for why using the game build of SDL2 resolves this issue.
1.) The SDL2 build shipped with the SLR contains a bug that is causing the X11 surface extent to never be updated.
2.) The game is reliant on undefined behavior from SDL, and something internal to SDL changed resulting in this issue.
Regardless, it would probably be worthwhile to bisect SDL to determine the exact point where this issue started to occur.
Steps for reproducing this issue: