ValveSoftware / steam-runtime

A runtime environment for Steam applications

Steam Linux Runtime - soldier 0.20210809.22 regression: libstrangle causes games to crash #443

Closed crizan closed 3 years ago

crizan commented 3 years ago

Your system information

Please describe your issue in as much detail as possible:

After the latest update of the Steam Linux Runtime, games launched with it (both native and with Proton) cannot launch with libstrangle. Reverting to the current_release branch fixes the issue. MangoHud and GameMode still work normally.

log from hollow knight

Steps for reproducing this issue:

Install libstrangle

Proton game

  1. set strangle 60 %command% as launch options, game does not run
  2. remove launch options, game does run

Linux native game

  1. set strangle 60 %command% as launch option, game does run
  2. set steam linux runtime in the compatibility tab, game does not run
  3. remove launch options, game does run

smcv commented 3 years ago

This is probably a regression triggered by the same changes that fixed #435.

What does the strangle script actually do on your particular distribution? Does it add something to LD_PRELOAD and/or LD_LIBRARY_PATH?

Can the same crash be reproduced with a free-to-play game, so that everyone affected by this can be testing the same thing? We often use Floating Point (OpenGL, native Linux) because it's free and has low requirements. Life Is Strange episode 1 and Unturned also make good example games.

Please could you get a new log with the environment variable STEAM_LINUX_RUNTIME_VERBOSE=1 set?

10:49:32.082062: pressure-vessel-adverb[36522]: I: Command killed by signal 11

This is a segmentation fault (crash) in the game itself.
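
For reference, signal 11 is SIGSEGV on Linux. Here is a minimal sketch of how to confirm that from a shell, and of the exit status a shell would report for the same crash when nothing like pressure-vessel-adverb translates it. The variable names are illustrative only.

```shell
#!/bin/sh
# Translate the signal number from the log line into its name.
sig=11
sig_name=$(kill -l "$sig")
echo "signal $sig is SIG${sig_name}"

# A shell reports a child killed by signal N as exit status 128 + N,
# so the same crash seen from a plain shell would be this status.
crash_status=$((128 + sig))
echo "equivalent shell exit status: $crash_status"
```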

smcv commented 3 years ago

From the libstrangle documentation:

Might crash if used together with other libs that hijack dlsym, such as Steam Overlay. It seems to work with Steam Overlay when placed at the end of LD_PRELOAD for some reason.

I'm not really surprised this can go wrong: LD_PRELOAD modules are injecting arbitrary code into the game process, and there's a lot that can go wrong with that, particularly if they override something as fundamental as dlsym().

crizan commented 3 years ago

I can reproduce the same crash with Floating Point. This is the log with STEAM_LINUX_RUNTIME_VERBOSE=1 set: log

If it helps, I'm using strangle straight from the repo. Also, setting STRANGLE_VKONLY=1 resolves the issue, which makes me think that the problem is the libraries that get appended to LD_PRELOAD:

if [ "$STRANGLE_VKONLY" != "1" ]; then
    if [ "$STRANGLE_NODLSYM" = "1" ]; then
        LD_PRELOAD="${LD_PRELOAD}:${STRANGLE_LIB_NAME_NO_DLSYM}"
    else
        LD_PRELOAD="${LD_PRELOAD}:${STRANGLE_LIB_NAME}"
    fi
fi

Both libraries cause the crash, and toggling the Steam Overlay doesn't seem to change anything.

STRANGLE_LIB_NAME="libstrangle.so"
STRANGLE_LIB_NAME_NO_DLSYM="libstrangle_nodlsym.so"
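
To see what value that branch ends up producing, here is a minimal sketch that reimplements just the quoted logic as a function. `build_preload` is a hypothetical helper, not part of the strangle script, and `/path/to/overlay.so` is a placeholder for whatever is already in LD_PRELOAD. The two library names are copied from the script.

```shell
#!/bin/sh
STRANGLE_LIB_NAME="libstrangle.so"
STRANGLE_LIB_NAME_NO_DLSYM="libstrangle_nodlsym.so"

# build_preload OLD_VALUE VKONLY NODLSYM
# Prints the LD_PRELOAD value the quoted branch would produce.
build_preload() {
    old=$1
    vkonly=$2
    nodlsym=$3
    if [ "$vkonly" != "1" ]; then
        if [ "$nodlsym" = "1" ]; then
            old="${old}:${STRANGLE_LIB_NAME_NO_DLSYM}"
        else
            old="${old}:${STRANGLE_LIB_NAME}"
        fi
    fi
    printf '%s\n' "$old"
}

build_preload "" 0 0                     # empty LD_PRELOAD, default library
build_preload "/path/to/overlay.so" 0 1  # existing entry, no-dlsym variant
build_preload "/path/to/overlay.so" 1 0  # Vulkan-only: value left unchanged
```

Two things stand out in the result: the appended entries are bare library names rather than absolute paths, and an empty starting value yields a leading colon.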

smcv commented 3 years ago

Please could you change the title of this issue so it will remain true as the latest update changes? Something more like this: Steam Linux Runtime - soldier 0.20210809.22 regression: libstrangle causes games to crash.

The LD_PRELOAD and library search path setup for libstrangle looks similar to the way the mangohud Debian package works, which is one of the scenarios we fixed for #435.

I think what's going on here is that before depot 0.20210809.22, your strangle 60 %command% launch options were having no practical effect: the pressure-vessel container-launcher didn't know how to deal with LD_PRELOAD modules that were not specified as an absolute path, so it recovered by just ignoring them. Since depot 0.20210809.22, we handle more LD_PRELOAD modules, and in particular we now load libstrangle - but that triggers a crash, because something is not behaving as expected.
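
The distinction described here, absolute path versus bare name, can be sketched as follows. This only illustrates the classification and is not pressure-vessel's actual code. `classify_preload` is a hypothetical helper, and `/usr/lib/libexample.so` is a made-up path.

```shell
#!/bin/sh
# classify_preload VALUE
# Splits an LD_PRELOAD-style value on colons and spaces and labels each entry.
# The body runs in a subshell so the IFS and globbing changes do not leak out.
classify_preload() (
    IFS=': '
    set -f
    for entry in $1; do
        case $entry in
            '') ;;  # skip empty fields, e.g. from a leading colon
            /*) printf 'absolute: %s\n' "$entry" ;;
            *)  printf 'bare name: %s\n' "$entry" ;;
        esac
    done
)

classify_preload "/usr/lib/libexample.so:libstrangle.so"
```

A bare name such as libstrangle.so has to be resolved through the library search path, which is the case that used to be ignored.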

It would be useful if you could try to reproduce this in a more minimal way, by setting the launch options like this:

Things like libstrangle are a bit of a hack, especially for OpenGL games (Vulkan layers are still a mess from a support point of view, but they're less bad than what people had to do in the OpenGL world).

crizan commented 3 years ago

So, you're right: strangle 60 %command% was not having any effect on OpenGL games with the Steam Linux Runtime (it was such a specific case that I just hadn't encountered it).

  1. %command%: always works
  2. LD_PRELOAD="${LD_PRELOAD}:libstrangle.so" %command%: crashes with 0.20210809.22 (log); no crash with the previous version, but also no FPS limit; no crash without the Steam Linux Runtime, and the FPS limit works
  3. LD_PRELOAD="${LD_PRELOAD}:libstrangle_nodlsym.so" %command%: same as 2 (log)
  4. LD_PRELOAD="libstrangle.so" %command%: for some reason this always crashes, even without the Steam Linux Runtime, and it doesn't show the resolution selector like the others do (log). The only game I was able to run with this option was Neverwinter Nights EE, but with the client_beta Steam Linux Runtime even that crashes.
  5. LD_PRELOAD="libstrangle_nodlsym.so" %command%: same as 2 and 3 (log)

smcv commented 3 years ago

I can reproduce the crash on a Debian 11 test system (NVIDIA proprietary graphics driver, if it matters).

At least part of this is a libstrangle bug. libstrangle assumes it can dlopen the libdl library via the name libdl.so, but that name is not guaranteed to exist, and in a SLR container it usually doesn't. I've proposed a fix for this in libstrangle: https://gitlab.com/torkel104/libstrangle/-/merge_requests/22

It might be possible to work around this in a future SLR release, by creating the libfoo.so symlinks even though they shouldn't really be necessary.
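
The failure mode and the symlink workaround can be simulated without touching a real system. The sketch below uses a throwaway directory and an empty placeholder file. `libdl.so.2` is the runtime name that glibc installs, while the unversioned `libdl.so` normally comes only from development packages. `lib_present` is a hypothetical helper that just checks for a file; it does not model the real dynamic loader's search.

```shell
#!/bin/sh
libdir=$(mktemp -d)

# Placeholder standing in for the runtime library.
: > "$libdir/libdl.so.2"

# lib_present DIR NAME: report whether NAME exists in DIR.
lib_present() {
    if [ -e "$1/$2" ]; then
        echo "$2: found"
    else
        echo "$2: missing"
    fi
}

# Only the versioned name exists at this point.
before=$(lib_present "$libdir" libdl.so)

# The workaround: provide the unversioned name as a symlink.
ln -s libdl.so.2 "$libdir/libdl.so"
after=$(lib_present "$libdir" libdl.so)

echo "before symlink: $before"
echo "after symlink:  $after"
rm -rf "$libdir"
```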

With that fixed, I'm still seeing a crash in Vulkan games like Artifact: the libstrangle Vulkan layer seems to be colliding with Valve's Fossilize Vulkan layer. I think that's a separate bug somewhere, either in libstrangle or in Fossilize (or maybe in SLR, it's not completely clear).

crizan commented 3 years ago

Thanks, it seems to work now (should I close the issue?)

smcv commented 3 years ago

I think yes, let's close this: it was really a libstrangle bug.

smcv commented 3 years ago

I'm still seeing a crash in Vulkan games

I'll continue to look into that, but if it's a SLR bug then we should treat it as a separate issue, and ignore it for the purposes of closing this issue.

kisak-valve commented 3 years ago

Closing per the last couple comments.