ValveSoftware / steam-runtime

A runtime environment for Steam applications

Steam Linux Runtime cannot start on customized distro using non-standard ld.so.cache location #666

Closed yurytch closed 4 months ago

yurytch commented 5 months ago

Your system information

Please describe your issue in as much detail as possible:

I expected the steam executable to run and let me download things from Steam. Instead it retries a couple of times and then gives me this: src/steamUI/steamuisharedjscontroller.cpp (546) : Failed creating offscreen shared JS context. Then it shows a zenity dialog with options for restarting Steam. None of them works.

Steps for reproducing this issue:

  1. Install
  2. Run

kisak-valve commented 5 months ago

Hello @yurytch, x86_64-linux-gnu-capsule-capture-libs: code 13: open "/var/cache/ldconfig/ld.so.cache": Permission denied in your steamwebhelper.log makes this an issue with setting up the Steam Linux Runtime container environment for the web component to run inside, so I've transferred this issue report to the steam-runtime issue tracker for a runtime dev to ponder with you.
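For anyone searching their own steamwebhelper.log for the same symptom, a line of this shape can be picked apart mechanically. This is a minimal Python sketch: the regular expression is inferred only from the single log line quoted above, and the function name parse_capture_libs_error is hypothetical, so other messages from the same tool may not match.

```python
import re

# Pattern inferred from the one log line quoted in this thread (an assumption,
# not a documented log format).
_CAPTURE_LIBS_ERROR = re.compile(
    r'(?P<tool>\S*capsule-capture-libs): code (?P<code>\d+): '
    r'open "(?P<path>[^"]+)": (?P<message>.+)$'
)


def parse_capture_libs_error(line):
    """Return the tool, numeric code, path and message from a matching line.

    Returns None when the line does not look like a capsule-capture-libs
    "open" failure.
    """
    match = _CAPTURE_LIBS_ERROR.search(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["code"] = int(fields["code"])
    return fields
```

Running this over each line of the log and printing the "path" field shows which cache location the tool gave up on, which is the detail that mattered in this report.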

yurytch commented 5 months ago

Thank you! I've seen that line in the logs, but I believe I've never had even read access to ld.so.cache from an ordinary user account on Slackware. So I didn't take notice.

TTimo commented 5 months ago

I'm not sure that error is relevant to your problem. Please test with Steam beta as several fixes have been made to the steamwebhelper startup in beta already. You can opt in from the command line like this: echo publicbeta > ~/.steam/steam/package/beta

yurytch commented 5 months ago

@TTimo Thanks, but no, switching to beta didn't help. The line is now 545, and that's that.

src/steamUI/steamuisharedjscontroller.cpp (545) : Failed creating offscreen shared JS context

And why can't the process be interrupted by Ctrl+C after being told to restart? It runs the cycle of Starting steamwebhelper under bootstrap sniper steam runtime... forever.

By the way, my home dir is sitting under symlinked subdir, I've seen some talk here about that. But I have tried with clean new user account with a 'real' home dir too, the result was the same.

smcv commented 5 months ago

Please run /sbin/ldconfig -pv and /sbin/ldconfig -XNv, and attach their output. If ldconfig(8) is located somewhere different on a Slackware system, run it by that path instead.

After that, please try running the Steam beta with STEAM_LINUX_RUNTIME_VERBOSE=1 in the environment, and attach the resulting logs for further investigation.

Steam's output to the terminal is unlikely to be the interesting thing here, it is the logs that will be important (particularly steamwebhelper.log).
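The two ldconfig invocations requested above can be captured to text for attaching to a report. This is a minimal Python sketch, assuming ldconfig is at /sbin/ldconfig as stated above; the helper name capture is hypothetical.

```python
import subprocess

# The two commands requested above. Adjust the path if ldconfig(8) lives
# somewhere else on your system.
LDCONFIG_COMMANDS = [
    ["/sbin/ldconfig", "-pv"],
    ["/sbin/ldconfig", "-XNv"],
]


def capture(argv):
    """Run argv and return its stdout and stderr, interleaved, as text."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error messages next to the output
            text=True,
            check=False,  # a non-zero exit status is still useful evidence
        )
    except OSError as error:
        return f"could not run {argv[0]}: {error.strerror}\n"
    return result.stdout
```

Calling capture(argv) for each entry in LDCONFIG_COMMANDS and writing the results to files gives the attachments asked for. The Steam logs themselves still have to be collected separately after starting Steam with STEAM_LINUX_RUNTIME_VERBOSE=1 set.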

I believe I've never had even read access to ld.so.cache from an ordinary user account on Slackware

The normal arrangement on distributions that don't patch or reconfigure this part of glibc (such as the Debian and Red Hat families) is that ordinary, unprivileged users do have read access to /etc/ld.so.cache. If you don't, then the cache doesn't serve its purpose - the whole point is that ordinary unprivileged programs can use it to speed up shared library loading.

It's OK if you don't have access to /var/cache/ldconfig/aux-cache: for example, on Debian-derived systems, we normally don't. An error message involving /var/cache/ldconfig/ld.so.cache should only be seen if we can't load the traditional/interoperable path /etc/ld.so.cache, and therefore we fall back to trying the less common paths. ClearLinux's /var/cache/ldconfig/ld.so.cache happens to be the last one in our list.

If you already know that Slackware uses unusual or non-standard paths for the cache and configuration used by ld.so(8) and ldconfig(8), it would be helpful if you can describe them, because the container runtime framework needs to know how this stuff works on each distribution in order to operate as intended. Normally it's /etc/ld.so.cache, built by /sbin/ldconfig according to configuration that starts at /etc/ld.so.conf. A few distributions are known to do this differently (ClearLinux, Exherbo, possibly Solus) but as far as I was previously aware, all of the older distribution families like Debian, Gentoo, Red Hat and Slackware use the interoperable/traditional paths.

By the way, my home dir is sitting under symlinked subdir, I've seen some talk here about that. But I have tried with clean new user account with a 'real' home dir too, the result was the same.

To reduce the number of variables, please start by investigating this from a user account whose "official" home directory does not have symlinks in its path. When we have got that scenario working, we can look at whether your symlinked home directory is creating any different problems.

Using symbolic links to offload a directory to a different filesystem often causes trouble for container-related things (because the meaning of the symlink changes on entering the container), so I would recommend preferring to use bind-mounts if possible.
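To check whether a home directory has symlinks anywhere in its path, each leading component can be tested in turn. This is a minimal Python sketch; the function name symlinked_components is hypothetical and is not part of any Steam tooling.

```python
from pathlib import Path


def symlinked_components(path):
    """Return every leading prefix of path that is itself a symbolic link.

    Prefixes are listed from the outermost directory inwards. An empty list
    means the path contains no symlinks.
    """
    target = Path(path)
    prefixes = [target, *target.parents]
    return [str(prefix) for prefix in reversed(prefixes) if prefix.is_symlink()]
```

For example, symlinked_components(Path.home()) returning an empty list suggests the account is suitable for the first round of testing described above.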

smcv commented 5 months ago

A few distributions are known to do this differently (ClearLinux, Exherbo, possibly Solus) but as far as I was previously aware, all of the older distribution families like Debian, Gentoo, Red Hat and Slackware use the interoperable/traditional paths.

All the references I could find from a quick search indicate that Slackware does use the same paths as Debian and Red Hat (glibc's upstream defaults, where /sbin/ldconfig reads /etc/ld.so.conf and writes /etc/ld.so.cache), and we've had issue reports from Slackware users in the past where the container framework was generally successful, so this seems likely to be system-specific rather than a general Slackware problem.

yurytch commented 5 months ago

I had a bit of lookaround and maybe I know what's happening. At least I can explain the ld.so.cache thing.

Of course I had read access to ld.so.cache all the time. It was Steam's accessing /var/cache/ldconfig that threw me off (I stat-ed that and saw root 0700).

I understand Steam's binaries expect ld.so.cache to reside either in /etc or in /var/cache/ldconfig, right? However, my glibc is patched to look for ld.so.cache elsewhere, in /var/db (just a part of my setup, which by doing that reduces writes to root system ssd)

Now, how is Steam expected to find anything in /var/cache/ldconfig from an ordinary user account, if the dir has 0700 set on (and it's Ubuntu thing, according to quick look in search engine)?

So if access to ld.so.cache is really needed for Steam's functioning, is there an environment variable maybe to make Steam look somewhere else for it? Didn't find anything obvious in strings.

smcv commented 5 months ago

my glibc is patched to look for ld.so.cache elsewhere, in /var/db (just a part of my setup, which by doing that reduces writes to root system ssd)

OK, so you said you are using Slackware, but what you are actually using is more like your own Slackware derivative... if you've patched glibc to look for core system files like ld.so.cache in a different location, then that's not really Slackware any more.

For internal technical reasons the container runtime framework needs to know where your ld.so.cache is kept, and this is not something that it can learn from any standard system API, so it is something that we have to "just know". However, because your customized operating system doesn't match any of the normal paths used for this file, the container runtime framework will not work correctly.

We have a document listing ABI assumptions made by the Steam Runtime which system integrators can refer to. It is possible to add support for additional paths (as we did for e.g. Exherbo and Clear Linux), but this scales really badly - a developer (usually me) needs to do extra work every time, and after that work has been done, it will slightly slow down Steam startup for everyone. I'm willing to do this for distributions with multiple users, like Exherbo and Clear Linux, but I hope you can understand why I'm reluctant to do this for the benefit of a customized distribution with literally one user!

Reading your ld.so.cache is only half of the problem: as well as being able to read your ld.so.cache outside the container, we also have to be able to create a corresponding file inside the container (which uses a mixture of your OS libraries and libraries supplied by Steam), so that your ld.so will still be able to read it.

Now, how is Steam expected to find anything in /var/cache/ldconfig from an ordinary user account, if the dir has 0700 set on?

Reading that directory is only necessary if you happen to be using a distribution that puts ld.so.cache in it: ClearLinux is the only example I know of. On ClearLinux, that directory needs to be readable by everyone, but they make that true, so it's OK.

On most other distributions (including Debian, Ubuntu, Red Hat, and presumably unpatched Slackware), it doesn't matter that it's root:root 0700 because we don't need to read it: we iterate through a list of possible locations for ld.so.cache, successfully find it at the interoperable path /etc/ld.so.cache, and stop there.

However, we don't know where your specific distribution keeps its ld.so.cache, so if we don't find it in the interoperable location, our only choice is to try other paths and see whether we can find it somewhere else. /var/cache/ldconfig/ld.so.cache happens to be the last one we try, so it's the one that ends up in the error message.

I agree that the resulting error message is misleading, and I've opened an internal issue, steamrt/tasks#437, to make it clearer (either by retrying /etc/ld.so.cache last, or by showing a list of all the places we tried, or something similar).
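The fallback behaviour described here, together with the "list all the places we tried" idea, can be sketched as follows. This is illustrative Python and not the container runtime's actual implementation. The candidate list contains only the two paths named in this thread, whereas the real list is longer, and the function name find_ld_so_cache is hypothetical.

```python
# Only the two locations named in this thread. The real list has more entries.
CANDIDATES = [
    "/etc/ld.so.cache",                 # interoperable/traditional path
    "/var/cache/ldconfig/ld.so.cache",  # ClearLinux, tried last
]


def find_ld_so_cache(candidates=CANDIDATES):
    """Return the first candidate that can be opened for reading.

    If none can be opened, raise an error naming every path that was tried
    and why it failed, instead of reporting only the last attempt.
    """
    failures = []
    for path in candidates:
        try:
            with open(path, "rb"):
                return path  # stop at the first readable cache
        except OSError as error:
            failures.append(f"{path}: {error.strerror}")
    raise RuntimeError("no readable ld.so.cache; tried " + "; ".join(failures))
```

With a message of this shape, a cache in an unexpected location would show up as "No such file or directory" for /etc/ld.so.cache alongside the "Permission denied" for the last path, which points at the real cause.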

So if access to ld.so.cache is really needed for Steam's functioning, is there an environment variable maybe to make Steam look somewhere else for it? Didn't find anything obvious in strings.

There is no environment variable for this. This is the sort of low-level implementation detail that is normally hard-coded into glibc: when dealing with setuid/setgid binaries, if glibc obeyed environment variables, that would be a security vulnerability.

reduces writes to root system ssd

Minimizing writes to SSDs was a practical concern for very early generations of SSD hardware, but modern SSDs have enough write-endurance that minimizing writes is no longer particularly relevant. ld.so.cache is a "read-mostly" cache which is read very frequently and written rarely, so you would probably find that moving it back onto an SSD that has fast random access would improve program startup performance.

Specifically, the ld.so.cache only needs to be updated (by running ldconfig) when libraries in the system's global search path (/usr/lib and similar directories) are updated. If you're doing that, then you're necessarily writing to the system filesystem anyway, so an additional write to update ld.so.cache is likely to be "in the noise".

If possible I would recommend moving the cache back to the interoperable path /etc/ld.so.cache, and using your chosen distribution's ordinary, unpatched glibc.

If you have done the arithmetic and decided that, despite the complexity cost, you still consider it to be worthwhile to offload ld.so.cache onto another filesystem, what I would suggest is making ld.so default to reading /etc/ld.so.cache as normal, but then making /etc/ld.so.cache a symbolic link to wherever you have chosen to put that file (/var/db/ld.so.cache?), and patching ldconfig so that by default it will write to /var/db.

That way, both ld.so and the container runtime framework will be able to read from /etc/ld.so.cache as usual; and when the container runtime framework does its container setup (which involves making the container's /etc/ld.so.cache a symbolic link to our own cache file, on a tmpfs), ld.so will still work in that situation too.
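The symlink arrangement suggested above can be tried out safely in a scratch directory. This is a minimal Python sketch with hypothetical names. It assumes the writer updates the cache with the common write-to-a-temporary-file-then-rename pattern; whether ldconfig does exactly that is an assumption here, not something stated in this thread.

```python
import os


def demo_symlinked_cache(root):
    """Model /etc/ld.so.cache as a symlink to /var/db/ld.so.cache under root."""
    etc = os.path.join(root, "etc")
    db = os.path.join(root, "var", "db")
    os.makedirs(etc)
    os.makedirs(db)
    link = os.path.join(etc, "ld.so.cache")
    target = os.path.join(db, "ld.so.cache")

    with open(target, "wb") as f:
        f.write(b"generation 1")
    os.symlink(target, link)

    # A reader that only knows the interoperable path follows the link.
    with open(link, "rb") as f:
        first_read = f.read()

    # A writer that targets the real location keeps the link intact.
    with open(target + "~", "wb") as f:
        f.write(b"generation 2")
    os.replace(target + "~", target)
    with open(link, "rb") as f:
        second_read = f.read()
    link_survives = os.path.islink(link)

    # A writer that renames onto the /etc path replaces the link itself with
    # a regular file, so the cache is back on the root filesystem.
    with open(link + "~", "wb") as f:
        f.write(b"generation 3")
    os.replace(link + "~", link)

    return first_read, second_read, link_survives, os.path.islink(link)
```

Called on an empty temporary directory, this returns (b"generation 1", b"generation 2", True, False). The final False is one plausible reason for the advice to patch ldconfig so that it writes to /var/db directly, rather than writing through the /etc path.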

smcv commented 5 months ago

@kisak-valve, I think a better title for this one would be something like:

SLR container cannot start on customized distro not using /etc/ld.so.cache

yurytch commented 5 months ago

Thank you.

Minimizing writes to SSDs was a practical concern for very early generations of SSD hardware

Still, some habits and idiosyncrasies had been acquired in that era. ))

In fact, if just ld.so.cache moved to a non-standard location goes as 'abnormal', what would you guys say, if you saw a complete description of my system, which I still fancy to be Slackware, at its heart, like. ))

Still, apps I use (and I use quite a few) do not seem to mind.

Does Steam client work from inside a virtual machine? I might be better off just installing some small-sized Ubuntu in a VM. I need Steam only for the downloading of things I've bought to work, anyway.

Symlinking from a non-standard location of ld.so.cache to /etc won't work with unpatched glibc, at least in Slackware. And symlinking that way with patched glibc (i.e. my system) doesn't satisfy the Steam client.

So... it's a VM or a separate installation or nothing for me, right?

P.S. I must congratulate you on your very reasonable tone and your discussion conduct. It's always a pleasure to see good work (at communication in that case). Thank you.

smcv commented 5 months ago

Still, apps I use (and I use quite a few) do not seem to mind.

Other apps are probably not constructing a container environment on-demand that is a hybrid of your host system's graphics drivers and a more predictable application-level library stack, which has been a necessary thing for Steam to be portable to many Linux distributions, but does require it to make lower-level assumptions about glibc than is conventional!

Thinking of Steam as an app like any other is not really accurate - it's more like an app framework, operating in the same space as things like Flatpak, Snap or AppImage.

Does Steam client work from inside a virtual machine? I might be better off just installing some small-sized Ubuntu in a VM.

I don't think we would consider that to be a fully-supported situation, but in general yes it does work (I regularly use VMs to make sure that a fresh Steam installation is still runnable in various older Ubuntu releases).

The main issue you will have in a VM is that the purpose of Steam is to download and run games, but most modern games require GPU-accelerated rendering and will not usually have acceptable performance in a VM. Older or less-demanding games are more likely to work acceptably with software rendering (Mesa's llvmpipe for GL/EGL, and lavapipe for Vulkan). You might be able to pass through rendering to the host with technologies like VirGL, or use dedicated GPU passthrough to make a GPU unavailable to the host and dedicate it to the VM, but those are not really mature enough to be supportable or recommended.

Similarly, Steam running in a VM will be unable to access gamepads, joysticks and other game controllers unless you go to significant effort to make it possible.

Using a container technology like Flatpak could be another option. Valve does not officially support the Flathub community's unofficial Flatpak-app version of Steam, but in practice it often does work, and unlike a VM, it does get direct access to the GPU and game controllers. (There is also a Snap app, but that one generally works less well than its Flatpak equivalent and I wouldn't recommend it, especially on a non-Ubuntu host.)

I need Steam only for the downloading of things I've bought

You probably know this already, but downloading a game via Steam and then running it directly (without going via Steam) is not something that can be guaranteed to work. It will likely work for some games but not others. The only scenario that is expected to be tested and supported by all Steam game developers/publishers is launching their game via Steam, and for anything beyond that, it's up to the developer/publisher whether they aim to support it or not.

This is partly because of Steam's compatibility frameworks (the Steam Runtime) which make games work on a wider range of distributions but are not active if you run the game manually; partly because some game developers/publishers have chosen to make Steam a requirement as a licensing check (DRM); and partly because some game developers/publishers have chosen to rely on other Steam features such as cloud storage and gamepad mapping.

yurytch commented 5 months ago

Other apps are probably not constructing a container environment on-demand that is a hybrid of your host system's graphics drivers and a more predictable application-level library stack, which has been a necessary thing for Steam to be portable to many Linux distributions, but does require it to make lower-level assumptions about glibc than is conventional!

Yes, the analogy with AppImages has occurred to me. Is there publicly available documentation describing what Steam does with, and expects of, its host system? Maybe I'm hitting something trivial. In 2021 or so Steam worked for me, on a less unconventionally built Slackware system.

RyuzakiKK commented 5 months ago

The assumptions about the host system are documented in https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/main/docs/distro-assumptions.md

yurytch commented 4 months ago

I finally found some time to look into this, and as it happens, it was quite trivial to solve.

I just recompiled my glibc packages to use the recognised /var/cache/ldconfig location. /var/db was only there for purely aesthetic reasons.

Steam client works now. The game works, too.

I imagine you might want to add that to your documentation. Thank you guys for your help. The issue may be closed now.

smcv commented 4 months ago

I imagine you might want to add that to your documentation.

It's already documented: https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/main/docs/distro-assumptions.md?ref_type=heads#shared-libraries

The issue may be closed now.

@kisak-valve, please close this as "not planned".

yurytch commented 4 months ago

I imagine you might want to add that to your documentation.

It's already documented:

Not quite. That documentation might be interpreted (possibly superficially) as saying 'having ld.so.cache at one of the standard locations is sufficient' (I tried that previously, with no success), whereas what really works is 'having glibc actually use an ld.so.cache residing at one of the standard locations'. Maybe it's obvious to Steam developers but not to us 'plebes'. ))