ValveSoftware / steam-runtime

A runtime environment for Steam applications

Issues running ALVR in Runtime and recommendations for easier building #633

Open Vixea opened 7 months ago

Vixea commented 7 months ago

We're trying to get our project built underneath your runtime so users don't have to use the Steam Play None compatibility tool. Sadly, we're running into multiple build-time and runtime issues that I hope you'll be willing to help solve.

Reproduction steps:

  1. run runtime either using docker or podman
  2. clone ALVR using git (https://github.com/alvr-org/ALVR)
  3. install Rust using rustup
  4. install x264 libraries and dev files
  5. install libunwind - needed as a hack because SteamVR doesn't implement the APIs we need
  6. go into ALVR's repo and run cargo xtask prepare-deps --platform linux --no-nvidia (if on AMD)
  7. run cargo xtask build-streamer --release --no-nvidia (if on AMD)
  8. run build/alvr_streamer_linux/bin/alvr_dashboard
  9. hit "Launch SteamVR" (you can't use the play button in Steam)
  10. SteamVR should now spit out an error, and if you look in the vrcompositor logs you'll see the error quoted under "Runtime Issues" below

Build time issues:

Nice-to-haves: All the Rust compiler and std libraries are way too old - rust is a fast-moving ecosystem so updating to the latest rust version would be nice. It would also help to have the libunwind and x264 dev and library packages installed by default.

Runtime Issues:

Must be solved: Mon Dec 04 2023 01:06:13.481651 [Error] - WaylandHMDLeaseDevice::LeaseDevice_DrmFD: No libdrm.so. I'm not sure whether this is looking for that exact file or whether the name is just a placeholder. This is as far as it could get, so there could be other runtime issues not yet exposed.

Please contact me if you're having any issues reproducing or if you know how to solve the runtime issue. Thanks for your help!

TTimo commented 7 months ago

Hello,

I'm not sure what "Steam Play None" refers to, but I suspect it's a third party thing not supported by Valve. Please provide a link maybe.

When you say "your runtime", I assume you mean the sniper runtime (i.e. Steam Linux Runtime 3.0) as generally documented here: https://github.com/ValveSoftware/steam-runtime

Rebuilding newer versions of packages that are outdated in the runtime and bundling them with your release is generally the right way to approach this.

If you have specific versions of the packages you rely on - a minimum version needed plus a 'good' version (that would be nice), for say libunwind, x264, rust etc. then we can in turn consider them for update and inclusion in a future sniper SDK.

As for the DRM lease error, are you having better luck on Xorg-based systems?

Vixea commented 7 months ago

https://github.com/Scrumplex/Steam-Play-None basically allows you to run SteamVR using system dependencies, avoiding the container, which of course would not allow your application to run if you haven't built it for the container.

Yes SteamVR uses sniper.

No, and this is probably a Wayland-only issue, as we use a different method to work around missing SteamVR APIs that do not work when SteamVR is running under Wayland.

TTimo commented 7 months ago

SteamVR runs in sniper SLR (with the exception of vrcompositor, which still needs scout LDLP), but VR titles are spawned by the Steam client and don't necessarily have to (by default they would be launched under scout LDLP).

Although I guess ALVR is a bit of a special case so I'm not sure how you are trying to inject it with the rest of the system.

Vixea commented 7 months ago

Oh, simple: we rename the real vrcompositor to vrcompositor.real and inject our own little executable, which has a few environment variables set on it so our layer can run. Our layer then uses libunwind to get the pose from the compositor. In the case of Wayland we fake being a DRM-leased display; on Xorg we have this whole virtual display Vulkan layer thingy.

TTimo commented 7 months ago

As I mentioned, vrcompositor needs to run in scout LDLP, while the rest of SteamVR has moved to sniper SLR. If you are somehow forcing vrcompositor to run in a sniper SLR environment, it won't have the required capabilities (CAP_SYS_NICE).

So your custom vrcompositor override with libunwind ideally needs to be built against the scout SDK. The rest of your infrastructure could well be built and running in sniper SLR if you set up adequate inter-process communication.

Vixea commented 7 months ago

Hmm, I don't think we are; we're just building ALVR in sniper. Would that force it to run in sniper?

TTimo commented 7 months ago

If you inject the parts concerned with vrcompositor via modifications to vrcompositor-launcher.sh, then you'll be in scout LDLP. You can see the various relaunches in that script, from 'host level' to 'scout LDLP', then into 'vrenv' (which adds a few paths to VR stuff via LD_* env vars).

For everything else you probably want to work through vrstartup.sh which is executed directly in a sniper SLR container by Steam.

Vixea commented 7 months ago

We do not. Sorry for the confusion: we don't edit anything in the scripts, nor replace the scripts themselves. We replace the vrcompositor binary with our own through symlinks and move the actual binary to vrcompositor.real. Also, I'm not sure where this is going when it comes to the runtime error?

TTimo commented 7 months ago

That's fine too - I assume you still wrap and run vrcompositor.real at some point. If this is the only process that you need in your setup, then you are in scout LDLP and should rely on the scout SDK exclusively.

That means very old libraries, which are not well suited to what you are doing. This is why I've mentioned a multi-process setup where the more complex parts are built and running in a more modern sniper SLR environment.

I wish it could be more simple but unfortunately that's what it takes for binary ABI compatibility across Linux distributions.

Vixea commented 7 months ago

Well, unfortunately even beginning to build under scout is impossible, as we use Rust infrastructure which requires glibc 2.17, while scout provides glibc 2.15.
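To see which side of that glibc 2.17 line a given environment falls on, a small run-time check can help. This is only a sketch: `glibc_at_least` is a hypothetical helper name, and it reports the glibc that the process is actually running against, which is not necessarily the version a binary was built against.

```c
#include <gnu/libc-version.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the glibc in use at run time is at
 * least major.minor, 0 otherwise (including when the version string
 * cannot be parsed). */
int glibc_at_least(int major, int minor)
{
    int have_major = 0;
    int have_minor = 0;

    /* gnu_get_libc_version() returns a string such as "2.31". */
    if (sscanf(gnu_get_libc_version(), "%d.%d", &have_major, &have_minor) != 2)
        return 0;

    return have_major > major ||
           (have_major == major && have_minor >= minor);
}
```

Calling `glibc_at_least(2, 17)` from a tiny test program inside each environment would show whether that environment meets the requirement mentioned above.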

Vixea commented 7 months ago

is it impossible to have it update to soldier?

smcv commented 6 months ago

> WaylandHMDLeaseDevice::LeaseDevice_DrmFD: No libdrm.so

Answering this part because it looks the most straightforward:

I don't know whether this is an issue in ALVR or in SteamVR or something else, but if some component is using dlopen to load libdrm.so.2, it should be loading it by the name libdrm.so.2. The libdrm.so symlink is for compile-time linking only, and is not guaranteed to exist on non-developer systems (and in principle it could point to an incompatible libdrm.so.3, which would crash your application-level code).
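As an illustration of loading by SONAME, here is a minimal sketch in C. `load_by_soname` is a hypothetical helper, not code from ALVR or SteamVR; the only point it makes is that the name passed to dlopen should be the versioned one, such as libdrm.so.2.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: load a shared library by its versioned SONAME,
 * for example "libdrm.so.2". The unversioned "libdrm.so" name is a
 * development symlink and may be absent on end-user systems. */
void *load_by_soname(const char *soname)
{
    void *handle = dlopen(soname, RTLD_NOW | RTLD_LOCAL);

    if (handle == NULL)
        fprintf(stderr, "could not load %s: %s\n", soname, dlerror());

    return handle;
}
```

A caller would use `load_by_soname("libdrm.so.2")` and then look up the functions it needs with dlsym. On older glibc versions this needs linking with -ldl.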

smcv commented 6 months ago

> is it impossible to have it update to soldier?

The short answer is yes, it's impossible. The long answer is something I will need to come back to when I have more time available to describe what is happening in more detail.

Vixea commented 6 months ago

We are not calling dlopen anywhere. The code that does this is compiled separately from everything else, using this command: g++ -shared -fPIC $(pkg-config --cflags libdrm) drm-lease-shim.cpp -o {}/alvr_drm_lease_shim.so (minus one flag that is needed in the runtime to make the compiler use C++17).

Vixea commented 6 months ago

Oh, I forgot to mention that this was compiled in the sniper runtime. It's not like we can compile in scout, because its glibc version is too old, and I'm 99% sure upgrading that even two more minor versions is a no go.

smcv commented 6 months ago

The longer answer:

SteamVR is unusually complicated, and if a third-party project like yours hooks into it by moving/replacing/overwriting parts of SteamVR, then it is unfortunately going to inherit all of that complexity.

There are three separate execution environments involved in SteamVR:

  1. The sniper container where most of SteamVR runs. This is a container environment. Application-level libraries like SDL and PulseAudio come from sniper (Debian 11, with selected backports for components like SDL). Graphics drivers and related libraries, like Mesa, come from the user's host system (can be anything, for example SteamOS on Steam Deck). For dependencies of the graphics drivers, like glibc and libdrm, we look at the sniper runtime and the host system, compare the versions, and take whichever one is newer: we have to do this, otherwise it wouldn't work. The recommended build environment for any component that runs here is the sniper SDK.
  2. The environment where the Steam client runs, which is also the environment where the vrcompositor runs. This is a LD_LIBRARY_PATH environment, consisting of the user's host system, plus compatibility libraries to make it ABI-compatible with scout (Ubuntu 12.04 with selected backports). The recommended build environment for any component that runs here is the scout SDK, but newer build environments can be made to work if you're careful and lucky.
  3. The environment where the actual VR game runs, for example Half-Life: Alyx. This depends on the game. For some games, it's a sniper container based on Debian 11, the same as (1.) above (for Proton 8 or a few recent native Linux games). For other games, it's a soldier container based on Debian 10 (for Proton 7 or, when combined with a scout compatibility layer, native Linux games with "Steam Linux Runtime 1.0 (scout)" selected). Or it might be a legacy LD_LIBRARY_PATH environment, the same as (2.) above. If you are injecting arbitrary code into game processes with LD_PRELOAD or a Vulkan layer, then it needs to be compatible with anything and everything: like (2.), the recommended build environment would be the scout SDK, but newer build environments can be made to work if you're careful and lucky.

What the SteamVR developers wanted to do was to run all of SteamVR in the sniper container (1.), to be able to get a modern(ish) library stack and correspondingly modern compilers. If this had been possible, then (2.) could have been eliminated, leaving only (1.) and (3.) for you to deal with. Unfortunately that's not possible. The problem is that SteamVR also wants to run one component (the vrcompositor) at an elevated priority, by using CAP_SYS_NICE. This requires it to be setcap or setuid, and it also can't work inside the sniper container. This is because the Linux kernel won't allow us to set up the sniper container without first setting the PR_SET_NO_NEW_PRIVS flag, but if that flag is set, setcap and setuid are ignored. Instead, SteamVR inside the sniper container (1.) uses inter-process communication (in our case, D-Bus) to ask the Steam client to launch vrcompositor outside the container, in (2.), where it can run with elevated priority as desired. This is what Timo was referring to in https://github.com/ValveSoftware/steam-runtime/issues/633#issuecomment-1843387555.
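To check from inside a process whether the flag described above is in effect, the kernel exposes it through prctl. The following is a minimal sketch; `no_new_privs_is_set` is a hypothetical helper name. When it returns 1, setuid bits and file capabilities are ignored on any later execve, which is why a setcap vrcompositor cannot gain CAP_SYS_NICE from inside such an environment.

```c
#include <sys/prctl.h>

/* Hypothetical helper: returns 1 if the no_new_privs flag is set for
 * the calling thread, 0 if it is not, or -1 on error (for example on
 * kernels older than Linux 3.5, which do not know this option). */
int no_new_privs_is_set(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

Once set, the flag cannot be cleared and is inherited by child processes, so a wrapper that needs elevated priority has to be started from outside the container rather than dropping the flag later.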

Because there are these three different execution environments, it is not necessarily going to be possible to build your whole project as one build pass in one SDK: it might be necessary to build most of it in the sniper environment (for 1.), but then build small parts of it in a scout or soldier environment so that they can be run in more hostile environments (2. and 3.). This is a technique that also gets used in Steam itself: the pressure-vessel tool that starts the Steam Linux Runtime containers is built in a scout environment, so that it can run almost anywhere, but the application-level libraries for the container are built in the much newer soldier/sniper environment.

If you need your vrcompositor wrapper, which will run in (2.), to be able to do things that require a modern build environment, one way to achieve it would be to do what SteamVR does, but in reverse: make your vrcompositor wrapper be a very thin shim that just sends IPC requests back to a component that is running inside the sniper container (1.), and do the difficult parts inside the sniper container. For example, it could use D-Bus or some other protocol over an AF_UNIX socket. If you can limit this part to C or simple C++ code that is compilable with scout's older toolchains, then the other end of the IPC connection could be Rust code in the sniper container.
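The thin-shim idea could look roughly like the sketch below, written in plain C so that it has a chance of compiling with an older toolchain. Everything specific here is an assumption made for illustration: the function names `shim_connect` and `shim_roundtrip`, the use of SOCK_SEQPACKET, and the one-request, one-reply text protocol are not taken from ALVR or SteamVR.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical: connect to a helper process listening on an AF_UNIX
 * socket at socket_path. Returns a file descriptor, or -1 on failure. */
int shim_connect(const char *socket_path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(socket_path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, socket_path);

    if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical: send one request message and wait for one reply.
 * The reply is NUL-terminated; returns its length, or -1 on error. */
ssize_t shim_roundtrip(int fd, const char *request,
                       char *reply, size_t reply_cap)
{
    ssize_t n;

    if (reply_cap == 0)
        return -1;
    if (send(fd, request, strlen(request), MSG_NOSIGNAL) < 0)
        return -1;

    n = recv(fd, reply, reply_cap - 1, 0);
    if (n < 0)
        return -1;

    reply[n] = '\0';
    return n;
}
```

The wrapper running in (2.) would call `shim_connect` once and then `shim_roundtrip` per request, while the listening side (for example Rust code in the sniper container) does the real work. The socket has to live at a path that both environments can see.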

Above, I said that the recommended build environment is the scout SDK, but there are other options if you're careful and lucky. The reason I say this is that the scout environment is the newest thing that can be guaranteed, but in practice, the vast majority of VR-capable systems are going to be at least as new as soldier (Debian-10-based). SteamVR relies on this: I believe vrcompositor is actually compiled in a soldier environment, and its maintainers make sure to be cautious about adding dependencies, so that in practice it will usually work (as long as the user's glibc, Mesa, libdrm and so on are new enough). This cannot be guaranteed, though: the only SDK that definitely produces binaries compatible with (2.) and (3.) is scout, which as you've noticed is very old (and that's why it produces binaries compatible with as many environments as possible).

Because you seem to be writing the majority of your application logic in Rust, you have another option available to you: Rust is generally "mostly" statically-linked, with only a minority of dependencies dynamically loaded at runtime. This is convenient, because it means that executables built in a newer environment or with a newer toolchain will often run on older systems most of the time, even in situations where equivalent C or C++ code wouldn't: so you might be able to compile binaries in a newer environment than we would normally recommend, and still run them successfully in an older environment. Again, this cannot be guaranteed.

smcv commented 6 months ago

> I'm 99% sure upgrading that even two more minor versions is a no go

Unfortunately, yes. scout is more than 10 years old, and any time we replace part of it, it comes with risks. Upgrading a core library like glibc will definitely break the ability to run Steam on very old LTS operating systems that have an older glibc. It could also break our ability to build Steam or higher-level components of scout, causing the whole Jenga tower to fall down. The safest option is to do as little with it as possible, and encourage game developers to use sniper for new games.

> All the Rust compiler and std libraries are way too old - rust is a fast-moving ecosystem so updating to the latest rust version would be nice

Tracking a fast-moving ecosystem is the opposite of what the Steam Runtime is usually trying to do: its purpose is to make it as likely as possible that games from 2013 will still run today, and also make it as likely as possible that games released today will still work in 2033. That's why everything in it is relatively old: we're trying to give games a fixed platform to run on, and avoid having a moving target that makes games stop working as soon as their developer stops actively maintaining them. New versions of the Rust toolchain generally need new versions of LLVM, which is also large.

If the prebuilt toolchains available from rustup work in a sniper environment, that might well be your best option.

> If you have specific versions of the packages you rely on - a minimum version needed plus a 'good' version (that would be nice), for say libunwind, x264, rust etc. then we can in turn consider them for update and inclusion in a future sniper SDK.

Yes, if you have requests for additions to the SDK, please be as specific as possible.

smcv commented 6 months ago

> run runtime either using docker or podman

There are lots of things that could be referred to as "runtime". Do you mean registry.gitlab.steamos.cloud/steamrt/sniper/sdk? Or if not that, then what?

> run build/alvr_streamer_linux/bin/alvr_dashboard

Do you mean that you are running your VR component inside the Docker/Podman container? Or do you mean that you are building a binary inside the Docker/Podman container, and then running that same binary on your host system, without using any particular container? Or something else?

Docker/Podman/Toolbx is a very suitable environment for compiling components like this, but is not really set up for running end-user graphical software like games or VR, so I would not recommend using "run your VR component inside Docker/Podman" as your approach to runtime portability.

If you are treating your binary as "build once, run almost anywhere" and hoping that Rust's mostly-static-linking will allow you to run it on the host system with no container, then maybe that will work. However, if your binary runs components from SteamVR as subprocesses, then those components are going to be running on your host system with no container, which is not what they are expecting: all parts of SteamVR are going to expect to be run from one of the three environments I described in a previous comment https://github.com/ValveSoftware/steam-runtime/issues/633#issuecomment-1852735892.

The Steam Linux Runtime containers are what Steam uses to provide the environment that I described as (1.) above (the same one we use for Counter-Strike 2, Dota 2, Proton 8 and most of SteamVR). These are not the same as Docker or Podman: they are their own unique Steam-specific thing, with more similarities to Flatpak than anything else. To run a program in this environment, please see: https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/blob/main/docs/slr-for-game-developers.md#running-commands-in-sniper-soldier-etc. You will probably also find the rest of that document useful.

Vixea commented 6 months ago

As to the first question: building in the container (the Steam-specific one or registry.gitlab.steamos.cloud/steamrt/sniper/sdk), running on the host. Second question: we use the oldest Ubuntu build that includes vulkan-headers for us. Because of glibc, that means it can't run everywhere, but it will cover 99% of cases. When we have a SteamVR-specific part of the CI, that would probably run in the container, with another CI for Monado (when we add support for that).

Vixea commented 6 months ago

Well, umm, since the drm-leasing-shim was the part that was giving us problems, I tried compiling it in scout to see if that would fix it... The gcc version is too old: we use the filesystem header for some functions, and everything I tried before doesn't work, probably just because it's too old.

Vixea commented 6 months ago

I also tried soldier, which failed too, I think because the drm version is too old, i.e.

 'struct _drmModeModeInfo' has no member named 'vdisplay'