ValveSoftware / steam-runtime

A runtime environment for Steam applications

X11/Wayland sockets not visible/passed to games run with Sniper/Soldier under flatpak #646

Closed Ristovski closed 8 months ago

Ristovski commented 8 months ago

Your system information

Issue:

Games that utilize either the soldier or sniper runtimes do not get the flatpak X11/wayland sockets propagated to their respective environments. This is the case for Team Fortress 2 with its x64_linux_test branch which uses SteamLinuxRuntime_sniper.

The respective socket symlinks, however, are still present under /var/run/user/1000/ but now point to nonexistent files. This is best illustrated by manually looking at the contents of /var/run/flatpak/:

$ flatpak run --command=bash com.valvesoftware.Steam
[📦 com.valvesoftware.Steam ~]$ ls /var/run/flatpak/ /tmp/.X11-unix/
/tmp/.X11-unix/:
X0

/var/run/flatpak/:
Xauthority  app  at-spi-bus  bus  doc  ld.so.conf.d  p11-kit  per-app-dirs-ref  pulse  wayland-1

This shows that both /tmp/.X11-unix/X0 and /var/run/flatpak/wayland-1 are present inside the regular flatpak environment.

Issuing the same launch command that's being used to launch TF2 from the x64_linux_test branch, with absolute paths (captured with execsnoop):

$ flatpak run --command=bash com.valvesoftware.Steam
[📦 com.valvesoftware.Steam ~]$ data/Steam/ubuntu12_32/steam-launch-wrapper -- '/home/rafael/data/Steam/steamapps/common/SteamLinuxRuntime_sniper'/_v2-entry-point --verb=waitforexitandrun -- ls /tmp/.X11-unix/ /var/run/flatpak/ /var/run/user/1000/
/tmp/.X11-unix/:

/var/run/flatpak/:
app  at-spi-bus  bus  doc  p11-kit  per-app-dirs-ref  pulse

/var/run/user/1000/:
app  discord-ipc-0  discord-ipc-2  discord-ipc-4  discord-ipc-6  discord-ipc-8  doc      p11-kit          pulse
bus  discord-ipc-1  discord-ipc-3  discord-ipc-5  discord-ipc-7  discord-ipc-9  flatpak-info  pressure-vessel  wayland-1

As you can see, neither socket is passed through, and the wayland-1 symlink in /var/run/user/1000 is thus left dangling.
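
A quick way to spot such dangling links without reading the listing by eye is sketched below. The helper name find_dangling_links is hypothetical and not part of any tool mentioned in this issue; it only checks entries directly inside the given directory.

```shell
# List symbolic links directly inside a directory whose targets do not exist.
# find_dangling_links is a hypothetical helper name used for illustration.
find_dangling_links() {
    dir=$1
    for entry in "$dir"/*; do
        # -L: the entry itself is a symlink; ! -e: following it reaches nothing
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "$entry" "$(readlink "$entry")"
        fi
    done
}
```

Inside the affected environment it would be called as find_dangling_links /var/run/user/1000, and it prints nothing when every link resolves.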

This can further be reduced by dropping the steam-launch-wrapper:

$ /home/rafael/data/Steam/steamapps/common/SteamLinuxRuntime_sniper/_v2-entry-point --verb=waitforexitandrun -- ls /tmp/.X11-unix/ /var/run/flatpak/
/tmp/.X11-unix/:

/var/run/flatpak/:
app  at-spi-bus  bus  doc  p11-kit  per-app-dirs-ref  pulse

Checking execsnoop, I can see that the whole thing is run under pressure-vessel-adverb, which finally executes the following command, which when run manually shows the sockets as present:

$ /home/rafael/.local/share/Steam/steamapps/common/SteamLinuxRuntime_sniper/pressure-vessel/bin/steam-runtime-launcher-interface-0 container-runtime ls /tmp/.X11-unix/ /var/run/flatpak/
/tmp/.X11-unix/:
X0

/var/run/flatpak/:
Xauthority  app  at-spi-bus  bus  doc  ld.so.conf.d  p11-kit  per-app-dirs-ref  pulse  wayland-1

Thus, it would appear that the issue is related to pressure-vessel-adverb and how it sets up the nested(?) sandbox environment.

The same thing seems to happen on soldier as well, which can be confirmed manually:

$ /home/rafael/data/Steam/steamapps/common/SteamLinuxRuntime_soldier/_v2-entry-point --verb=waitforexitandrun -- ls /tmp/.X11-unix/ /var/run/flatpak/
/tmp/.X11-unix/:

/var/run/flatpak/:
app  at-spi-bus  bus  doc  p11-kit  per-app-dirs-ref  pulse

Do note that the public branch of TF2 does not use either runtime, and thus works out of the box, as both sockets are available under the regular flatpak environment.

smcv commented 8 months ago

This is the case for Team Fortress 2 with its x64_linux_test branch which uses SteamLinuxRuntime_sniper.

As a side note, it's probably going to be more reliable to test this with a game that uses sniper for all branches: Battle for Wesnoth, Counter-Strike 2, Dota 2, Endless Sky or Retroarch. (Conveniently, all of those are free-to-play, several are open source, and several have very limited system requirements.)

Games that utilize either the soldier or sniper runtimes do not get the flatpak X11/wayland sockets propagated to their respective environments.

They usually do, so I think this is a system-specific problem. If you do this:

$ flatpak run --command=bash com.valvesoftware.Steam
[com.valvesoftware.Steam ~]$ ls -l /tmp/.X11-unix $XDG_RUNTIME_DIR
[com.valvesoftware.Steam ~]$ flatpak-spawn -- ls -l /tmp/.X11-unix $XDG_RUNTIME_DIR

does it show the same failure mode?

When running under Flatpak, the soldier and sniper runtimes end up doing the equivalent of a call to flatpak-spawn with some special parameters. This is implemented by doing D-Bus IPC to the flatpak-portal D-Bus-activated per-user service, which is a systemd --user service on systemd systems, or a child of dbus-daemon --session on non-systemd systems.

(They actually do the equivalent D-Bus call via the steam-runtime-launch-client tool instead of flatpak-spawn, to avoid having a mandatory dependency on a sufficiently new flatpak-spawn).

I suspect that what is happening on your system might be that the flatpak run process is inheriting the DISPLAY, WAYLAND_DISPLAY and XDG_RUNTIME_DIR environment variables from your shell, but the flatpak-portal is inheriting its execution environment from systemd --user or dbus-daemon --session, which might not have those environment variables set?

If that guess is correct, then the solution will be to use dbus-update-activation-environment(1) during graphical session startup to ensure that the DISPLAY and possibly other variables get uploaded into the execution environment used by dbus-daemon --session. Other D-Bus-activatable session services also rely on being able to pick up this information from the environment, notably xdg-desktop-portal, so you will want this to be true anyway.
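
As a rough sketch of how this could be checked, the helper below reads an environment dump on standard input and prints the variables that are absent from it. The name missing_from_env is hypothetical, and the choice of variables in the usage comment is only an example.

```shell
# Print each requested variable name that has no NAME=... line in the
# environment dump read from standard input.
# missing_from_env is a hypothetical helper name used for illustration.
missing_from_env() {
    env_dump=$(cat)
    for name in "$@"; do
        # Anchor at line start so DISPLAY does not match WAYLAND_DISPLAY
        if ! printf '%s\n' "$env_dump" | grep -q "^${name}="; then
            printf '%s\n' "$name"
        fi
    done
}

# Possible use on a systemd system (not executed here):
#   systemctl --user show-environment | missing_from_env DISPLAY WAYLAND_DISPLAY XDG_CURRENT_DESKTOP
# To upload the variables from inside the graphical session:
#   dbus-update-activation-environment --systemd DISPLAY WAYLAND_DISPLAY XDG_CURRENT_DESKTOP
```

The --systemd option asks dbus-update-activation-environment to update the systemd --user environment as well as the dbus-daemon activation environment. Services that are already running, such as flatpak-portal, keep their old environment until they are restarted.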

I see that you're using Sway. Unfortunately, Sway is known not to do various bits of integration and setup automatically, such as this step and setting XDG_CURRENT_DESKTOP. Some distribution packages work around this as a downstream change, for example https://gitlab.archlinux.org/archlinux/packaging/packages/sway/-/commit/2f9c63b0539119acb63d6028c61d41c7faa1cebb in Arch, and some related projects document it as something that has to be done in per-user configuration, for example https://github.com/emersion/xdg-desktop-portal-wlr?tab=readme-ov-file#running.

In the Flatpak 1.15.x development branch, various environment variables are copied from the original flatpak run command instead of being inherited from flatpak-portal, to solve https://github.com/flatpak/flatpak/issues/5278, and that would probably have worked around this issue.

smcv commented 8 months ago

If my guess is not correct, then please capture a detailed log with this command inside the Flatpak shell:

STEAM_LINUX_RUNTIME_LOG=1 \
STEAM_LINUX_RUNTIME_VERBOSE=1 \
/home/rafael/data/Steam/steamapps/common/SteamLinuxRuntime_sniper/_v2-entry-point -- xterm

or by setting the launch options of the game to

STEAM_LINUX_RUNTIME_LOG=1 STEAM_LINUX_RUNTIME_VERBOSE=1 %command%

The log file will appear in /home/rafael/data/Steam/steamapps/common/SteamLinuxRuntime_sniper/var/, with a symbolic link slr-latest.log pointing to it.
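
For convenience, the symbolic link can be resolved to the actual log file with a small helper like the one below. The name latest_slr_log is hypothetical; the var/slr-latest.log location is the one described above, and readlink -f is assumed to be available (it is in GNU coreutils).

```shell
# Print the real path of the newest log, given the runtime's directory.
# latest_slr_log is a hypothetical helper name used for illustration.
latest_slr_log() {
    link="$1/var/slr-latest.log"
    if [ -L "$link" ] && [ -e "$link" ]; then
        # Resolve the symlink to the file it currently points to
        readlink -f "$link"
    else
        echo "no log found at $link" >&2
        return 1
    fi
}
```

It would be called with the runtime directory as its only argument, for example latest_slr_log /home/rafael/data/Steam/steamapps/common/SteamLinuxRuntime_sniper.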

smcv commented 8 months ago

Checking execsnoop, I can see that the whole thing is run under pressure-vessel-adverb

For your information, you've missed several steps in the middle. If you capture a detailed log, you'll see what is actually happening: several commands either replace themselves with another command via execve(), or run another command as a subprocess, or (in the case of steam-runtime-launch-client) ask to run another command via D-Bus IPC.

pressure-vessel-adverb is very late in the exec chain, and is intended to be run inside the new container that has the sniper runtime as its /usr. It does some final setup that needs to happen inside the container, and also acts as a supervisor process (a "subreaper") to make sure that the container isn't destroyed while there are still game processes running, even if some of those game processes are run in the background.

The actual container setup happens earlier than that, and is mostly done by the pressure-vessel-wrap tool. In Flatpak, it happens by asking flatpak-portal to create a new Flatpak "sub-sandbox" with a /usr specified by pv-wrap. When not in Flatpak, pv-wrap builds up a very long command-line for bubblewrap (the same sandboxing tool that Flatpak uses), and then uses execve() to replace itself.
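
To make the two code paths concrete, here is a minimal sketch of how a script might tell which one applies. It relies on the /.flatpak-info file that Flatpak places in its sandboxes; whether pv-wrap itself detects Flatpak this way is an assumption, and the function name and its optional root argument are hypothetical.

```shell
# Report which mechanism would be expected to create the container.
# container_backend is a hypothetical name; the optional argument is an
# alternative root directory, which exists only to make the check testable.
container_backend() {
    root=${1:-}
    if [ -e "$root/.flatpak-info" ]; then
        # Inside a Flatpak sandbox: a sub-sandbox is requested from flatpak-portal
        echo flatpak-portal
    else
        # Not in Flatpak: the container is built directly with bubblewrap
        echo bubblewrap
    fi
}
```

Called with no argument it inspects the real root, so running it from the Flatpak shell and from a host shell should give the two different answers.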

which finally executes [steam-runtime-launcher-interface-0], which when run manually shows the sockets as present

This is because you're running it in your current execution environment, and not in a newly-created container environment, so you've skipped the step that has had a problem.

Ristovski commented 8 months ago

They usually do, so I think this is a system-specific problem. If you do this:

$ flatpak run --command=bash com.valvesoftware.Steam
[com.valvesoftware.Steam ~]$ ls -l /tmp/.X11-unix $XDG_RUNTIME_DIR
[com.valvesoftware.Steam ~]$ flatpak-spawn -- ls -l /tmp/.X11-unix $XDG_RUNTIME_DIR

does it show the same failure mode?

I had to modify your command, as $XDG_RUNTIME_DIR contains a symlink (../../flatpak/wayland-1) which is always present and shown, but is left dangling when the issue occurs. Your guess, however, is indeed correct:

$ flatpak run --command=bash com.valvesoftware.Steam
[📦 com.valvesoftware.Steam ~]$ ls -l /tmp/.X11-unix /var/run/flatpak/
/tmp/.X11-unix:
total 0
srwxr-xr-x 1 rafael rafael 0 Jan 21 13:49 X0

/var/run/flatpak/:
total 0
-rw------- 0 rafael rafael   0 Jan 31 17:08 Xauthority
drwx------ 4 rafael rafael  80 Jan 31 17:08 app
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:08 at-spi-bus
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:08 bus
dr-x------ 2 rafael rafael   0 Jan  1  1970 doc
drwx------ 2 rafael rafael 140 Jan 31 17:08 ld.so.conf.d
drwx------ 2 rafael rafael  60 Jan 31 17:08 p11-kit
-rw------- 1 rafael rafael   0 Jan 21 17:46 per-app-dirs-ref
drwx------ 2 rafael rafael  80 Jan 31 17:08 pulse
srwxr-xr-x 1 rafael rafael   0 Jan 21 13:49 wayland-1

[📦 com.valvesoftware.Steam ~]$ flatpak-spawn -- ls -l /tmp/.X11-unix /var/run/flatpak/
/tmp/.X11-unix:
total 0

/var/run/flatpak/:
total 0
drwx------ 4 rafael rafael  80 Jan 31 17:08 app
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:08 at-spi-bus
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:08 bus
dr-x------ 2 rafael rafael   0 Jan  1  1970 doc
drwx------ 2 rafael rafael 140 Jan 31 17:08 ld.so.conf.d
drwx------ 2 rafael rafael  60 Jan 31 17:08 p11-kit
-rw------- 1 rafael rafael   0 Jan 21 17:46 per-app-dirs-ref
drwx------ 2 rafael rafael  80 Jan 31 17:08 pulse
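
The difference between the two listings can also be computed rather than compared by eye. The sketch below assumes plain name-per-line listings such as those from ls -1; the helper name only_in_first is hypothetical.

```shell
# Print names that appear in the first listing but not in the second.
# Each argument is a newline-separated list of names, e.g. from `ls -1`.
# only_in_first is a hypothetical helper name used for illustration.
only_in_first() {
    # -F: fixed strings, -x: whole-line match, -v: keep lines that do not match
    printf '%s\n' "$1" | grep -vxF -e "$2"
}

# Possible use inside the Flatpak shell (not executed here):
#   outer=$(ls -1 /var/run/flatpak/)
#   inner=$(flatpak-spawn -- ls -1 /var/run/flatpak/)
#   only_in_first "$outer" "$inner"
```

With the listings above, the names reported would be Xauthority and wayland-1, which are exactly the entries the sub-sandbox is missing.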

When running under Flatpak, the soldier and sniper runtimes end up doing the equivalent of a call to flatpak-spawn with some special parameters. This is implemented by doing D-Bus IPC to the flatpak-portal D-Bus-activated per-user service, which is a systemd --user service on systemd systems, or a child of dbus-daemon --session on non-systemd systems.

(They actually do the equivalent D-Bus call via the steam-runtime-launch-client tool instead of flatpak-spawn, to avoid having a mandatory dependency on a sufficiently new flatpak-spawn).

I suspect that what is happening on your system might be that the flatpak run process is inheriting the DISPLAY, WAYLAND_DISPLAY and XDG_RUNTIME_DIR environment variables from your shell, but the flatpak-portal is inheriting its execution environment from systemd --user or dbus-daemon --session, which might not have those environment variables set?

Aah, I see. I was not aware of this mechanism, that explains quite a lot.

Fwiw, I am launching sway with: exec dbus-launch --exit-with-session sway.

After running dbus-update-activation-environment --verbose WAYLAND_DISPLAY XDG_CURRENT_DESKTOP=sway, restarting xdg-desktop-portal and killing flatpak-portal and flatpak-session-helper, the issue seems to be fixed:

$ flatpak run --command=bash com.valvesoftware.Steam
[📦 com.valvesoftware.Steam ~]$ ls -l /tmp/.X11-unix /var/run/flatpak/
/tmp/.X11-unix:
total 0
srwxr-xr-x 1 rafael rafael 0 Jan 21 13:49 X0

/var/run/flatpak/:
total 0
-rw------- 0 rafael rafael   0 Jan 31 17:22 Xauthority
drwx------ 4 rafael rafael  80 Jan 31 17:22 app
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:22 at-spi-bus
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:22 bus
dr-x------ 2 rafael rafael   0 Jan  1  1970 doc
drwx------ 2 rafael rafael 140 Jan 31 17:22 ld.so.conf.d
drwx------ 2 rafael rafael  60 Jan 31 17:22 p11-kit
-rw------- 1 rafael rafael   0 Jan 21 17:46 per-app-dirs-ref
drwx------ 2 rafael rafael  80 Jan 31 17:22 pulse
srwxr-xr-x 1 rafael rafael   0 Jan 21 13:49 wayland-1

[📦 com.valvesoftware.Steam ~]$ flatpak-spawn -- ls -l /tmp/.X11-unix /var/run/flatpak/
/tmp/.X11-unix:
total 0
srwxr-xr-x 1 rafael rafael 0 Jan 21 13:49 X0

/var/run/flatpak/:
total 0
-rw------- 0 rafael rafael   0 Jan 31 17:22 Xauthority
drwx------ 4 rafael rafael  80 Jan 31 17:22 app
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:22 at-spi-bus
srwxr-xr-x 1 rafael rafael   0 Jan 31 17:22 bus
dr-x------ 2 rafael rafael   0 Jan  1  1970 doc
drwx------ 2 rafael rafael 140 Jan 31 17:22 ld.so.conf.d
drwx------ 2 rafael rafael  60 Jan 31 17:22 p11-kit
-rw------- 1 rafael rafael   0 Jan 21 17:46 per-app-dirs-ref
drwx------ 2 rafael rafael  80 Jan 31 17:22 pulse
srwxr-xr-x 1 rafael rafael   0 Jan 21 13:49 wayland-1

This can be further confirmed by launching TF2 from the x64_linux_test branch, which works.

Closing as this is indeed a system-specific problem.

smcv commented 8 months ago

I am launching sway with: exec dbus-launch --exit-with-session sway

This doesn't necessarily do what you think it does. If standard input is a terminal, --exit-with-session will poll it for input until end-of-file, which can result in consuming and ignoring terminal input. Otherwise, --exit-with-session won't do anything particularly useful: in particular it will not exit when sway does.

If you intend to be using the session bus provided by systemd --user at $XDG_RUNTIME_DIR/bus, you can normally just inherit an appropriate DBUS_SESSION_BUS_ADDRESS from libpam_systemd, and not need to run dbus-launch at all.
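
One way to see which bus a session is actually using is to compare DBUS_SESSION_BUS_ADDRESS with the socket in the runtime directory. The following is a minimal sketch; the function name is_runtime_dir_bus is hypothetical, and it only recognizes the unix:path= address form.

```shell
# Succeed if the D-Bus address refers to the "bus" socket in the given
# runtime directory. is_runtime_dir_bus is a hypothetical helper name.
is_runtime_dir_bus() {
    address=$1
    runtime_dir=$2
    case "$address" in
        # Exact match, or the same path followed by further key=value pairs
        "unix:path=$runtime_dir/bus"|"unix:path=$runtime_dir/bus,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use in a login shell (not executed here):
#   is_runtime_dir_bus "$DBUS_SESSION_BUS_ADDRESS" "$XDG_RUNTIME_DIR" \
#       && echo "using the bus at \$XDG_RUNTIME_DIR/bus"
```

If the check fails, the session is talking to a different bus, for example one started by dbus-launch.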

A well-integrated Linux distribution and/or well-integrated desktop environment shouldn't require you to set all of this up yourself, but if you're doing your own system integration, then part of the price you pay for that is that workarounds will hang around indefinitely and potentially conflict with each other.