ValveSoftware / steam-runtime

A runtime environment for Steam applications

Games run with Proton 5.13 don't see displays when launched in a network namespace #285

Closed rkfg closed 3 years ago

rkfg commented 4 years ago

My setup is unusual: I run Steam in a network namespace. I need that so Steam skips my VPN and connects directly, which keeps my ping in games unaffected and avoids region issues when I buy games (i.e. I need Steam to see my real IP instead of the VPN's). So far this has worked without issues, but I noticed that Windows games only run with Proton 5.13 when Steam is launched in the default network namespace. I have no issues running them on older Proton versions that don't use a containerized Steam Runtime (stock 5.0-9, Proton-GE etc.). I'll describe my setup in detail.

First, I use the newpid utility (available from the Debian repository) to launch programs in an NN (network namespace) because it doesn't require root privileges. This is the script I use to set up this NN:

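#!/bin/sh
# Create a veth pair. Only the peer is named here; the script relies on the
# other end being auto-named veth0.
ip link add type veth peer name veth1
# Bring up the host end and attach it to the bridge.
ip link set dev veth1 up
brctl addif br0 veth1
# Create the namespace and move veth0 into it.
ip netns add newpidsteam
ip link set veth0 netns newpidsteam
# Configure loopback, address and default route inside the namespace.
ip netns exec newpidsteam ip link set dev lo up
ip netns exec newpidsteam ip link set dev veth0 up
ip netns exec newpidsteam ip addr add 192.168.0.50/24 broadcast 192.168.0.255 dev veth0
ip netns exec newpidsteam ip route add default via 192.168.0.1

# make dnsmasq work: redirect DNS queries sent to 127.0.0.1 to the host's
# address in the default namespace
ip netns exec newpidsteam iptables -t nat -F
ip netns exec newpidsteam iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j MASQUERADE
ip netns exec newpidsteam iptables -t nat -A OUTPUT -d 127.0.0.1/32 -p udp -m udp --dport 53 -j DNAT --to-destination 192.168.0.10
ip netns exec newpidsteam sysctl net.ipv4.conf.veth0.route_localnet=1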

My main interface enp7s0 belongs to the bridge br0. I add a virtual ethernet pair (requires the veth kernel module) that connects the namespaces, attach one end of it to the bridge, then create a new namespace called newpidsteam and configure a distinct IP in that namespace (192.168.0.50). Then I make dnsmasq work by forwarding the DNS traffic to my main IP in the default namespace (192.168.0.10). Now I can run Steam in that namespace and all its traffic (including that of games) is routed directly to my router instead of the VPN provider's gateway. I use this script to run it:

#!/bin/sh
if [ -z "$INSTEAMNS" ]
then
  echo "Not in netns, relaunching..."
  newpid -N newpidsteam env INSTEAMNS=1 "$0" "$@"
  exit 0
fi
unset XMODIFIERS
export GDK_SCALE=2
export __GL_GSYNC_ALLOWED=1
unset GTK_IM_MODULE
unset QT_IM_MODULE
/opt/SteamLinux/steam.sh "$@"

When I try to launch Natural Selection 2 (appid 4920) I get this log: https://gist.github.com/rkfg/8b67d378ffd486b04641de3dd0f24416. So it starts but can't connect to the display. It all works fine if I run Steam normally in the default namespace. I'd be very glad for any advice and/or fixes to make it work, because otherwise I'd need to turn off the VPN every time (and remember to turn it back on afterwards), which is quite a bother. I realize that my case is probably very uncommon, but I have to balance privacy and security concerns (solved with the VPN) against Steam requirements like region locks/prices and ping in network games (solved with network namespaces).

My system info: https://gist.github.com/rkfg/543f2eac2fb0c099a0e3c7f246d7a4fa

kisak-valve commented 4 years ago

Hello @rkfg, let's evaluate this as a Steam Linux Runtime - Soldier issue, which Proton 5.13 is running on top of, until there's a stronger indication that the issue is in Proton instead of the container runtime.

rkfg commented 4 years ago

For anyone encountering the same issue: I decided to switch the VPN workaround to iptables and packet marking for now. It allows Proton 5.13 to work just fine and I can still make Steam access the internet directly. Here's the solution:

  1. create an additional routing table and a rule:
    ip route add default via 192.168.0.1 table 100
    ip route add 192.168.0.0/24 dev br0 src 192.168.0.10 table 100
    ip rule add fwmark 100 table 100
  2. add your user to a group you want to use for routing (for example, games)
  3. add iptables rules:
    iptables -t nat -I POSTROUTING -m owner --gid-owner games -j MASQUERADE
    iptables -t mangle -A OUTPUT -m owner --gid-owner games -j MARK --set-mark 100

    The source address wasn't correct for me (it was the VPN source), so I need to SNAT it with MASQUERADE. The second line marks all packets from processes run with the games group with mark 100, which is used to choose the routing table (also 100 in my example).

  4. run Steam with the primary group games:
    #!/bin/sh
    # Relaunch under the games group if we are not already running with it.
    if [ "$(id -gn)" != "games" ]
    then
      echo "Not in group games, relaunching..."
      sudo -Eg games "$0" "$@"
      exit 0
    fi
    /path/to/steam.sh "$@"

An easy way to see if it works is to run traceroute as usual and then with sudo -g games traceroute ..., and check that the gateway is correct. This method shouldn't interfere with any containerization techniques Steam Runtime currently uses. It shouldn't affect file access either, because most of the time the main user's group doesn't matter as it's not used by anyone except the user itself.
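
As a lighter alternative to traceroute, the kernel's routing decision can be queried directly. The following is only a sketch: route_gateway is a hypothetical helper name, 1.1.1.1 is an arbitrary destination, and the exact fields printed by ip route get may differ between iproute2 versions.

```shell
#!/bin/sh
# Print the next-hop gateway (the word after "via") from `ip route get`
# output read on stdin. Prints nothing if the route has no gateway.
route_gateway() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "via") { print $(i + 1); exit } }'
}

# Usage on the real system (not run here):
#   ip route get 1.1.1.1 | route_gateway            # normal routing
#   ip route get 1.1.1.1 mark 100 | route_gateway   # routing for packets marked 100
```

If the rule and table from step 1 are in place, the second command should print the router's address (192.168.0.1 in the example above) rather than the VPN gateway.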

smcv commented 3 years ago

I don't think wrapping container/namespace tricks around the "outside" of Steam or pressure-vessel is something that's really feasible to support. You can run Steam (or any other Linux program) in arbitrarily weird ways, but we can't really hope to support all of them, so if they break I'm afraid "you get to keep both pieces".

However, I think you might have found a genuine bug in pressure-vessel, which might be worth investigating.

I think what was going on here might be that unsharing the network namespace makes Steam/pressure-vessel unable to connect to abstract AF_UNIX sockets (the ones that look like @something in netstat -l or ss -l), and more specifically your X11 display, @/tmp/.X11-unix/X0 or similar.

There are usually two ways to connect to the X11 server: one abstract socket like @/tmp/.X11-unix/X0, and one non-abstract socket in the filesystem like /tmp/.X11-unix/X0. Running X11 programs with DISPLAY=:0 will try one and then the other.
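
To check which of the two kinds of socket is visible from a given namespace, something like the following sketch may help. classify_sockets is a hypothetical helper, and the grep pattern assumes the conventional /tmp/.X11-unix/X<N> naming.

```shell
#!/bin/sh
# Classify AF_UNIX socket names read from stdin, one per line.
# A leading "@" is the notation ss and netstat use for abstract sockets;
# a leading "/" means the socket lives in the filesystem.
classify_sockets() {
    while IFS= read -r name; do
        case "$name" in
            @*) printf 'abstract %s\n' "$name" ;;
            /*) printf 'filesystem %s\n' "$name" ;;
            *)  printf 'other %s\n' "$name" ;;
        esac
    done
}

# Usage (not run here): list listening X11 sockets and classify them.
#   ss -xl | grep -o '@\{0,1\}/tmp/\.X11-unix/X[0-9]*' | classify_sockets
```

Running the usage line both in the default namespace and inside the new network namespace should show whether the abstract entry disappears in the latter.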

pressure-vessel is meant to make the filesystem-based socket available inside the container, the same way Flatpak does (in fact, we use the same code for this) - which means it doesn't matter that you can't see the abstract socket, because the filesystem-based socket is good enough. However, we've had some bugs involving that in the past, and the handling of environment variables between Steam, pressure-vessel, Proton and the game is pretty complicated, so it's possible that this has regressed.

Please could you try launching the game with PRESSURE_VESSEL_VERBOSE=1 in the environment? Either run Steam like that, or temporarily put PRESSURE_VESSEL_VERBOSE=1 %command% in the game's Launch Options. This will produce a lot of debug logging to the same place as the game's own output. You can censor the log if you need to, as long as it's obvious what you did (for example replacing your username with XXX is fine).

In particular, I'm interested in seeing what happens to the DISPLAY environment variable through the process.
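
A rough sketch of how the relevant lines could be pulled out of such a log and censored before posting. display_lines is a hypothetical helper, game.log is a placeholder file name, and the username is assumed to contain no characters that are special to sed.

```shell
#!/bin/sh
# Keep only the lines on stdin that mention DISPLAY, and replace every
# occurrence of the given username with XXX.
display_lines() {
    user=$1
    grep 'DISPLAY' | sed "s/$user/XXX/g"
}

# Usage (not run here; game.log is a placeholder):
#   display_lines "$USER" < game.log
```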

What is meant to happen is that whatever your DISPLAY is outside the container, we end up with DISPLAY=:99.0 inside the container; and we take your X11 filesystem socket from outside, and put it at /tmp/.X11-unix/X99 inside, so that when the game tries to connect to :99, it gets the correct socket and everything works.
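
For checking this by hand, the conventional mapping from a local DISPLAY value to its filesystem socket can be sketched as below. This is not pressure-vessel's code; display_to_socket is a hypothetical helper and it only handles local displays of the form :N or :N.S.

```shell
#!/bin/sh
# Map a local X11 display string to the conventional filesystem socket path.
display_to_socket() {
    display=${1##*:}        # drop everything up to the last colon, e.g. "99.0"
    display=${display%%.*}  # drop the optional ".screen" suffix, e.g. "99"
    printf '/tmp/.X11-unix/X%s\n' "$display"
}

# Usage (not run here): test whether the socket for the current DISPLAY exists.
#   [ -S "$(display_to_socket "$DISPLAY")" ] && echo "socket present"
```

With DISPLAY=:99.0 inside the container, the helper yields /tmp/.X11-unix/X99, which is the path the game would need to find there.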

rkfg commented 3 years ago

Thanks for this insight! It's very interesting and reminded me of another bug directly related to network namespaces and abstract unix sockets that I investigated and reported to NVIDIA.

Surprisingly, I can't reproduce this Pressure Vessel bug now! I started Steam in the namespace, changed the Proton version to 5.13-2 for Natural Selection 2 and it launched just fine. Probably something has been fixed in the runtime, so it's no longer an issue. I will still keep my iptables solution as it's much less invasive (the only side effect is that all files now have the games group as owner, but it doesn't matter). I'll close this issue then.