ValveSoftware / Dota-2

Tracker for issues specific to Linux and Mac in the Reborn client. If you have a general issue or non-system-specific feature request please go to dev.dota2.com

last update: dota2 new runtime misses libraries then segfaults #2390

Open sylware opened 1 year ago

sylware commented 1 year ago

Since the last update, dota2 is not starting anymore. The crash really seems to happen in libtier0:

[29163.326708] dota2[3465]: segfault at 0 ip 00007faf5d6d3606 sp 00007ffe46399d00 error 6 in libtier0.so[7faf5d61f000+2e5000] likely on CPU 1 (core 1, socket 0)
[29163.326719] Code: 2d 3c 24 24 00 31 db 0f 1f 44 00 00 49 8b 74 dc 08 48 85 f6 74 0a 4c 89 ef 31 c0 e8 14 17 f5 ff 48 83 c3 01 41 39 1c 24 7f e2 04 25 00 00 00 00 00 00 00 00 0f 0b c7 04 25 00 00 00 00 00 00
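For anyone triaging a report like this, the first kernel line can be decoded mechanically. The sketch below is a minimal Python helper (the name parse_segfault and the returned keys are made up for illustration) that extracts the faulting library, computes the instruction pointer's offset from the start of the reported mapping, and decodes the x86 page-fault error bits. The offset is relative to the mapping the kernel printed, which is not necessarily the same as an offset into the file on disk.

```python
import re

# First kernel log line from the report, without the timestamp prefix.
LINE = (
    "dota2[3465]: segfault at 0 ip 00007faf5d6d3606 sp 00007ffe46399d00 "
    "error 6 in libtier0.so[7faf5d61f000+2e5000]"
)

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Decode an x86 'segfault at ...' kernel log line (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"])
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Distance of the faulting instruction from the start of the mapping.
        "offset_in_library": ip - base,
        # x86 page-fault error code bits: 1 = page present, 2 = write, 4 = user mode.
        "page_was_present": bool(err & 0x1),
        "is_write": bool(err & 0x2),
        "is_user_mode": bool(err & 0x4),
    }


if __name__ == "__main__":
    info = parse_segfault(LINE)
    print(info["library"], hex(info["offset_in_library"]))
```

Read this way, "error 6" at address 0 is a user-mode write to an unmapped page, which is what a write through a null pointer looks like.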

smcv commented 12 months ago

Does Dota 2 even work at all on Linux right now (not through Proton)?

Yes, for example I've recently run it successfully on Arch, Debian, Fedora and Ubuntu. There are lots of other distributions, and there can be driver-, hardware- or system-specific bugs (for example it's working fine for me on Debian, but isn't working for the reporter of #2394); so if it isn't working when run in the supported way (via Steam and SteamLinuxRuntime_sniper) on some other distribution, please report an issue with logs so that it can be analyzed and fixed, as was done with #2392.

#2394 is presumably waiting for someone in Valve to analyze the crash dump that was uploaded.

Dota 2 was using a container runtime already wasn't it?

A qualified yes. If you run it via Steam, it has been run using a container runtime since January 2022. If you're running it not via Steam (which, as I said, is unsupported), then it won't have run via a container runtime (or the older LD_LIBRARY_PATH runtime, or any other compatibility tool) unless you have taken steps to replicate that setup outside Steam.

What changed is that until recently, Dota 2 was compiled in a very old environment (Steam Runtime 1 'scout', based on Ubuntu from 2012) with very old libraries and compilers; so it was intended to be run in the container environment, but if it was run without using the container, in practice it would often still work if you were sufficiently lucky.

Since July 2023, it has been compiled in a much newer environment (Steam Runtime 3 'sniper', based on Debian from 2021) and now requires correspondingly newer libraries and compilers. This makes it much less likely to run successfully on your host system (I don't think you have mentioned what that host system is?) in the absence of a container: it's still a matter of luck whether it will run in that environment or not, but having been compiled against newer libraries makes it less likely for an arbitrary host system to match them.

If you're running it from outside Steam (again, unsupported), it is possible to run it in the same container, but that won't happen automatically.

why was Dota 2's runtime changed specifically?

I am not a Dota 2 developer and cannot speak for their specific reasons, but the most likely reason is to be able to rely on the newer libraries and compilers that the 'sniper' container provides. If developers who are working on Windows can upgrade their compiler and rely on its new features, but the Linux build is forever stuck in an environment from 2012, that's not a sustainable situation to be in. There are basically two ways out of that: either do the Linux builds in a way that allows for newer dependencies (the container runtime), or stop producing Linux binaries altogether and require Linux users to run the Windows binaries via Proton. They chose the first option, which is the one I prefer, and based on your comments about Proton, presumably the one you prefer as well.

I'm sure you're tempted to ask "why can't you just?" about all the obvious maybe-solutions that don't involve a container (such as bundling more dependencies, or backporting compilers), but the generic answer is: because when we tried them, either they didn't work, or they worked for now but were clearly not going to be feasible to keep working reliably for another 5 or 10 years. I do not intend to debate this, and it would be off-topic to do so on an issue tracker.

hezd1 commented 12 months ago

This might help someone figure out a workaround. If I try to launch the game via Lutris with Wine, the game launches perfectly, but it's not connected to Steam, so it's not playable. Does anyone know if it's possible to fix this?

smcv commented 12 months ago

@hezd1: Please report a separate issue with logs. It seems very unlikely that you are using @sylware's operating system, so whatever issue you are seeing, it is a different one.

sylware commented 12 months ago

So, the host glibc is newer than the container glibc.

1 - The container fails to pull the host ELF interpreter /lib64/ld-linux-x86-64.so.2. I made a hard copy instead of a symbolic link and the container managed to pull it in. Dota2 then starts.

2 - It seems you are using LD_LIBRARY_PATH, which you said you are not, that being to please the broken games overwriting LD_LIBRARY_PATH completely in their startup script.

3 - alsa in-game voice works fine in the container. Once alsa-lib system configuration /etc/asound.conf is pulled in the container, alsa[dmix] should work fine like the currently working alsa[dsnoop].

Please keep any conflict of interest (given that "pressure vessel" and pulseaudio2 (pipewire/wireplumber) are from Collabora) from interfering with making alsa[dmix] work (as I said, alsa[dsnoop] is already working; see the volume meter in the dota2 audio panel).

smcv commented 11 months ago

1 - The container fails to pull the host ELF interpreter /lib64/ld-linux-x86-64.so.2 . I made a hard copy instead of a symbolic link and the container managed to pull it in.

It surprises me that this is necessary: the ELF interpreter is normally a symlink on some of the OSs where pressure-vessel is known to work, for example /lib64/ld-linux-x86-64.so.2 -> /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 on Debian.
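A quick way to see how the interpreter path is laid out on a given host is to check whether it is a symlink and where it finally resolves. This is only a diagnostic sketch in Python; the helper name describe_interpreter is made up here and is not part of pressure-vessel.

```python
import os


def describe_interpreter(path):
    """Report whether `path` is a symlink and what it resolves to (hypothetical helper)."""
    return {
        "path": path,
        "is_symlink": os.path.islink(path),
        # realpath follows every symlink in the chain and returns the final path.
        "target": os.path.realpath(path),
    }


if __name__ == "__main__":
    print(describe_interpreter("/lib64/ld-linux-x86-64.so.2"))
```

Including this output alongside the requested log would show whether the link points into an unusual installation directory.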

Please attach a log so that this can be diagnosed and fixed.

smcv commented 11 months ago

2 - It seems you are using LD_LIBRARY_PATH, which you said you are not, that being to please the broken games overwriting LD_LIBRARY_PATH completely in their startup script.

Yes, that was a simplification of the situation. You will have noticed a continuing theme, that portability is complicated.

More precisely, we aren't using LD_LIBRARY_PATH in situations where ld.so.cache is sufficient. However, a few libraries have alternative names, like libbz2, which normally (upstream and in e.g. Debian) has SONAME libbz2.so.1.0, but is libbz2.so.1 in some distributions like Fedora. Game binaries could in principle be relying on either of those names, although if the game was correctly built in a Steam Runtime container, libbz2.so.1.0 is more likely.

We cannot use ld.so.cache for the alias, because ld.so.cache only indexes libraries by their canonical SONAME (for example libbz2.so.1.0 on Debian, libbz2.so.1 on Fedora); so we still have to use a LD_LIBRARY_PATH for the alias (for example libbz2.so.1 on Debian, libbz2.so.1.0 on Fedora), and if a game is relying on the non-canonical name and clears the LD_LIBRARY_PATH, then that game will not work.
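The alias problem can be illustrated with a small Python sketch. It is not pressure-vessel's code: the ALIASES table and the function name are assumptions made for illustration, and the cache is modelled as a plain set of SONAMEs. Given the names the cache can resolve, it reports which alternative names would still have to be provided some other way, for example as symlinks in a directory on LD_LIBRARY_PATH.

```python
# Hypothetical alias table: pairs of SONAMEs that refer to the same library.
ALIASES = [("libbz2.so.1.0", "libbz2.so.1")]


def aliases_needing_symlinks(cache_sonames, aliases=ALIASES):
    """Return {missing_name: existing_name} for names the cache cannot resolve."""
    needed = {}
    for first, second in aliases:
        if first in cache_sonames and second not in cache_sonames:
            needed[second] = first
        elif second in cache_sonames and first not in cache_sonames:
            needed[first] = second
    return needed


if __name__ == "__main__":
    debian_like = {"libc.so.6", "libbz2.so.1.0"}
    fedora_like = {"libc.so.6", "libbz2.so.1"}
    print(aliases_needing_symlinks(debian_like))
    print(aliases_needing_symlinks(fedora_like))
```

A game that clears LD_LIBRARY_PATH loses exactly the names in the returned mapping, while everything listed in the cache keeps working.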

smcv commented 11 months ago

3 - alsa in-game voice works fine in the container. Once alsa-lib system configuration /etc/asound.conf is pulled in the container, alsa[dmix] should work fine like the currently working alsa[dsnoop].

This is https://github.com/ValveSoftware/steam-runtime/issues/344. If you configure dmix/dsnoop in ~/.asoundrc then it will probably work.

pressure-vessel intentionally does not share /etc/asound.conf with the container because /etc/asound.conf is "owned by" the host system's libasound, making it relatively likely that an upgrade will make it use modules or syntax that are not available inside the container, leading to a risk of breaking audio for other Steam users. We don't have unlimited resources, so we have to spend them wisely, and there's a limit to how much of our time and risk budget we can afford to spend on non-PulseAudio audio backends: the majority of OS distributions default to PulseAudio (or Pipewire with PulseAudio emulation), so anything that breaks audio with those backends would be a serious problem affecting the majority of Linux Steam users.

Audio code paths in the Steam client itself only support PulseAudio, or Pipewire emulating PulseAudio (Valve's decision, not Collabora's), and it would have been easy to say "PulseAudio or nothing" in the pressure-vessel codebase as well - but we didn't do that, and instead we try to keep less-widely-used audio systems working where feasible, within the limitations of the time and information available to us.

More generally, I am trying to be helpful to users of less-commonly-used environments and stacks by providing detailed information, replying to issue reports, and checking logs. If you work constructively with me and provide information, then many compatibility issues can be solved: for example pressure-vessel has been made to work on OSs as unusual as NixOS, Exherbo and ClearLinux, because users and developers of those OSs helped. Conversely, if you make this process difficult and unpleasant, or if you accuse me of maliciously breaking your system, it will become harder for me to justify spending time on work that would benefit you.

Espionage724 commented 11 months ago

The unsupported method I was using only came about when the Steam GUI client itself had the context menu oddities, and the runtime change just happened to be seemingly unrelated bad timing.

I reinstalled Steam a few days ago on Fedora 38 (RPM Fusion, not flatpak) and Dota 2 runs no problem from Steam client beta and some Linux-specific runtimes installed and set to client-beta branches ahead of time. I'm not forcing a runtime on Dota 2, and I have the option in Steam settings to enable Steam play for all games. Forcing Dota 2 to Steam Linux Runtime also seemingly works fine.

smcv commented 11 months ago

@Espionage724:

I reinstalled Steam a few days ago on Fedora 38 (RPM Fusion, not flatpak) and Dota 2 runs no problem from Steam client beta and some Linux-specific runtimes installed and set to client-beta branches ahead of time.

Great to hear. Since you're using Fedora, the fact that you need the beta runtime is almost certainly #2392.

Forcing Dota 2 to Steam Linux Runtime also seemingly works fine.

That's not really meant to work, and the fact that it's offered as an option is a Steam client bug, but it might accidentally work if you're lucky enough!

smcv commented 11 months ago

I believe the only remaining issues being tracked here that do not have a solution elsewhere (or at least a request to open a separate bug elsewhere) are those that apply to @sylware on a customized or user-specific OS.

To avoid mixing up multiple root causes for similar symptoms, if you are not using @sylware's customized OS, please don't comment here: instead, open a separate issue. If you suspect a Steam Runtime problem, https://github.com/ValveSoftware/steam-runtime is likely to be a better place (and the issue template over there will help you to get the necessary information).

smcv commented 11 months ago

BTW, the container shell scripts use the long options for the "timeout" command. Is it possible to use the short options instead (-s instead of --signal)? I run busybox timeout.

Today's new sniper beta uses the short options for timeout(1).

sylware commented 11 months ago

@smcv

(It seems the only problem remaining is the audio, see down below).

After further investigation: the root cause of all this is the malicious behavior (I don't think this is incompetence now) of glibc devs and gcc libs devs (libstdc++ and libgcc) with the help of broken ELF features and ELF gnu extensions. "Welcome to the gnu digital jail" (hardly any better than the apple or msft ones).

Then decoupling the various glibc runtimes via an expensive and very intrusive kernel mechanism is unfortunately the only mitigation I could think of too: I have to agree about the usage of the linux mount namespace because I cannot see valve forking the glibc and gcc libs to put a near-definitive end to the root causes. I stand corrected and hate more those gnu devs (they really deserve it).

You will have to "run" after the glibc runtimes: for instance, one game needs the sniper runtime to be at least at glibc 2.34 (basically I run "windows 10" and they want "windows 11"). On my side I dropped the ball too; I have a glibc 2.38 build ready just in case.

pressure vessel probing/setup should be all statically linked and independent of the user system glibc runtime (I did not check that). As far as I could check: scripts are shell and not bash, I am happy since I run busybox shell or debian dash (thanks for the short options but don't worry too much about that, it was very easy to fix on my end).

(I may try to test if the mesa radv vulkan driver does work properly without the hardware x11/dri, namely with the x11/dri software implementation you setup if you don't find the GL driver).

The LD_LIBRARY_PATH management you implemented then won't fix those games on distros which require you use LD_LIBRARY_PATH. Alright, this is a partial fix.

As I said earlier, dota2 is starting fine in the container (it was just unable to pull my ELF interpreter binary into the container: my /lib64/ld-linux-x86-64.so.2 is just a symbolic link into a glibc-specific installation directory). I fixed it by just removing the symbolic link and copying the real ELF interpreter. I'll just be careful when I switch glibc implementation (now it cannot be an atomic switch anymore :( ).

For the audio, I don't change my stance. You should only care about alsa dmix and dsnoop IPCs even if many distros provide pulseaudio1 IPCs (and probably some will move to native pulseaudio2 IPCs). I repeat again: with that reasoning, we had all better go to windows or a video game console, so this argument has no validity.

alsa-lib [dmix+dsnoop IPCs] works on 100%. Not pulseaudio[12]. It is actually providing native pulseaudio1 support which is a "courtesy" to users, not the other way around, and this is nice.

You are right: /etc/asound.conf is dangerous. I don't have /etc/asound.conf anyway. My "default" card (to be used by PCMs) is defined in the environment variable ALSA_CARD (which contains the short text id or number of the alsa card). That said, ALSA_CARD should not matter: the last time I checked dota2 in the container, it did not seem to fall back on its alsa backend and then enumerate the alsa cards in order to offer them to me in the audio panel. I have the channel profile; then, in the audio device drop-down, I should see the card to use while opening the alsa-lib "standard" PCMs (default, plug, dmix, dsnoop, etc.) and probably the pulseaudio 1 PCM and the pulseaudio 2 PCM, if they are not implemented as virtual hardware cards in alsa-lib. I know there are 2 enumeration interfaces: the one for the "hinted" PCMs and the one for the actual cards. Without pulseaudio1 but with pulseaudio2, if dota2 does not natively support pulseaudio2, it should fall back to its alsa backend, which will offer to go through the pulseaudio2 alsa-lib plugin (same for the steam client).

sylware commented 11 months ago

@smcv

Improvement:

I started to dig deep into your pressure-vessel again and I found out that my ALSA_CONFIG_DIR/ALSA_CARD environment variables were propagated into the container. That is fine for ALSA_CARD, but I removed ALSA_CONFIG_DIR (which lost me sound in the steam client) and now dota2 in the container lists my audio cards in the audio panel and I can select the right one, but I don't get any sound out. I can see the in-game voice capture PCM from my usb webcam working fine via the volume meter.

So the first thing to do in the container is to filter out ALSA_CONFIG_DIR, since the container deploys the alsa config files in another location.
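The filtering suggested here amounts to building the container's environment from the host's while dropping variables that point at host-only paths. A minimal Python sketch of that idea follows; the function name, the default drop list and the example values are assumptions for illustration, not what pressure-vessel actually does.

```python
def container_environment(host_env, drop=("ALSA_CONFIG_DIR",)):
    """Copy the host environment, omitting variables that only make sense on the host."""
    return {name: value for name, value in host_env.items() if name not in drop}


if __name__ == "__main__":
    # Example values are made up.
    host = {"ALSA_CARD": "0", "ALSA_CONFIG_DIR": "/opt/alsa/share/alsa", "HOME": "/home/user"}
    print(container_environment(host))
```

ALSA_CARD is passed through unchanged, matching the observation that only ALSA_CONFIG_DIR caused trouble.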

Still no sound output though, even with the right card selected.

sylware commented 11 months ago

@smcv

OK, I dove deep, hacked the container a bit, and managed to make the dmix IPC work and get sound. I know what's wrong:

1 - The container must filter out ALSA_CONFIG_DIR since it is using its own libasound with its own path for the data files.

2 - dmix and dsnoop IPC must read /etc/group to read the "audio" group GID to configure their IPC permissions... but the container is not importing my glibc nss libs.

I did replace the 2.31 libnss libs in sniper with my 2.33 ones, and voilà, dmix can get the "audio" group GID and the default pcm (using my ALSA_CARD environment variable) does work fine (I could use $HOME/.asoundrc)
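To make the dependency concrete: resolving the "audio" group to a GID means finding the matching line in the group database and reading its third colon-separated field. The Python sketch below does that lookup on the text of a group file directly; it only illustrates what the "files" NSS backend conceptually provides, and the function name and sample GIDs are made up.

```python
def lookup_gid(group_text, name):
    """Return the numeric GID for `name` from /etc/group-style text, or None."""
    for line in group_text.splitlines():
        # Each entry has the form name:password:GID:comma-separated-members
        fields = line.strip().split(":")
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2])
    return None


if __name__ == "__main__":
    sample = "root:x:0:\naudio:x:29:alice\nvideo:x:44:\n"  # made-up sample data
    print(lookup_gid(sample, "audio"))
```

In a real program the same question is normally asked through getgrnam(3), for example grp.getgrnam("audio").gr_gid in Python, and it is that call which needs a working NSS plugin at run time.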

I don't understand why the container is not picking up my glibc libnss libs and I made a capsule log: https://paste.c-net.org/FlamingoPlanned

There you can see that my LD_LIBRARY_PATH is properly parsed to find libidn2... but not for the following libnss libs??

sylware commented 7 months ago

Ping: "pressure-vessel" is still unable to import my libnss_files, which seems to be the only way to resolve unix group names (for alsa dmix to work, it needs to resolve the audio group to its GID).

See the capsule log provided above.

smcv commented 7 months ago

@sylware, if you know the mechanics of how dmix actually works, in terms of which processes communicate, how they communicate, and which process is responsible for actually opening /dev/snd/*, then that is information I have wanted for a long time - but please don't assume that everything about it is immediately obvious! The information requested on https://github.com/ValveSoftware/steam-runtime/issues/501#issuecomment-1085877721 would be extremely useful.

There you can see that my LD_LIBRARY_PATH is properly parsed to find libidn2... but not for the following libnss libs??

Do I infer correctly from this statement that on your customized system, core NSS plugins provided by glibc are only findable via LD_LIBRARY_PATH, and are not listed in the ld.so.cache maintained by ldconfig?

If that is the case, then that would explain part of the problem you are having. We're looking up NSS plugins using a glob ("soname-match:libnss_files.so.*") to be future-proof against possible future glibc changes, but that glob matching is only implemented for libraries that are listed in the ld.so.cache (which is keyed by SONAMEs), and not for the LD_LIBRARY_PATH (in which it is not immediately obvious whether a filename is the canonical SONAME or a convenience symlink).
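The difference can be sketched in a few lines of Python. This is not the real implementation: the cache is modelled as a set of SONAMEs, the function name is hypothetical, and only the "soname-match:" prefix and the glob itself are taken from the comment above. The point is that the glob is matched against names known from the cache, so a library that is reachable only through LD_LIBRARY_PATH never becomes a candidate.

```python
from fnmatch import fnmatchcase

PREFIX = "soname-match:"


def match_sonames(pattern, cache_sonames):
    """Return the cached SONAMEs matching a 'soname-match:GLOB' pattern."""
    if not pattern.startswith(PREFIX):
        raise ValueError("expected a soname-match: pattern")
    glob = pattern[len(PREFIX):]
    return sorted(name for name in cache_sonames if fnmatchcase(name, glob))


if __name__ == "__main__":
    typical_cache = {"libc.so.6", "libnss_files.so.2", "libnss_dns.so.2"}
    cache_without_nss = {"libc.so.6"}
    print(match_sonames("soname-match:libnss_files.so.*", typical_cache))
    print(match_sonames("soname-match:libnss_files.so.*", cache_without_nss))
```

With the second cache the result is empty, which corresponds to the situation described here: the plugin exists on disk but is invisible to a lookup driven by the cache.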

It's very unusual for core glibc functionality to be relying on the LD_LIBRARY_PATH in this way, and I would strongly recommend setting up ldconfig to write details of at least the core libraries for both x86_64 and i386 into the ld.so.cache. That will be both faster and more reliable than the LD_LIBRARY_PATH (and that's why the cache and ldconfig exist).

pressure-vessel could potentially work around this by also looking for libnss_files.so.2 specifically, but that will come with a minor startup performance penalty for every user of pressure-vessel (not just you, but everyone, because it will need to do additional lookups), and will be less future-proof against possible future glibc changes.

As I think I've said in the past, the more your customized system diverges from "ordinary" Linux distributions, the more weird bugs you are likely to encounter.

smcv commented 7 months ago

There you can see that my LD_LIBRARY_PATH is properly parsed to find libidn2... but not for the following libnss libs??

Do I infer correctly from this statement that on your customized system, core NSS plugins provided by glibc are only findable via LD_LIBRARY_PATH, and are not listed in the ld.so.cache maintained by ldconfig?

Please take any further discussion of this part of the issue to: https://github.com/ValveSoftware/steam-runtime/issues/632