ValveSoftware / steam-runtime

A runtime environment for Steam applications

Crash when game uses XAudio-2.7+ with Proton-5.13+ on Slackware64 14.2 #357

Closed 414n closed 3 years ago

414n commented 3 years ago

Your system information

Please describe your issue in as much detail as possible:

After selecting to run some games with Proton-5.13 or Proton-5.21-GE-1, these crash at different times after pushing "Play" in Steam, but all of them seem to crash shortly after an xaudio dll is loaded by wine. This is an excerpt of the trace from "The Witcher 3" (one of the affected titles) that can be seen from the Proton log:

25901.439:00c4:00c8:trace:loaddll:build_module Loaded L"C:\\windows\\system32\\xaudio2_7.dll" at 00007F0018080000: builtin
ALSA lib pcm_dmix.c:1108:(snd_pcm_dmix_open) unable to open slave
INFO: OpenAudioDevice failed: ALSA: Couldn't open audio device: Device or resource busy
INFO: Assertion failed: 0 && "Failed to open audio device!"
25901.493:00c4:00c8:trace:seh:dispatch_exception code=c0000005 flags=0 addr=00007F001805D940 ip=1805d940 tid=00c8
25901.493:00c4:00c8:trace:seh:dispatch_exception  info[0]=0000000000000000
25901.493:00c4:00c8:trace:seh:dispatch_exception  info[1]=00000000000000dc
25901.493:00c4:00c8:trace:seh:dispatch_exception  rax=0000000000000000 rbx=0000000002ff0720 rcx=0000000000000000 rdx=000000001805f050
25901.493:00c4:00c8:trace:seh:dispatch_exception  rsi=0000000000000000 rdi=0000000002fdbe90 rbp=0000000000000000 rsp=000000000221e560
25901.493:00c4:00c8:trace:seh:dispatch_exception   r8=000000007d9c7a50  r9=0000000000000001 r10=0000000000000001 r11=000000008262de50
25901.493:00c4:00c8:trace:seh:dispatch_exception  r12=0000000002fdc730 r13=0000000000000000 r14=0000000000000000 r15=0000000002fdc748
25901.493:00c4:00c8:trace:seh:call_vectored_handlers calling handler at 0000000069060770 code=c0000005 flags=0

The strange thing is that the system is running on pulseaudio, so the ALSA error message seems a bit off. After ValveSoftware/steam-runtime#343 got solved I'm also sure that the pulseaudio socket is now available inside the container runtime, but the issue still remains. I can usually work around this by making sure that no xaudio builtin dlls get loaded by wine: this makes the games not crash and audio works too.

The titles that are systematically affected on my system and that I've tested are:

| Game | Proton version | Log (with +xaudio2,+pulse,+alsa) | WINEDLLOVERRIDES that avoid the crash |
| --- | --- | --- | --- |
| Gunfire Reborn | Proton-5.13-5 | steam-1217060.log.gz | xaudio2_9,xaudio2_8,xaudio2_7=d |
| The Witcher 3 | Proton-5.13-5 | steam-292030.log.gz | xaudio2_7=n,b |
| Spellbreak | Proton-5.21-GE-1 | steam-1399780.log.gz | xaudio2_8,xaudio2_7=d |
| Resident Evil 3 | Proton-5.21-GE-1 | steam-952060.log.gz | xaudio2_8,xaudio2_7=d |
| Resident Evil Resistance | Proton-5.21-GE-1 | steam-952070.log.gz | xaudio2_8,xaudio2_7=d |

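The override strings listed for each game are a comma-separated list of DLL names followed by `=` and a load order. As a minimal sketch, a small helper can assemble such a string; the function name `dll_override` is hypothetical (it is not part of Wine or Proton), and the load-order values (`d`, `n,b`) are simply the ones reported in this issue, not validated here.

```shell
#!/bin/sh
# Hypothetical helper: join DLL names with commas and append "=<order>".
# Usage: dll_override <order> <dll>...
dll_override() {
    order=$1
    shift
    list=
    for dll in "$@"; do
        # Add a comma separator before every name except the first
        list="${list:+$list,}$dll"
    done
    printf '%s=%s\n' "$list" "$order"
}

# Example: rebuild the override reported for Gunfire Reborn
dll_override d xaudio2_9 xaudio2_8 xaudio2_7
```

The resulting string would typically be passed per game, for example through the Steam launch options as `WINEDLLOVERRIDES="xaudio2_7=n,b" %command%`.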
414n commented 3 years ago

EDIT: turns out I was wrong. It seems like the pulseaudio detection code inside SDL2 fails to determine that pulseaudio is "alive and well" on the system and decides to switch to ALSA. This can be overcome by forcing the audio driver choice via the SDL_AUDIODRIVER=pulseaudio environment variable override or by making sure that the missing pulseaudio libraries (that live under /usr/lib64/pulseaudio on my system) can be found:

$ ls -l /usr/lib64/pulseaudio
total 1400
-rwxr-xr-x 1 root root 610504 Jun 22  2016 libpulsecommon-9.0.so*
-rwxr-xr-x 1 root root 758112 Jun 22  2016 libpulsecore-9.0.so*
-rwxr-xr-x 1 root root  55976 Jun 22  2016 libpulsedsp.so*
# Test failing and rolling back to ALSA without the libs under /usr/lib64/pulseaudio
$ LD_LIBRARY_PATH=/path/to/Proton\ 5.13/dist/lib64:/run/host/usr/lib64 \
  ./faudio_tests 
ALSA lib pcm_dmix.c:1030:(snd_pcm_dmix_open) unable to open slave
INFO: OpenAudioDevice failed: ALSA: Couldn't open audio device: Device or resource busy
INFO: Assertion failed: 0 && "Failed to open audio device!"
test failed (/tmp/FAudio-20.11/tests/xaudio2.c:369): CreateMasteringVoice failed: 88960004
Segmentation fault
# Test succeeding adding the libs under /usr/lib64/pulseaudio
$ LD_LIBRARY_PATH=/path/to/Proton\ 5.13/dist/lib64:/run/host/usr/lib64:/run/host/usr/lib64/pulseaudio \
  ./faudio_tests
Finished with 1219 successful tests and 0 failed tests

I also tested a game ("Assassin's Creed: Unity") outside Steam on regular wine (which was compiled on my system) and noticed no issues of the same sort, even if the game loads the xaudio2_7.dll. This makes me believe that the issue at hand is caused by the container environment.

ORIGINAL MESSAGE

I think I got a little further with this. I can reproduce the same issue by launching the faudio_tests binary that is built alongside the FAudio library, both on my regular system and inside a pressure-vessel container:

$ LC_ALL=C \
  LD_LIBRARY_PATH=/path/to/Proton\ 5.13/dist/lib64:/run/host/usr/lib64 \
  ./faudio_tests 
ALSA lib pcm_dmix.c:1030:(snd_pcm_dmix_open) unable to open slave
INFO: OpenAudioDevice failed: ALSA: Couldn't open audio device: Device or resource busy
INFO: Assertion failed: 0 && "Failed to open audio device!"
test failed (/tmp/FAudio-20.11/tests/xaudio2.c:369): CreateMasteringVoice failed: 88960004
Segmentation fault

The audio initialization seems to fail because SDL2 chose an ALSA device instead of a PulseAudio one. By forcing SDL2 to use PulseAudio instead of ALSA with the override SDL_AUDIODRIVER=pulseaudio, I can overcome this issue, both in the test binary and in game:

$ SDL_AUDIODRIVER=pulseaudio \
  LC_ALL=C \
  LD_LIBRARY_PATH=/path/to/Proton\ 5.13/dist/lib64:/run/host/usr/lib64:/run/host/usr/lib64/pulseaudio \
  ./faudio_tests 
Finished with 1179 successful tests and 0 failed tests.

I'm a bit puzzled about this, however. Assuming that FAudio does nothing "exotic" when initializing audio devices via SDL2, and that SDL2 audio device detection on my system systematically fails to prefer PulseAudio over ALSA, I don't understand how this test program (taken from here; it can be built with gcc test.c $(pkg-config sdl2 --cflags --libs) -o test) always correctly detects (and prefers) the PulseAudio devices:

#include <stdio.h>
#include <SDL2/SDL.h>

int main(int argc, char** argv) {
    int i;

    /* argc/argv are unused; the signature is kept because SDL may wrap main() */
    (void)argc;
    (void)argv;

    /* Initialize only the SDL audio subsystem, on the default device */
    if (SDL_Init(SDL_INIT_AUDIO) < 0) {
        fprintf(stderr, "SDL_Init failed: %s\n", SDL_GetError());
        return 1;
    }

    /* List the audio drivers compiled into this SDL build */
    int audDvrCnt = SDL_GetNumAudioDrivers();
    printf("SDL_GetNumAudioDrivers(): Found %d Audio Drivers:\n", audDvrCnt);
    for (i=0; i<audDvrCnt; i++)  printf("    Audio Driver %d: %s\n", i, SDL_GetAudioDriver(i));

    /* Find audio output devices (iscapture = 0) */
    int audOutCnt = SDL_GetNumAudioDevices(0);
    printf("SDL_GetNumAudioDevices(0): Found %d Audio Out Devices:\n", audOutCnt);
    for (i=0; i<audOutCnt; i++)  printf("    Audio device %d: %s\n", i, SDL_GetAudioDeviceName(i, 0));

    /* Find audio input devices (iscapture = 1) */
    int audInCnt = SDL_GetNumAudioDevices(1);
    printf("SDL_GetNumAudioDevices(1): Found %d Audio In Devices:\n", audInCnt);
    for (i=0; i<audInCnt; i++)  printf("    Audio device %d: %s\n", i, SDL_GetAudioDeviceName(i, 1));

    /* Shut the audio subsystem down cleanly before exiting */
    SDL_Quit();
    return 0;
}

smcv commented 3 years ago

It seems like the pulseaudio detection code inside SDL2 fails to determine that pulseaudio is "alive and well" on the system and decides to switch to ALSA. This can be overcome by forcing the audio driver choice via the SDL_AUDIODRIVER=pulseaudio environment variable override or by making sure that the missing pulseaudio libraries (that live under /usr/lib64/pulseaudio on my system) can be found

Inside the container, we are meant to be using the libSDL2-2.0.so.0, libpulse.so.0, etc. from Steam Runtime 2 'soldier', found in /usr/lib/x86_64-linux-gnu (which is SteamLinuxRuntime_soldier/var/soldier/files/lib/x86_64-linux-gnu outside the container). We are not meant to be using libraries from your host system for audio (or networking, or compression, or really everything other than OpenGL/Vulkan hardware drivers). This is done to make it more likely that a game that has passed QA testing on one Linux distribution will work on all Linux distributions.

Adding /run/host/usr/lib64 to your LD_LIBRARY_PATH inside the container is not how this is meant to work: that will result in us using all libraries from Slackware in preference to libraries from soldier, even if the Slackware libraries are older. The container setup is careful to select only the libraries that are required for the graphics stack, and are equal to or newer than the version in soldier; it puts symbolic links to those libraries, and only those libraries, in /overrides/lib/* inside the container.
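The "equal to or newer" rule can be illustrated with a plain version comparison. This is only a sketch under stated assumptions: it compares dotted version strings with GNU `sort -V`, which is not how pressure-vessel itself compares libraries, and the function name `host_lib_is_usable` is hypothetical. The PulseAudio version numbers from this thread are used purely as sample inputs, since audio libraries are not taken from the host in the first place.

```shell
#!/bin/sh
# Succeed (exit status 0) if the host version is equal to or newer than the
# runtime version. Assumes GNU sort with -V (version sort) is available.
host_lib_is_usable() {
    host_ver=$1
    runtime_ver=$2
    # The older of the two versions sorts first; the host library qualifies
    # only if the runtime version is the older (or an equal) one.
    oldest=$(printf '%s\n%s\n' "$host_ver" "$runtime_ver" | sort -V | head -n 1)
    [ "$oldest" = "$runtime_ver" ]
}

# Example with sample version strings: host 9.0 versus runtime 12.2
if host_lib_is_usable 9.0 12.2; then
    echo "host library would be preferred"
else
    echo "runtime library would be kept"
fi
```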

Please could you capture a pressure-vessel and steam-runtime-system-info log without applying any special workarounds (and definitely not altering LD_LIBRARY_PATH), so that we have a baseline?

The best thing to test here is probably this:

host$ cd .../SteamLinuxRuntime_soldier
host$ ./run-in-soldier --verbose -- steam-runtime-system-info --verbose >srsi.json 2>debug.log

which will give you a more verbose version of the "system info" report in srsi.json, and a debug log in debug.log. I'd like to see both of those files (you can replace usernames, etc. with something like XXX if you want, as long as it's obvious what has been edited). In particular, this more verbose steam-runtime-system-info log should tell me which version of libpulse.so.0 gets loaded, which will be important information!

After that, if you want to try out workarounds, you can get a shell inside the container with

host$ cd .../SteamLinuxRuntime_soldier
host$ ./run-in-soldier -- xterm

from which you can run things like ./faudio_tests. If you put compiled programs in /tmp or in your home directory, they should end up visible both inside and outside the container.

If you're showing us a shell transcript from running a test, please could you make it clear whether it was run inside or outside the container, and (if inside) whether any special environment variables or other workarounds were used to launch that container? The first shell transcript in your comment seems like a confusing mixture of inside and outside the container, if I'm reading it correctly. Perhaps I'm missing details, but I'm dealing with a lot of similar-looking bug reports, so you'll get things solved sooner if you make it as obvious as possible for me :-)

If I understand correctly, your Slackware host system is using PulseAudio 9 with some relatively minor Slackware patches. Yes?

The soldier runtime is meant to be using the client library from PulseAudio 12.2, with some minor Debian patches. My understanding is that different versions of PulseAudio are meant to be at least broadly client/server compatible, so this should work.

414n commented 3 years ago

Please could you capture a pressure-vessel and steam-runtime-system-info log without applying any special workarounds (and definitely not altering LD_LIBRARY_PATH), so that we have a baseline? The best thing to test here is probably this:

host$ cd .../SteamLinuxRuntime_soldier
host$ ./run-in-soldier --verbose -- steam-runtime-system-info --verbose >srsi.json 2>debug.log

which will give you a more verbose version of the "system info" report in srsi.json, and a debug log in debug.log. I'd like to see both of those files (you can replace usernames, etc. with something like XXX if you want, as long as it's obvious what has been edited). In particular, this more verbose steam-runtime-system-info log should tell me which version of libpulse.so.0 gets loaded, which will be important information!

Done: srsi.json.gz (the debug.log file was empty).

After that, if you want to try out workarounds, you can get a shell inside the container with

host$ cd .../SteamLinuxRuntime_soldier
host$ ./run-in-soldier -- xterm

from which you can run things like ./faudio_tests. If you put compiled programs in /tmp or in your home directory, they should end up visible both inside and outside the container.

If you're showing us a shell transcript from running a test, please could you make it clear whether it was run inside or outside the container, and (if inside) whether any special environment variables or other workarounds were used to launch that container? The first shell transcript in your comment seems like a confusing mixture of inside and outside the container, if I'm reading it correctly. Perhaps I'm missing details, but I'm dealing with a lot of similar-looking bug reports, so you'll get things solved sooner if you make it as obvious as possible for me :-)

My apologies, I admit that my previous post is on the chaotic side. You're right, I was under the wrong assumption that LD_LIBRARY_PATH would provide a "fallback" for libraries missing from ld.so.conf, instead of having precedence over it, hence why I went "commando" with LD_LIBRARY_PATH.
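To illustrate why LD_LIBRARY_PATH acts as an override rather than a fallback: the dynamic linker searches the directories in LD_LIBRARY_PATH before it consults the ld.so cache and the default directories, so the first directory that contains a library wins. The sketch below only mimics that first-match behaviour; the helper name `first_dir_with` is hypothetical, and RPATH/RUNPATH and the ld.so cache are deliberately ignored.

```shell
#!/bin/sh
# Print the first directory of a colon-separated list that contains the file.
# Usage: first_dir_with <file name> <dir1:dir2:...>
first_dir_with() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example: the host directory listed first would shadow the container's copy
first_dir_with libpulse.so.0 "/run/host/usr/lib64:/usr/lib/x86_64-linux-gnu" \
    || echo "libpulse.so.0 not found in either directory"
```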

Let me recap the rationale behind those tests:

  1. I was investigating the source of those "ALSA" error messages inside the Proton logs, that just happen to appear shortly after a game using Proton-5.13+ loads an xaudio >=2.7 DLL library;
  2. given that xaudio is implemented in Proton/wine using the FAudio library, I tried to find a sample/test application for FAudio that could be run inside the container-runtime to check whether it would have the same issue as Proton or not;
  3. inside the FAudio library sources I noticed that a test binary (faudio_tests) can optionally be built while compiling the library itself;
  4. I then compiled and built that test suite on my host and tried to launch it inside the pressure-vessel container runtime but, as it is linked against the libunwind.so.8 and libunwind-x86_64.so.8 libraries that are not in the container, it wouldn't run;
  5. I then used LD_LIBRARY_PATH to resolve that library from my host system (along with everything else...) and I suddenly got the same ALSA error output as Proton, which made me think that I was on the right path;
  6. I then found that the issue at step 5. would be solved by either:
    • forcing the SDL2 audio driver to pulseaudio via SDL_AUDIODRIVER=pulseaudio
    • throwing more host libraries in the mix (the ones under /usr/lib64/pulseaudio on my host system), because without them the pulseaudio detection code inside SDL2 seems to fail and then reverts to ALSA.
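The missing-library situation from step 4 can be checked before launching anything, because `ldd` marks unresolved dependencies with `not found`. A minimal sketch, meant to be run from a shell inside the container; the filter name `missing_libs` is hypothetical:

```shell
#!/bin/sh
# Read `ldd` output on stdin and print only the sonames that did not resolve
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example: list unresolved dependencies of the test binary, if it is present
if [ -x ./faudio_tests ]; then
    ldd ./faudio_tests | missing_libs
fi
```

Knowing exactly which sonames are missing makes it possible to supply only those files, instead of adding a whole host directory to the search path.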

I concur that these tests are "tainted" by my wrong assumption about LD_LIBRARY_PATH, however the SDL_AUDIODRIVER=pulseaudio override seems to fix my issue with Proton and xaudio titles, even though I probably got to it more by chance than anything :grimacing:.

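Now, following your advice, I ran that test again inside the container, this time preloading only the two libunwind libraries from the host:

# Container terminal launched with:
# /path/to/SteamLinuxRuntime_soldier/_v2-entry-point \
# --deploy=soldier \
# --suite=soldier \
#  --verb=run  -- xterm -e bash --norc
$ LD_DEBUG=libs \
  LD_PRELOAD="/run/host/usr/lib64/libunwind-x86_64.so.8 /run/host/usr/lib64/libunwind.so.8" \
  LC_ALL=C \
  LD_LIBRARY_PATH=/path/to/Proton\ 5.13/dist/lib64 \
  ./faudio_tests 2>/tmp/faudio.libs
Finished with 1235 successful tests and 0 failed tests.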

If I understand correctly, your Slackware host system is using PulseAudio 9 with some relatively minor Slackware patches. Yes?

Correct.

The soldier runtime is meant to be using the client library from PulseAudio 12.2, with some minor Debian patches. My understanding is that different versions of PulseAudio are meant to be at least broadly client/server compatible, so this should work.

I think so too, otherwise I wouldn't have working audio at all, I guess.

smcv commented 3 years ago

the SDL_AUDIODRIVER=pulseaudio override seems to fix my issue with Proton and xaudio titles

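So, to be completely clear about this:

* Steps to reproduce: Launch a Windows game that uses XAudio 2.7 or later, under Proton 5.13+
* Expected result: they work like they would on Windows, with sound
* Actual result: they crash, as per the original bug report
* Workaround 1: If you run Steam as `SDL_AUDIODRIVER=pulseaudio steam`, with no other workarounds applied, then the games do not crash
* Workaround 2: If you use the `WINEDLLOVERRIDES` in the original bug report, with no other workarounds, then the games also do not crash

Is that all correct?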

I then compiled and built that test suite on my host and tried to launch it inside the pressure-vessel container runtime but, as it is linked against the libunwind.so.8 and libunwind-x86_64.so.8 libraries that are not in the container, it wouldn't run;

That makes sense. To get a test/debug tool that can be run in the soldier runtime, you'd usually want to compile it in a soldier environment (there's an official Docker container available, which should also be compatible with podman, or you could use these older instructions).

# Container terminal launched with:
# /path/to/SteamLinuxRuntime_soldier/_v2-entry-point \
# --deploy=soldier \
# --suite=soldier \
#  --verb=run  -- xterm -e bash --norc
$ LD_DEBUG=libs \
  LD_PRELOAD="/run/host/usr/lib64/libunwind-x86_64.so.8 /run/host/usr/lib64/libunwind.so.8" \
  LC_ALL=C \
  LD_LIBRARY_PATH=/path/to/Proton\ 5.13/dist/lib64 \
  ./faudio_tests 2>/tmp/faudio.libs
Finished with 1235 successful tests and 0 failed tests.

To be clear: you are saying that this test is successful inside the soldier container, this time without having to force SDL_AUDIODRIVER=pulseaudio?

If it is, then that might point to this being a problem at a higher level than FAudio, perhaps involving Proton/Wine.
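The `LD_DEBUG=libs` output written to /tmp/faudio.libs can help confirm which copies of libSDL2 and libpulse were actually used, since it records the path of every library the dynamic linker initializes. A minimal sketch, assuming glibc's `calling init: <path>` log lines; the helper name `loaded_paths` is hypothetical:

```shell
#!/bin/sh
# Read an LD_DEBUG=libs log on stdin and print the unique library paths
# whose path matches the given extended regular expression.
loaded_paths() {
    pattern=$1
    sed -n 's/^.*calling init: //p' | grep -E "$pattern" | sort -u
}

# Example: show where libSDL2 and libpulse were loaded from, if the log exists
if [ -r /tmp/faudio.libs ]; then
    loaded_paths 'libSDL2|libpulse' < /tmp/faudio.libs
fi
```

Paths under /run/host would indicate host libraries, while paths under /usr/lib/x86_64-linux-gnu would indicate the copies shipped with the container runtime.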

smcv commented 3 years ago

Assuming that FAudio does nothing "exotic" when initializing audio devices via SDL2

From the source code, it looks like it does not do anything exotic.

I don't understand how this test program does always correctly detect (and prefer) the PulseAudio devices

I don't think you've actually told me what the result of running that test program is. Does it correctly detect and prefer the PulseAudio devices, when run in the soldier container, with no special workarounds?

(I can prepare a soldier binary for you to run, if you need one.)

414n commented 3 years ago

So, to be completely clear about this:

* Steps to reproduce: Launch a Windows game that uses XAudio 2.7 or later, under Proton 5.13+

* Expected result: they work like they would on Windows, with sound

* Actual result: they crash, as per the original bug report

* Workaround 1: If you run Steam as `SDL_AUDIODRIVER=pulseaudio steam`, with no other workarounds applied, then the games do not crash

* Workaround 2: If you use the `WINEDLLOVERRIDES` in the original bug report, with no other workarounds, then the games also do not crash

Is that all correct?

Almost: in previous tests I applied the SDL_AUDIODRIVER=pulseaudio override only on a per-game basis. Now that you mention it, I tried setting it for the entire steam process while removing it from the command lines of the titles where it was already enforced, and I think I found the underlying issue:

Even though the steam process was launched with SDL_AUDIODRIVER=pulseaudio LC_ALL=C LANG=C PRESSURE_VESSEL_VERBOSE=1 steam &> /tmp/steam.log &, the SDL driver is indeed forced to ALSA via an override that is added while packaging the steam client package for Slackware. I even edited the relevant SlackBuild script recently to build newer steam client packages, but totally failed to notice that the following overrides would be harmful for the current Slackware version while running on pulseaudio:

# Apply changes to the steam script which we need on Slackware:
sed -i -e '/env bash/ a\
# --- Start Slackware mod ---\
export LD_LIBRARY_PATH=/usr/lib/seamonkey\
export LD_PRELOAD='"'"'/usr/$LIB/libasound.so.2'"'"'\
# Audio output goes to first "hw" device of ALSA\
export SDL_AUDIODRIVER=alsa\
#export AUDIODEV=hw\
# On window close, minimize to the system tray area:\
export STEAM_FRAME_FORCE_CLOSE=1\
# Add any custom variable exports here\
[ -f ${HOME}/.steam4slackware ] \&\& . ${HOME}/.steam4slackware\
# --- End Slackware mod ---' $PKG/usr/bin/steam 

Those overrides were indeed needed in Slackware <=14.1, where pulseaudio was not part of the official distribution. I'm still in the middle of testing after commenting out both the LD_PRELOAD and SDL_AUDIODRIVER overrides in that script, but from what I can see so far the issue finally seems to be gone.
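A launcher script can be checked for this kind of override with a simple search. The following is a sketch only: the helper name `find_env_overrides` is hypothetical, and the pattern covers just the three variables discussed in this thread.

```shell
#!/bin/sh
# Print, with line numbers, the active `export` lines in a script that set
# SDL_AUDIODRIVER, LD_PRELOAD or LD_LIBRARY_PATH. Commented-out lines are skipped.
find_env_overrides() {
    grep -nE '^[[:space:]]*export[[:space:]]+(SDL_AUDIODRIVER|LD_PRELOAD|LD_LIBRARY_PATH)=' "$1"
}

# Example: inspect the installed launcher, if there is one
if [ -r /usr/bin/steam ]; then
    find_env_overrides /usr/bin/steam || echo "no overrides found"
fi
```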

That makes sense. To get a test/debug tool that can be run in the soldier runtime, you'd usually want to compile it in a soldier environment (there's an official Docker container available, which should also be compatible with podman, or you could use these older instructions).

That's convenient, thanks for the info!

# Container terminal launched with:
# /path/to/SteamLinuxRuntime_soldier/_v2-entry-point \
# --deploy=soldier \
# --suite=soldier \
#  --verb=run  -- xterm -e bash --norc
$ LD_DEBUG=libs \
  LD_PRELOAD="/run/host/usr/lib64/libunwind-x86_64.so.8 /run/host/usr/lib64/libunwind.so.8" \
  LC_ALL=C \
  LD_LIBRARY_PATH=/path/to/Proton\ 5.13/dist/lib64 \
  ./faudio_tests 2>/tmp/faudio.libs
Finished with 1235 successful tests and 0 failed tests.

To be clear: you are saying that this test is successful inside the soldier container, this time without having to force SDL_AUDIODRIVER=pulseaudio?

Yes.

If it is, then that might point to this being a problem at a higher level than FAudio, perhaps involving Proton/Wine.

If it was all due to those overrides in the steam start script, I guess you're totally right :wink:

I don't understand how this test program does always correctly detect (and prefer) the PulseAudio devices

I don't think you've actually told me what the result of running that test program is. Does it correctly detect and prefer the PulseAudio devices, when run in the soldier container, with no special workarounds?

Yes, sorry if it was not clear. When I ran it inside the container, it always correctly detected and preferred pulseaudio over ALSA.

smcv commented 3 years ago

export SDL_AUDIODRIVER=alsa

That would do it! I'm going to add a check for this to steam-runtime-system-info so that we can detect misconfiguration more easily.

I think we can resolve this as a Slackware/Slackbuild bug: the overrides inserted by the Slackbuild are not consistent with your system configuration.

export LD_LIBRARY_PATH=/usr/lib/seamonkey\
export LD_PRELOAD='"'"'/usr/$LIB/libasound.so.2'"'"'\

/o\

I don't think I want to know what these are/were for... but I wouldn't advise keeping them.

smcv commented 3 years ago

I'm going to add a check for this to steam-runtime-system-info so that we can detect misconfiguration more easily

Done. This missed the boat for today's beta (actually prepared yesterday), but will be in the next beta, whenever that happens, numbered 0.20210128 or later.

I think we can close this issue.

414n commented 3 years ago

export SDL_AUDIODRIVER=alsa

That would do it! I'm going to add a check for this to steam-runtime-system-info so that we can detect misconfiguration more easily.

Would that become a warning inside the report?

I think we can resolve this as a Slackware/Slackbuild bug: the overrides inserted by the Slackbuild are not consistent with your system configuration.

I concur

export LD_LIBRARY_PATH=/usr/lib/seamonkey\
export LD_PRELOAD='"'"'/usr/$LIB/libasound.so.2'"'"'\

/o\

I don't think I want to know what these are/were for... but I wouldn't advise keeping them.

I think the first was for finding libnspr/libnss (which on Slackware are inside the seamonkey package), maybe at a time when the Steam client needed one of them. The libasound preload trick was probably needed to force the Steam client to use ALSA when pulseaudio was not even installed on the system. Anyway, I think they were only needed for the first Linux versions of the Steam client to run on older Slackware versions, so they can be retired now.

As always, thanks for the support!

smcv commented 3 years ago

Would that become a warning inside the report?

Not a warning as such, but the report includes a block for driver-environment, which shows environment variables that are known to influence the choice of (mostly graphics) drivers. For instance, if you set MESA_LOADER_DRIVER_OVERRIDE, that goes into the report, but totally irrelevant environment variables like PS1 and LOGNAME don't.

In future versions we'll include SDL_AUDIODRIVER there, if it's set.
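The idea behind such a block can be sketched in a few lines of shell. This is an illustration only, not the actual steam-runtime-system-info implementation; the helper name `report_driver_env` and its two-entry variable list are assumptions based on the names mentioned above.

```shell
#!/bin/sh
# Print NAME=value for each listed variable that is set in the environment;
# anything not on the list (PS1, LOGNAME, ...) is never reported.
report_driver_env() {
    for name in MESA_LOADER_DRIVER_OVERRIDE SDL_AUDIODRIVER; do
        # printenv exits non-zero for variables that are not exported
        if value=$(printenv "$name"); then
            printf '%s=%s\n' "$name" "$value"
        fi
    done
}

# Example: report whichever of the listed variables are currently set
report_driver_env
```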