ValveSoftware / steam-for-linux

Issue tracking for the Steam for Linux beta client

Crashes when using steam voice/microphone #1853

Closed Dubanubiel closed 11 years ago

Dubanubiel commented 11 years ago

Ever since the official launch of Steam for Linux, TF2 crashes when I hit the microphone button. It usually works the first couple of times, but then it freezes up, sort of catches itself, becomes REALLY slow and buggy, and then won't close unless I turn off my computer. I can Alt+Tab away and shut down, but I can't even kill TF2 using System Monitor.

Processor Information:
Vendor: AuthenticAMD
Speed: 2800 Mhz
2 logical processors
2 physical processors
HyperThreading: Unsupported
FCMOV: Supported
SSE2: Supported
SSE3: Supported
SSSE3: Supported
SSE4a: Supported
SSE41: Unsupported
SSE42: Unsupported

Network Information:
Network Speed:

Operating System Version:
Ubuntu 12.10 (32 bit)
Kernel Name: Linux
Kernel Version: 3.5.0-23-generic
X Server Vendor: The X.Org Foundation
X Server Release: 11300000
X Window Manager: Compiz
Steam Runtime Version:

Video Card:
Driver: NVIDIA Corporation GeForce 9800 GTX+/PCIe/SSE2/3DNOW!
Driver Version:  3.3.0 NVIDIA 304.43
Desktop Color Depth: 24 bits per pixel
Monitor Refresh Rate: 59 Hz
VendorID:  0x10de
DeviceID:  0x613
Number of Monitors:  1
Number of Logical Video Cards:  1
Primary Display Resolution:  1440 x 900
Desktop Resolution: 1440 x 900
Primary Display Size: 16.06" x 10.04"  (18.94" diag)
                                        40.8cm x 25.5cm  (48.1cm diag)
Primary Bus: PCI Express 16x
Primary VRAM: 512 MB
Supported MSAA Modes:  2x 4x 8x 16x 

Sound card:
Audio device: Realtek ALC889A

Memory:
RAM: 4036 Mb

Miscellaneous:
UI Language: English
LANG: en_CA.UTF-8
Microphone: Not set
Total Hard Disk Space Available: 202004 Mb
Largest Free Hard Disk Block: 123696 Mb

Installed software:

Recent Failure Reports: Sun Feb 17 01:29:17 2013 GMT: file ''/tmp/dumps/assert_20130216202905_1.dmp'', upload yes: ''CrashID=bp-913d3f0f-27b9-4711-98fc-4f1712130216''

doug65536 commented 11 years ago

Oh come on. The reason we never get anything fixed, and the reason Linux audio is complete crap, is everyone blaming the "other" components. My other programs that use pulseaudio don't crash, so why would TF2 crash? It's not pulseaudio anyway: I removed pulseaudio completely from my system and used ALSA, and it still happens.

doug65536 commented 11 years ago

If you are lazy, please don't report that you have a "fix". TF2 forums are plagued by people with false "fixes". I'm expecting someone to say "I stuck my finger up my nose and it doesn't crash! Pick your nose while it loads and it solves the issue. Oh, and reinstall." Seems like everyone tries one thing, tests it, sees that it doesn't crash exactly the same way, feels good about how smart they are, and posts false information to the forum. :(

DerRidda commented 11 years ago

While I didn't try removing PA completely I tried using ALSA by simply selecting it in Steam quite often and the crashes still happen. How would that be different to using pure ALSA without having PA installed anyway? Honest question. Judging from @doug65536's statement it seems like it was a false lead anyway.

Doesn't Miles support Linux through OpenAL only? In that case how could there be a difference between PA and ALSA as OpenAL would still be involved? Which version of Miles is Steam using? Because browsing through this (http://www.radgametools.com/msshist.htm) reveals some OpenAL specific improvements made late last year and a whole bunch of general crash fixes over the last few years.

Is there any known setup on which this never ever happens?

gdrewb-valve commented 11 years ago

There are user reports, such as adaricmar above, that say that removing pulseaudio fixed their crashes, so why would it be a false lead? It doesn't help everybody but that it helps some people seems like interesting information.

In the particular path where the crash is happening, Steam is calling OpenAL directly; it isn't going through Miles. That may be a red herring, since the problem may be something completely different that occurred in the past, but as a starting point it says that something in OpenAL's state is messed up.

MrPopinjay commented 11 years ago

There are user reports, such as adaricmar above, that say that removing pulseaudio fixed their crashes, so why would it be a false lead?

Yet it does not fix the problem for all users so it's clearly not the entire issue here.

gdrewb-valve commented 11 years ago

Does that make it uninteresting? It's the only lead we've gotten so far so shouldn't we follow it?

MrPopinjay commented 11 years ago

Just pointing out that it's not the whole issue. It certainly proves that this is not a PulseAudio bug, as someone suggested previously. :)

gdrewb-valve commented 11 years ago

It doesn't actually prove that it's not a pulseaudio problem as it may indeed be a pulseaudio problem for some people and there may be other issues that other people are hitting. It could easily be something entirely unrelated to pulseaudio. As we know essentially nothing about the cause of the crash it's hard to say anything definitively.

A good step would be for the people behind OpenAL to take a look since the crash is in their code so they're best positioned to comment on the immediate cause of the crash (possibly not the actual original reason, just what is broken that was the immediate culprit).

DerRidda commented 11 years ago

Is the issue also appearing with OpenAL 1.15? Ubuntu's is stuck on a version that's out of date and it seems like raring is on the same version. What do the crash reports say? I'm really interested in what setups can't repro this issue at all.

gdrewb-valve commented 11 years ago

I spot-checked four crashes and they were all from OpenAL 1.13 and mostly from Ubuntu, although there was one instance on Arch.

DerRidda commented 11 years ago

Might be worth testing it with a fresh snapshot of OpenAL then. With some luck the issue might actually be in OpenAL and have already been dealt with in 1.15.

I assume I have to build for i386 for Steam?

gdrewb-valve commented 11 years ago

Correct, Steam is 32-bit.

DerRidda commented 11 years ago

Seems like it's still happening. Can you verify through these crash reports that Steam actually used 1.15?

Fri Apr 19 14:45:59 2013 GMT: file ''/tmp/dumps/crash_20130419164546_1.dmp'', upload yes: ''CrashID=bp-9d4fccab-249e-4149-9158-91e2c2130419''
Fri Apr 19 14:46:10 2013 GMT: file ''/tmp/dumps/assert_20130419164559_1.dmp'', upload yes: ''CrashID=bp-8f31cafd-c1a1-4c02-87a9-ce4652130419''
Fri Apr 19 14:46:17 2013 GMT: file ''/tmp/dumps/assert_20130419164609_2.dmp'', upload yes: ''CrashID=bp-6ad9ba1e-b6c4-4b52-a4f5-ca3bc2130419''

gdrewb-valve commented 11 years ago

I'll check the crashes, but can you catch the crash in gdb and get a stack with your build of OpenAL so that we can see exactly where it's crashing there? Are you familiar enough with gdb to do some basic debugging at the time of the crash?

DerRidda commented 11 years ago

I will do that. I have used gdb once before, though a link to a basic 101 couldn't hurt.

DerRidda commented 11 years ago

I'm currently testing and a bit concerned. I don't feel like joining a VAC secured server with gdb attached to steam to do my usual repro. Is that a justified concern? Can I repro the issue with just steam friends voice chat? Is that the same?

BHSPitMonkey commented 11 years ago

If you can reproduce the crash just by using Steam voice chat, that would be insightful information. I haven't heard of the issue affecting Steam chat, though.

DerRidda commented 11 years ago

I don't think it's happening with just Steam voice chat: I had a chatty one going for over half an hour and it didn't happen. And I'm not going to test this out in-game unless I know what VAC thinks about finding gdb attached to the Steam client.

gdrewb-valve commented 11 years ago

Crash bp-9d4fccab-249e-4149-9158-91e2c2130419 is still showing libopenal.so.1.13.0. How did you install your new build? If you look for the libopenal.so.1 in /usr/lib/i386-linux-gnu, it should be a link to your new libopenal.so.1.15.0. If it isn't, try updating it and rerun.
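The symlink check described above can be sketched in a few lines of Python. This is only an illustration: the helper name `resolve_library` is made up here, and the directory to inspect is a parameter because the copy of OpenAL that Steam actually loads may live somewhere other than the system library directory.

```python
import os
import tempfile


def resolve_library(lib_dir, soname="libopenal.so.1"):
    """Return the file name that `soname` in `lib_dir` resolves to, or None if it is missing."""
    path = os.path.join(lib_dir, soname)
    if not os.path.lexists(path):
        return None
    # realpath follows the symlink chain; only the final file name is of interest.
    return os.path.basename(os.path.realpath(path))


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so nothing on the real system is touched.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "libopenal.so.1.15.0"), "w").close()
        os.symlink("libopenal.so.1.15.0", os.path.join(tmp, "libopenal.so.1"))
        print(resolve_library(tmp))
```

On a real system one would call `resolve_library("/usr/lib/i386-linux-gnu")` and compare the result against the version that was just built.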

You should be OK running Steam under gdb. If you do catch it under gdb, do 'bt' to get a stack trace. Assuming it finds your symbols properly, that should show you the file and source line in your OpenAL source to see exactly where it hit the problem. In the 1.13 crash sent up it's most likely a call to malloc or free, so there may be some system and libc frames before you get to the OpenAL frame.

DerRidda commented 11 years ago

@gdrewb-valve I had suspected that the old libs were still loaded, but 1.13? That is odd. I'm running Quantal, which has 1.14 in its repos, and checking my backed-up OpenAL libs I'm positive that I didn't have 1.13 installed.

Ok, I just checked the steam-runtime folder and there is OpenAL 1.13 in there. That would also explain why Arch users are seeing 1.13 crashes on their cutting edge distro. I will proceed and replace that lib instead.

Update: Here is another crash report with OpenAL replaced in steam-runtime. I didn't debug yet; this was just a test run.

Fri Apr 19 20:33:06 2013 GMT: file ''/tmp/dumps/crash_20130419223300_1.dmp'', upload yes: ''CrashID=bp-6ce5a6ae-8f1d-437f-b21c-727af2130419''
Fri Apr 19 20:33:11 2013 GMT: file ''/tmp/dumps/assert_20130419223306_1.dmp'', upload yes: ''CrashID=bp-6a8e35c7-d1b9-4426-b2d8-319922130419''
Fri Apr 19 20:33:11 2013 GMT: file ''/tmp/dumps/assert_20130419223306_1.dmp'', upload yes: ''CrashID=bp-f341e850-7d04-4771-89ab-5ff962130419''

DerRidda commented 11 years ago

Here is a gist with the bt from my latest repro of the crash: https://gist.github.com/DerRidda/5423273

flibitijibibo is the person I asked to make me a clean 32-bit build. I still have gdb and Steam open at that precise point, if there is anything more I can do with this.

gdrewb-valve commented 11 years ago

Thanks, that's good information. Steam is calling OpenAL which is calling through alsa and getting into pulseaudio. pulseaudio appears to be doing an allocation.

Was there any output prior to what you put in the gist? It looks like the C runtime detected heap corruption at the time of pulseaudio's alloc and is killing the process. The unfortunate thing is that the heap could have been corrupted at any time earlier and it's just showing up now. Sometimes you'll get a piece of output indicating the address of the corrupted block, which would be a little useful. After that there isn't much to extract; corruptions like this usually require complex debugging to catch when the corruption occurs. That means somebody who can track the corruption (Valve, unless somebody in the community is motivated and knows how to do it) will need to repro this.

Thanks again for capturing this and I'm sorry I don't have a quick solution.

DerRidda commented 11 years ago

Here are the 3 lines between me killing CS:S - as it didn't recover from the client crash while gdb was still on - and actually calling bt.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xe7908b40 (LWP 5350)]
0xf7789425 in __kernel_vsyscall ()

Before that there are only normal New thread / Thread exited messages.

gdrewb-valve commented 11 years ago

OK, there's no extra information. Unfortunately, we're back to the same place: we need to reproduce this in controlled conditions where we can track what's happening leading up to the problem. We'll keep working on that.

Thanks for trying it out.

gdrewb-valve commented 11 years ago

We're going to add a debugging option in the next client beta that will help us take a small step further here (but it will not be conclusive, to be clear).

doug65536 commented 11 years ago

Actually, I am guilty of the same thing I complained about. I can confirm that with pulseaudio removed from my system, I don't get the crash. However, there is another issue with that: I have to go into steam settings every boot (once) and click "detect audio devices" OR change the device and change it back, to get the microphone to work in TF2. I had hoped that setting my USB microphone with soundrc.conf would have solved it, allowing "default" to work. This was successful but I still have to detect or change device (and change it back) before voice works. (Ubuntu 64-bit 12.10 - logitech desk microphone (AK5370), and RealTek ALC889 bigbang xpower X58 chipset motherboard audio (intel-hd-audio))

doug65536 commented 11 years ago

Just want to clarify since I made it confusing in my previous post: removing pulseaudio DOES fix the voice hang. I use voice a lot and I play for many hours some days so I am quite sure that pulseaudio is involved in the hangs. Sorry for adding confusion. It seems like a memory corruption because it (usually) shows "unable to load model" error dialog box if you let the hang sit for a while.

gdrewb-valve commented 11 years ago

Thanks, that does build evidence that pulseaudio is involved somehow. The next beta client will give us one more piece of data when it comes out and that may get us another bit closer.

gdrewb-valve commented 11 years ago

If you have updated to the latest beta client from today, here's another thing to try. First, see if the problem still happens. If so, quit Steam completely and launch with the environment variable STEAM_OPENAL_SKIP_CAPTURE=1 set. This skips Steam's call to alcCaptureSamples and instead fills Steam's buffer with zero. You will not have working voice since no audio samples are being captured, but if the crash does not occur it makes it extremely likely that the problem lies in the audio stack below Steam. If the crash does occur then it's not the sample capturing and is more likely a Steam bug.

If you try this and your voice is working (other people can hear you) then the environment variable is not having an effect, so make sure that you are heard in the crash case and are not heard after setting the environment variable.
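The reasoning behind this experiment can be written down as a small decision function. This is a sketch of the logic in the two paragraphs above, not anything Steam ships; the function name and the result strings are invented for illustration.

```python
def interpret_skip_capture_run(others_heard_you, crashed):
    """Map one test run with STEAM_OPENAL_SKIP_CAPTURE=1 set to the conclusion drawn in the thread."""
    if others_heard_you:
        # Voice still works, so the variable was not picked up and the run says nothing.
        return "invalid run: the variable had no effect"
    if crashed:
        # Sample capture was skipped and it still crashed.
        return "not the sample capturing: more likely a Steam bug"
    # Sample capture was skipped and the crash went away.
    return "problem likely lies in the audio stack below Steam"
```

A single crash-free run is weak evidence on its own, since the time until the crash appears varies a lot between sessions.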

BHSPitMonkey commented 11 years ago

I was hoping to try this, but realized that I can't click anything on the main menu in the Beta client. All my mouse input is just ignored. Never had this issue in the regular client.

gdrewb-valve commented 11 years ago

@BHSPitMonkey, you might be seeing some of the window-manager incompatibilities that are open here, what WM are you using?

DerRidda commented 11 years ago

@gdrewb-valve:

Crash still happens after Apr. 24 Update with default settings, see:

Thu Apr 25 20:28:21 2013 GMT: file ''/tmp/dumps/crash_20130425222815_1.dmp'', upload yes: ''CrashID=bp-a1ce2f27-168d-4e9a-b281-9e14c2130425''
Thu Apr 25 20:28:25 2013 GMT: file ''/tmp/dumps/assert_20130425222821_1.dmp'', upload yes: ''CrashID=bp-d7f9a579-839c-4430-a9a6-e5fe22130425''
Thu Apr 25 20:28:49 2013 GMT: file ''/tmp/dumps/assert_20130425222845_2.dmp'', upload yes: ''CrashID=bp-2c5843c6-2848-4a59-b93f-470142130425''
Thu Apr 25 20:29:36 2013 GMT: file ''/tmp/dumps/assert_20130425222928_3.dmp'', upload yes: ''CrashID=bp-80245284-1af8-4578-bd08-ed7052130425''

And it also happens with environment variable set:

Thu Apr 25 20:54:45 2013 GMT: file ''/tmp/dumps/crash_20130425225434_1.dmp'', upload yes: ''CrashID=bp-997e1106-1d62-4a3a-8503-834aa2130425''
Thu Apr 25 20:54:48 2013 GMT: file ''/tmp/dumps/assert_20130425225445_1.dmp'', upload yes: ''CrashID=bp-6d9e2db2-1125-4405-a8ea-0b9542130425''

NothingMuchHereToSay commented 11 years ago

Confirmed, still crashes on me with voice, but doesn't with that set that @gdrewb-valve posted.

gdrewb-valve commented 11 years ago

Interesting, so people are having different results with the envvar? The stack from DerRidda is messy but if parts of it are to be believed things are still crashing in pulse, just along a different code path that the new envvar doesn't affect, so it doesn't add much info. However, if things no longer crash for NothingMuchHereToSay with the envvar set it is having at least some effect. We'll probably need to try and get the pulse people involved to look from their side.

Thanks for trying the experiment.

DerRidda commented 11 years ago

I'm skeptical if @NothingMuchHereToSay actually tested for long enough and would highly recommend he test again for a prolonged time. Keep in mind you actually have to keep using voice chat just as much as you would if it was still enabled and that the time it takes for it to happen is highly variable.

NothingMuchHereToSay commented 11 years ago

@DerRidda It all depends on how often the server communicates, but I thought @gdrewb-valve said that voice chat isn't possible with that environmental (or whatever that is) set. Then again, I'm not sure how to properly set it, do I put "steam" at the end of the command or do I have to set it before I launch steam?

gdrewb-valve commented 11 years ago

You can (and should) still use voice chat, it's just that the microphone data will not be picked up so nobody will hear anything you say.

DerRidda commented 11 years ago

@NothingMuchHereToSay Just open a terminal window and enter "STEAM_OPENAL_SKIP_CAPTURE=1 steam" without the quotation marks of course. To test if it worked go into the Steam settings voice tab and test the microphone, you shouldn't hear a thing and the level meter should also not be moving. When testing your microphone in your operating system's audio settings it should work as intended, though. After that just start playing CS:S or another Source game online and make liberal use of push to talk.

To not feel like a complete fool while talking to yourself I recommend making snarky comments about other players that would normally get you kicked. ;)
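For anyone who prefers a script to typing the command each time, the same launch can be sketched in Python. It assumes `steam` is on the PATH; the helper names `skip_capture_env` and `launch_steam` are made up for this example.

```python
import os
import subprocess


def skip_capture_env(base_env=None):
    """Return a copy of the environment with STEAM_OPENAL_SKIP_CAPTURE=1 added."""
    env = dict(os.environ if base_env is None else base_env)
    env["STEAM_OPENAL_SKIP_CAPTURE"] = "1"
    return env


def launch_steam(command=("steam",)):
    """Start Steam with the variable set (assumes `steam` is on PATH)."""
    return subprocess.Popen(list(command), env=skip_capture_env())
```

The variable has to be in the environment of the Steam process itself, which is why Steam must be quit completely before relaunching.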

NothingMuchHereToSay commented 11 years ago

@DerRidda Honestly, after spending about an hour of talking on Turbine with the "STEAM_OPENAL_SKIP_CAPTURE=1 steam" option up, I haven't experienced a crash. I don't know what it could be other than maybe Pulseaudio, but I can't live without it, as without Pulseaudio, ALSA supports only one stream out of my speakers. Which is bad for multitaskers that use Teamspeak to communicate.

EDIT: Just to let you know that nobody could hear me, so I know that the envvar was set.

Nemoder commented 11 years ago

On Debian, in /etc/pulse/daemon.conf I set enable-shm = no and Steam no longer crashes; only the mic stops working until Steam is restarted. Not great, but still a vast improvement over having games completely freeze on me.
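To confirm that the setting is really active and not still commented out, the file can be checked with a short script. This is a rough sketch: it assumes the usual `key = value` layout with `;` or `#` starting a comment line, and it takes the last uncommented assignment as the effective one, which is an assumption about how the daemon resolves duplicates. The helper name `read_pulse_option` is hypothetical.

```python
def read_pulse_option(conf_text, key):
    """Return the last uncommented value of `key` in daemon.conf-style text, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in ";#":
            continue  # skip blank lines and comment lines
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()
    return value


if __name__ == "__main__":
    with open("/etc/pulse/daemon.conf") as conf:
        print(read_pulse_option(conf.read(), "enable-shm"))
```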

BHSPitMonkey commented 11 years ago

Interesting; I'll try that tonight for a while and report back. Does something clue you in to the fact that your mic doesn't work anymore once the event happens? Or do you just wait for someone to say "Nemoder, we can't hear you"? :)

NothingMuchHereToSay commented 11 years ago

@Nemoder Still crashes for me dude.

Nemoder commented 11 years ago

@BHSPitMonkey No indication other than it becomes apparent pretty quickly in a game like Guns of Icarus that your team can't hear you. If I tab out and look at the terminal I started steam in I'll see something like: AL lib: pulseaudio.c:588: pa_context_new() failed

DerRidda commented 11 years ago

@gdrewb-valve: Bear with me on this one, please. At best the following would deliver a workaround and do nothing to actually fix the issue.

How exactly does the push-to-talk option work? Something like this? On button press: Open a new stream; Keep that stream open while button is down; On release: close stream. What I'm asking is: Does a new "context" get created every time push-to-talk is triggered? Because that seems to be the point at which the crash happens all the time, you press the button and it feels like a kill-switch. I never had this happen in the middle of an on-going p2t event. (Unless the rest of the people still following this issue can claim otherwise.)

If the above is somewhat accurate, I would like to propose a new experiment (if it's doable): add an option to Steam that changes the behavior of p2t so that when a server is joined the client opens a new stream/creates a new context that stays open all the time but is not transmitted to the server, and treats the push-to-talk event as a trigger for the transmission to the server only.

I hope my idea is clear. If the above assumptions are correct, once there is a good stream/context, it seems like this won't cause any trouble (at least regarding this issue, no idea if this won't open a whole new can of worms) and since the issue is afaik only happening after several push-to-talk events the first stream/context could be considered an "almost guaranteed good free shot". So if it doesn't cause any trouble, hold onto this stream/context and keep using it once it's established thus delivering a workaround that makes voice-chat usable for Linux users while the actual issue is being investigated further.

If my assumptions are too off the mark, sorry... look at all the letters that are totally useless! ;)
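The proposal above can be modeled in a few lines to make the idea concrete. This is purely a toy model of the suggestion, not how Steam is implemented; the class name and the `read_samples` and `send` callbacks are stand-ins invented for the example.

```python
class AlwaysOpenVoice:
    """Hypothetical model: one capture stream per server session, gated by push-to-talk."""

    def __init__(self, read_samples, send):
        self.read_samples = read_samples  # stand-in for the capture call
        self.send = send                  # stand-in for transmission to the server
        self.streams_opened = 1           # the stream is opened once, on join
        self.talking = False

    def set_button(self, down):
        # Pressing or releasing the button only flips a flag; no stream is created.
        self.talking = down

    def tick(self):
        samples = self.read_samples()     # capture runs on every tick
        if self.talking:
            self.send(samples)            # only transmission is gated
```

The point of the design is that the button press never touches stream or context setup, so whatever goes wrong at that moment could not be caused by creating a new one.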

gdrewb-valve commented 11 years ago

Push-to-talk is relatively simple in its control: it just skips work if the push-to-talk button is not down. If you don't have push-to-talk, things are live all of the time. The voice device is not changed at all. It's theoretically possible to recreate the device when the button goes down, but this would be complicated in our code as nothing is set up to allow that.

I think you might see crashes when pressing the push-to-talk button because something goes wrong and the process is messed up, but you don't go down the voice path until you push the button. At that point the voice system starts getting audio samples and you hit the problem that had been lying there latent. Just a theory.

gdrewb-valve commented 11 years ago

I sent a request for assistance to pulseaudio's bug reporting alias. So far no response.

DerRidda commented 11 years ago

@Nemoder: I also didn't have any luck with disabling shared memory in that config file, still the very same behavior for me. Did you play around with any other option in the pulse config files by any chance?

BHSPitMonkey commented 11 years ago

@gdrewb-valve Even with the underlying cause of the audio/pulse failure unknown, shouldn't it be possible with the submitted crash reports to at least prevent the client from crashing when the situation arises? My impression is that PulseAudio or OpenAL is behaving in some way that the client isn't prepared for, and that it may be possible to simply bolster the client by sanity-checking what's coming in at the point of failure (and just printing some kind of error to the game console: "Error: OpenAL did something bad!"). It would be better to just lose voice communication than to get pulled out of the game entirely.

That said, things could certainly be more complicated than my first impression, and the above could be easier said than done.

gdrewb-valve commented 11 years ago

Unfortunately it isn't the kind of problem that can be handled. Process state gets corrupted at some point we aren't sure of and there's no way to recover from it.

BotoX commented 11 years ago

The bug is still happening when running the Steam beta with STEAM_OPENAL_SKIP_CAPTURE=1 steam. I spammed my mic button nonstop while playing and it happened after 10 minutes. Going to play now without touching the mic button.