libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License
10.11k stars 1.85k forks source link

SDL audio callback uses wrong buffer size after WASAPI device change #11122

Open StephenCWills opened 1 month ago

StephenCWills commented 1 month ago

We recently received a report from a user who has encountered audio distortion and app crashes as a result of plugging/unplugging his headphones. He appears to have a somewhat rare audio setup where his speakers and headphones are connected as separate playback devices with different default device periods. As a result, the WASAPI driver assigns this->spec.samples = 236 for the speakers, and this->spec.samples = 221 for the headphones. This can be observed in the logs from our game client that he provided to us in the issue report.

https://github.com/diasurgical/devilutionX/issues/7452#issuecomment-2395613317 Speakers: VERBOSE: Aulib sampleRate=22050 channels=2 frameSize=236 format=0x8120 Headphones: VERBOSE: Aulib sampleRate=22050 channels=2 frameSize=221 format=0x8120

This version of our application uses SDL release version 2.30.5.


In SDL_RunAudio(), the callback is invoked with data and data_len parameters. udata is irrelevant in our case. https://github.com/libsdl-org/SDL/blob/2eef7ca475decd2b864214cdbfe72b143b16d459/src/audio/SDL_audio.c#L736

data is the return value of WASAPI_GetDeviceBuf(). https://github.com/libsdl-org/SDL/blob/2eef7ca475decd2b864214cdbfe72b143b16d459/src/audio/SDL_audio.c#L707

data_len is assigned the value from device->callbackspec.size. https://github.com/libsdl-org/SDL/blob/2eef7ca475decd2b864214cdbfe72b143b16d459/src/audio/SDL_audio.c#L703

When the WASAPI driver updates the number of samples in the audio spec, it calls SDL_CalculateAudioSpec(&this->spec) to update device->spec.size. But that does nothing to device->callbackspec. https://github.com/libsdl-org/SDL/blob/2eef7ca475decd2b864214cdbfe72b143b16d459/src/audio/wasapi/SDL_wasapi.c#L486-L500

WASAPI_GetDeviceBuf() uses this->spec.samples to allocate the buffer for audio samples. https://github.com/libsdl-org/SDL/blob/2eef7ca475decd2b864214cdbfe72b143b16d459/src/audio/wasapi/SDL_wasapi.c#L183

I believe that if the channels, format, or freq are also updated, then SDL will create an audio stream to resample internally. But if only the number of samples changes, then it will "use the existing audio stream" which means that nothing actually changes. Even if the stream was created, I have doubts about whether the callback would be providing the right number of samples to the resampler. https://github.com/libsdl-org/SDL/blob/2eef7ca475decd2b864214cdbfe72b143b16d459/src/audio/wasapi/SDL_wasapi.c#L82-L93

Therefore, unless I've missed something, after the device change the callback is made to fill a 221-sample buffer with 236 samples or vice versa.

icculus commented 3 weeks ago

(This was an excellent bug report...thank you for being so thorough!)

This is SDL2-specifc, even though it landed in the 3.2.0 milestone, fwiw. All this code was almost completely replaced in SDL3.

I believe this should be fixed by 5b0e838a7433daf0d44aff10574791cced63113e, but I don't have a way to test this here to verify. If this turned out to not fix the problem, please report back and I'll reopen this issue!

StephenCWills commented 2 days ago

Our user updated to SDL 2.30.9 and reported his game was still crashing when plugging/unplugging his headphones. We ended up sending him a debug build to get a stack trace from ASAN. From what I can tell, the trace indicates that this issue is not fully resolved. Here is what I consider to be the relevant parts from the log. The full log is attached at the bottom.

VERBOSE: Aulib sampleRate=22050 channels=2 frameSize=236 format=0x8120
...
VERBOSE: Unhandled SDL event: SDL_AUDIODEVICEADDED 0
VERBOSE: Unhandled SDL event: SDL_AUDIODEVICEADDED 0
...
=================================================================
==1848==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x130a21cd2768 at pc 0x7ff800fe6774 bp 0x00fa411ff1b0 sp 0x00fa411fe940
WRITE of size 1888 at 0x130a21cd2768 thread T11
#0 0x7ff800fe6773 in memcpy D:\a\_work\1\s\src\vctools\asan\llvm\compiler-rt\lib\sanitizer_common\sanitizer_common_interceptors_memintrinsics.inc:121
#1 0x7ff611feb9e3 in Aulib::floatToFloatLSB T:\staphen\Projects\devilutionX-1.5\build\x64-Debug\_deps\sdl_audiolib-src\src\sampleconv.cpp:127
#2 0x7ff611ff058d in Aulib::Stream_priv::fSdlCallbackImpl T:\staphen\Projects\devilutionX-1.5\build\x64-Debug\_deps\sdl_audiolib-src\src\stream_p.cpp:224
#3 0x7ff611fd45a3 in sdlCallback T:\staphen\Projects\devilutionX-1.5\build\x64-Debug\_deps\sdl_audiolib-src\src\aulib.cpp:25
...

From the stack trace, these are the relevant lines from SDL_audiolib.

Plus, to show how this ties into the SDL audio system, here is the line where SDL_audiolib assigns the sdlCallback function to the audio spec. So SDL should be calling #3 directly from SDL_RunAudio(). https://github.com/realnc/SDL_audiolib/blob/cc1bb6af8d4cf5e200259072bde1edd1c8c5137e/src/aulib.cpp#L49


A basic review of the callback function in stream_p.cpp suggests to me that these lines are the most relevant to understanding how the buffer overflow is happening.

https://github.com/realnc/SDL_audiolib/blob/cc1bb6af8d4cf5e200259072bde1edd1c8c5137e/src/stream_p.cpp#L98-L105

    const int out_len_samples = outLen / (SDL_AUDIO_BITSIZE(fAudioSpec.format) / 8);
    const int out_len_frames = out_len_samples / fAudioSpec.channels;

    if (fStrmBuf.size() != out_len_samples) {
        fFinalMixBuf.reset(out_len_samples);
        fStrmBuf.reset(out_len_samples);
        fProcessorBuf.reset(out_len_samples);
    }

outLen is the callback parameter that receives the value of data_len from SDL, so this callback handler is computing the number of audio samples it needs to produce based on the buffer size passed by SDL. If the buffer it has is not the right size, as would be the case when the outLen parameter changes between calls, it will call fFinalMixBuf.reset() which simply allocates a new buffer of the correct size. The result should be a buffer which can be memcpy'd into the out parameter without overflow, but ASAN indicates that it's not.

Frankly, I'm a bit stumped. The fix from icculus looks to me like it would have resolved this completely, so I'm hoping someone else notices something that I don't.


Full log: stderr.log

AJenbo commented 1 day ago

It's maybe worth noting that the fix that was applied did resolve the audio corruption he experienced. (it crashed on one, and corrupted on the other but I don't remember which was which)

StephenCWills commented 1 day ago

Actually, I managed to find an audio device with device->spec.samples == 224 to test with, and I was able to reproduce the issue. I think it's just due to order of events.

https://github.com/libsdl-org/SDL/blob/c98c4fbff6d8f3016a3ce6685bf8f43433c3efcc/src/audio/SDL_audio.c#L701-L707

You can see here that we grab device->callbackspec.size into data_len before calling current_audio.impl.GetDeviceBuf(). However, if you look at the call stack for UpdateAudioStream()...

image

You can see that WASAPI_GetDeviceBuf() is triggering the code that updates device->callbackspec.size.

EDIT: Also, I moved that line data_len = device->callbackspec.size; down so it was just before the comment /* !!! FIXME: this should be LockDevice. */ and I can't reproduce the issue anymore.