libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License
10.23k stars 1.87k forks source link

testautomation --filter audio_resampleLoss fails #8036

Closed slouken closed 1 year ago

slouken commented 1 year ago

We may have an audio resampling quality regression here, can you look at this for 2.28.2?

icculus commented 1 year ago

There's definitely something going on in SDL3 that I'm tracking down after the audio rewrite merges, but did this code change for SDL2 recently too? I'll check.

slouken commented 1 year ago

Ah, it looks like this is only broken in SDL3.

slouken commented 1 year ago

Can you urgently look at this after the audio rewrite merge? I believe this will affect DOTA.

icculus commented 1 year ago

Yeah, I'll take care of that.

slouken commented 1 year ago

@icculus, this is still failing after your merge. Can you please take a look?

icculus commented 1 year ago

So after tracking this down, this failure landed in e5a6c24c822e11c9a99f10adad49f904b8170562, when the resampler changed dramatically, which isn't surprising.

The bigger problem is not this test, though, it's the crackling you can hear in test/testaudiostreamdynamicresample.c, when you slow the playback down with downsampling (drag the slider to the left)...upsampling might also do it, but slowing the thing way down makes it way easier to hear it over the playback. My guess is this what's also upsetting testautomation, but it is upsetting my human ear in any case, so I'll be digging around to figure out the problem on this as my next task. I'll have to dig around in the original commits before we squashed it down, if they still exist, because I'm sure this wasn't in there for most of the development work.

But I'll start by seeing exactly what the test is outputting; if there's an obvious crackle in it, then it's the same issue for sure.

madebr commented 1 year ago

I ran a diff between the resampled audio generated before and after e5a6c24, and found out the audio is identical for the first chunk, but is different for the 2nd chunk (and next). https://github.com/libsdl-org/SDL/blob/dabd45997eb681ea216957834a8debf4cf9d928d/src/audio/SDL_audiocvt.c#L1051-L1055

(when comparing the resampled audio for this test spec)

icculus commented 1 year ago

The click is gone as of 91cd5478be5aaf684636d64662c4a63acb09afdc, but I'm still failing the resampleLoss test, so I'll look more closely into that next.

icculus commented 1 year ago

So here's the problem:

image

...the top waveform is the resampled one, the next one down is the ideal version we generate on the fly; over time you can see the waveform is shifting a little, but otherwise is producing what appears to be a correct waveform, but this test only cares about the difference between samples at the same point, so it sees this as significant noise. I'll see if I can improve this, but I'm not sure if this is actually a serious problem in reality.

The bottom two is the other test, which is passing; not sure yet if the difference is the lower sample rate, or that it is resampling to an integer multiple, or that one is upsampling and the other downsampling.

icculus commented 1 year ago

not sure yet if the difference is the lower sample rate, or that it is resampling to an integer multiple, or that one is upsampling and the other downsampling.

Downsampling has similar result. I'm assuming it's the integer multiple that made the difference here but I'll check.

image

icculus commented 1 year ago

Integer multiples looks pretty good.

Downsample:

image

Upsample:

image

So it's just a question of where we choose to sample from, I guess.

madebr commented 1 year ago

Is it expected that the output float audio values lie outside the [-1.f, 1.f] range?

icculus commented 1 year ago

They shouldn't, but the resampler will give you garbage output for garbage input without clamping.

Is there somewhere we're giving values outside that range? It's not apparent to me in the screenshots, but I haven't looked at this sample by sample yet.

madebr commented 1 year ago

I wrote the following little thing to save the input and output to a wav file. It should also build with SDL3 from e5a6c24c822e11c9a99f10adad49f904b8170562 (and before)

write to wavfile ```patch diff --git a/test/testautomation_audio.c b/test/testautomation_audio.c index a0598fdc5..3fb53a7da 100644 --- a/test/testautomation_audio.c +++ b/test/testautomation_audio.c @@ -2,6 +2,7 @@ * Original code: automated SDL audio test written by Edgar Simo "bobbens" * New/updated tests: aschiffler at ferzkopp dot net */ +#define SDL_ENABLE_OLD_NAMES /* quiet windows compiler warnings */ #if defined(_MSC_VER) && !defined(_CRT_SECURE_NO_WARNINGS) @@ -771,6 +772,50 @@ static double sine_wave_sample(const Sint64 idx, const Sint64 rate, const Sint64 return SDL_sin(((double)(idx * freq % rate)) / ((double)rate) * (SDL_PI_D * 2) + phase); } +#include +void save_wave(SDL_RWops *dst, void *samples, size_t count_samples, int rate, int format) { + int channels = 1; + int bits_per_sample = 16; + int chunk2_size = count_samples * channels * bits_per_sample / 8; + + /* RIFF header */ + SDL_WriteBE32(dst, 0x52494646); /* "RIFF" */ + SDL_WriteLE32(dst, 36 + chunk2_size); /* ChunkSize */ + SDL_WriteBE32(dst, 0x57415645); /* "WAVE" */ + + /* chunk: WAVE */ + SDL_WriteBE32(dst, 0x666d7420); /* "fmt " */ + SDL_WriteLE32(dst, 16); /* Subchunk1Size (PCM = 16) */ + SDL_WriteLE16(dst, 1); /* AudioFormat (PCM = 1) */ + SDL_WriteLE16(dst, channels); /* NumChannels */ + SDL_WriteLE32(dst, rate); /* SampleRate */ + SDL_WriteLE32(dst, rate * channels * bits_per_sample / 8); /* ByteRate */ + SDL_WriteLE16(dst, channels * bits_per_sample / 8); /* BlockAlign */ + SDL_WriteLE16(dst, bits_per_sample); /* BitsPerSample */ + + /* chunk: data */ + SDL_WriteBE32(dst, 0x64617461); /* "data" */ + SDL_WriteLE32(dst, chunk2_size); /* Subchunk2Size */ + + for (size_t i = 0; i < count_samples; i++) { + switch (format) { + case AUDIO_F32:{ + float sample = ((float*)samples)[i]; + if (sample < -1.f || sample > 1.f) { + SDL_Log("sample=%f", sample); + } + /*SDL_assert(sample >= -1.f);*/ + /*SDL_assert(sample <= 1.f);*/ + SDL_WriteLE16(dst, sample * 32700); + break; + } + default: + SDL_Log("Unsopported format"); + abort(); + } + } +} + /** * \brief Check signal-to-noise ratio and maximum error of audio resampling. * @@ -849,6 +894,16 @@ static int audio_resampleLoss(void *arg) *(buf_in + i) = (float)sine_wave_sample(i, spec->rate_in, spec->freq, spec->phase); } + { + char name[256]; + SDL_RWops *rwops; + SDL_snprintf(name, sizeof(name), "/tmp/from_%d_to_%d_in.wav", spec->rate_in, spec->rate_out); + rwops = SDL_RWFromFile(name, "wb"); + + save_wave(rwops, buf_in, len_in / (int) sizeof(float), spec->rate_in, AUDIO_F32); + SDL_RWclose(rwops); + } + tick_beg = SDL_GetPerformanceCounter(); ret = SDL_PutAudioStreamData(stream, buf_in, len_in); @@ -898,6 +953,17 @@ static int audio_resampleLoss(void *arg) sum_squared_error += error * error; sum_squared_value += target * target; } + + { + char name[256]; + SDL_RWops *rwops; + SDL_snprintf(name, sizeof(name), "/tmp/from_%d_to_%d_out.wav", spec->rate_in, spec->rate_out); + rwops = SDL_RWFromFile(name, "wb"); + + save_wave(rwops, buf_out, len_out / (int) sizeof(float), spec->rate_out, AUDIO_F32); + SDL_RWclose(rwops); + } + SDL_free(buf_out); signal_to_noise = 10 * SDL_log10(sum_squared_value / sum_squared_error); /* decibel */ SDLTest_AssertCheck(isfinite(sum_squared_value), "Sum of squared target should be finite."); ```

When I import /tmp/from_44100_to_48000_in.wav and /tmp/from_44100_to_48000_out.wav in audacity, invert the second track, and then mix them together, the following appears: image

Earlier, I compared the generated audio before and after https://github.com/libsdl-org/SDL/commit/e5a6c24c822e11c9a99f10adad49f904b8170562 and found out there is an issue between chunks.

The patch also echoes samples that are out of range to the console.

icculus commented 1 year ago

Earlier, I compared the generated audio before and after e5a6c24 and found out there is an issue between chunks.

We were overflowing the source buffer by a sample or two, which I fixed a few days ago; it would produce a value way outside the range at the end of a chunk, since it was reading bogus memory, which is likely what this problem was. So this specific thing should be fixed in main...?

madebr commented 1 year ago

The screenshot of https://github.com/libsdl-org/SDL/issues/8036#issuecomment-1679248637 was made with current main (cbab33482a99fe91432c6ab6fc9a36acc58c01fe)

The issue can be exacerbated by reducing the chunk size.

icculus commented 1 year ago

Oh, I see what you're saying now: it's not that the samples are drifting generally, it's that they are drifting exactly at the chunk boundaries...which is why the screenshot shows variation in steps.

I thought that was a separate discussion, because we were talking about samples outside the range and that was happening before. My mistake!

icculus commented 1 year ago

...and yeah, if I take out the chunking, it totally works correctly.,,

icculus commented 1 year ago

Raw data sent to ResampleAudio, not counting the padding buffers. Top is the non-chunked version, bottom is the chunked version with each call's data appended to a single waveform:

image

...chunking appears to be losing the end of the buffer by a few samples.

icculus commented 1 year ago

Pretty sure it's this piece here:

https://github.com/libsdl-org/SDL/blob/cbab33482a99fe91432c6ab6fc9a36acc58c01fe/src/audio/SDL_audiocvt.c#L876-L893

We're probably losing a sample frame every iteration here:

        const int resampled_input_frames = (int) ((((Uint64) input_frames) * src_rate) / dst_rate);

The right thing to do is move this block out to the top level function, so it's calculated once and doesn't accumulate losses, and then subtract from this number on each iteration as we eat data.

For now, I'm just going to turn off chunking, so we're in a working state but will allocate unbounded memory for the work buffer, so it's a problem if you try to resample a massive buffer in one shot.

When I walk through all the logic, I'll fix this properly and close the bug.

0x1F9F1 commented 1 year ago

I'm not sure SDL_ResampleAudio ever really worked correctly, even in SDL2. To correctly handle chunking, the code needs to keep track of the offset within the current sample.

I've written a fix https://github.com/0x1F9F1/SDL/commit/a3cc4eb7fccb867530e1e9c2a0e06c059fb0765e (though I'm not 100% sure it correctly handles all edgecases). Edit: Also fixed/improved ResampleAudio https://github.com/0x1F9F1/SDL/commit/7058645455e9f3216cb755750c88f1ca78535ca4

slouken commented 1 year ago

This is fixed, thanks!