lieff / minimp3

Minimalistic MP3 decoder single header library
Creative Commons Zero v1.0 Universal

Float decode out of range [-1, 1] #104

Open szanni opened 1 year ago

szanni commented 1 year ago

Using

#define MINIMP3_FLOAT_OUTPUT
mp3dec_ex_open(MP3D_SEEK_TO_SAMPLE);
mp3dec_ex_read();

results in float samples that are outside the typical range of [-1, 1].

Is this to be expected and to be trimmed/clipped by the user? Or is this a bug?

Snippet from printing the out of range samples from the attached file:

Sample -1.008740
Sample -1.017203
Sample -1.015058
Sample 1.001051
Sample 1.005223
Sample 1.008936
Sample 1.011510
Sample 1.008202
Sample -1.003591
Sample -1.009071
Sample -1.005774
Sample 1.002314
Sample -1.006699
Sample -1.000278
Sample -1.002885
Sample 1.001274
Sample 1.005011
Sample -1.001993
Sample -1.008192
....

Sample file: 01m_40s__01m_50s.mp3.zip

And thanks for writing this amazing piece of software!

sagamusix commented 1 year ago

[-1, 1] is just the nominal output range representing 0 dBFS. But the MP3 format is not restricted to representing data in the nominal range. Even if the original input signal was in that range, due to MP3's lossy compression the reconstructed output can sometimes exceed this range. There is no need to worry about this, as float audio APIs normally have no issue with values outside of the [-1,1] range. Just clamp the range if it's really needed (e.g. when converting to 16-bit data).
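To illustrate the clamping mentioned above, here is a minimal sketch in C. The helper names `float_to_s16_clamped` and `convert_buffer_s16` are hypothetical and not part of minimp3, and the 32767 scale factor is just one common convention for mapping [-1, 1] onto 16-bit samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of minimp3): clamp one float sample to the
 * nominal [-1, 1] range, then scale it to a signed 16-bit value.
 * The cast truncates toward zero. */
int16_t float_to_s16_clamped(float s)
{
    if (s > 1.0f)
        s = 1.0f;
    if (s < -1.0f)
        s = -1.0f;
    return (int16_t)(s * 32767.0f);
}

/* Convert n float samples from in[] to 16-bit samples in out[]. */
void convert_buffer_s16(const float *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = float_to_s16_clamped(in[i]);
}
```

With this sketch, a sample such as 1.008936 from the report maps to 32767 rather than overflowing the 16-bit range.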

szanni commented 1 year ago

> Even if the original input signal was in that range, due to MP3's lossy compression the reconstructed output can sometimes exceed this range.

I personally believe this would be useful to mention in the docs; not everybody is familiar with the encoding details of MP3.

> There is no need to worry about this, as float audio APIs normally have no issue with values outside of the [-1,1] range. Just clamp the range if it's really needed (e.g. when converting to 16-bit data).

No need to worry indeed, but backends like ALSA do need values outside the [-1, 1] range to be handled to avoid clipping, at least on my system.

Whatever the case, I think just a comment in the docs would be very helpful. This way users can decide how to handle this case. Clamp, use a compressor, etc.
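As one illustration of handling this case without hard clamping, the sketch below scans a fully decoded buffer for its peak and scales everything down only if the peak exceeds 1. `normalize_to_unit_peak` is a hypothetical name, not a minimp3 function, and the sketch assumes the whole track is in memory: applying it per read chunk would change the gain from chunk to chunk.

```c
#include <stddef.h>

/* Hypothetical helper (not part of minimp3): find the peak absolute value
 * in buf[0..n-1]; if it exceeds 1.0, divide every sample by the peak so the
 * result fits in [-1, 1]. Returns the peak that was measured. */
float normalize_to_unit_peak(float *buf, size_t n)
{
    float peak = 0.0f;

    for (size_t i = 0; i < n; i++) {
        float a = buf[i] < 0.0f ? -buf[i] : buf[i];
        if (a > peak)
            peak = a;
    }

    /* Already within the nominal range: leave the samples untouched. */
    if (peak <= 1.0f)
        return peak;

    for (size_t i = 0; i < n; i++)
        buf[i] /= peak;

    return peak;
}
```

Unlike clamping, this keeps the waveform shape intact at the cost of a slightly lower overall level; a compressor or limiter would be a further option that this sketch does not cover.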

Extend the README section:

MINIMP3_FLOAT_OUTPUT makes mp3dec_decode_frame() output to be float instead of short and additional function mp3dec_f32_to_s16 will be available for float->short conversion if needed. Be aware that the resulting float value may exceed the usual range of [-1, 1] due to the specifics of the MP3 format.

Happy to open a PR.