lordmulder / LameXP

Audio Encoder Front-End
http://lordmulder.github.io/LameXP

Higher-quality MP3 decoder when doing conversions? #46

Closed cmrdt closed 7 years ago

cmrdt commented 7 years ago

Instead of mpg123 with 16-bit integer output, could conversions use libmad (see http://mp3decoders.mp3-tech.org/24bit.html, www.underbit.com/resources/mpeg/audio/compliance/ and http://www.mp3-converter.com/decoders/) or ffmpeg's mp3float decoder? The output of ffmpeg.exe -loglevel "debug" says opus accepts 16-bit signed or 32-bit float samples, and not 44.1 kHz but only 48000, 24000, 16000, 12000 or 8000 Hz, so 44.1 kHz input has to be resampled: ffmpeg -c:a mp3float -i ___ -filter:a aresample=resampler=soxr:precision=33:osf=flt -c:a:0 libopus

I don't know how ffmpeg (mp3float) compares to libmad.

Are people more sensitive to bit depth (24 vs. 32 bits) than to sampling rate, or are 96 and 192 kHz overkill?

lordmulder commented 7 years ago

There is no such thing as a "higher quality mp3 decoder". MP3 is a standardized format, so every decoder will produce exactly the same output from the same input bitstream, provided, of course, that it is a valid MP3 bitstream (results are "undefined" for corrupted streams) and that the decoder isn't buggy.

The decoder's output sample format, e.g. integer vs. floating-point, is merely a decision about the decoder's "external" interface. mpg123 can even be forced to output floating-point format with the --float option. However, there is little advantage in having the MP3 decoder output floating-point format. Keep in mind that even "CD quality" is only 16-bit integer, and MP3 is worse quality than CD because it is a lossy compression. Compressed audio formats (like MP3) don't really have a bit depth in the sense of "bits per sample", but the equivalent for a 192 kbps MP3 at 44,100 Hz stereo would be about 2 bits per sample! Provided that the decoder uses proper dithering (instead of just truncation), I highly doubt you can hear a difference between 16-bit integer and 32-bit floating-point decoded from the same lossy MP3 file. Even more so because most (pretty much all) audio hardware can't deal with floating-point directly, which means that, in the end, your 32-bit floating-point samples would be rounded to integer (16- or 32-bit) anyway.
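To make the "about 2 bits per sample" figure concrete, here is a minimal sketch of the arithmetic. The function name `bits_per_sample` is made up for illustration; it simply divides the bitrate by the number of samples per second across all channels.

```python
def bits_per_sample(bitrate_bps: int, sample_rate_hz: int, channels: int) -> float:
    """Average number of stored bits per PCM sample (hypothetical helper)."""
    return bitrate_bps / (sample_rate_hz * channels)


# 192 kbps MP3 at 44,100 Hz stereo, as in the example above
mp3 = bits_per_sample(192_000, 44_100, 2)

# Uncompressed CD audio: 16 bits x 44,100 Hz x 2 channels = 1,411,200 bps
cd = bits_per_sample(16 * 44_100 * 2, 44_100, 2)

print(f"192 kbps MP3: {mp3:.2f} bits per sample")  # 2.18
print(f"CD audio:     {cd:.0f} bits per sample")   # 16
```

This is only an average: a lossy codec does not store individual samples, so the number is a rough comparison rather than a real bit depth.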

Don't get me wrong, I'm not totally against using floating-point format. It can have a real advantage in production when a lot of filters are applied in a row, because the audio data doesn't need to be rounded/dithered after each processing step. But many audio tools cannot deal with 32-bit floating-point data. So, if we allowed decoders to output 32-bit floating-point files, we would either have to require that all encoders and all filtering tools support 32-bit floating-point input, or we would have to negotiate "integer vs. floating-point" between each pair of successive tools in the processing chain, which is complicated.
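A toy sketch of why intermediate rounding matters in a filter chain: the same two gain steps are applied once with floating-point intermediates and once with rounding to 16-bit integer after every step. The helper names and the gain values are invented for illustration, and plain rounding is used here instead of dithering to keep the example short.

```python
from typing import Iterable


def apply_chain_float(sample: float, gains: Iterable[float]) -> float:
    """Apply every gain step, keeping full floating-point precision."""
    for g in gains:
        sample *= g
    return sample


def apply_chain_int16(sample: int, gains: Iterable[float]) -> int:
    """Apply every gain step, rounding and clipping to 16-bit after each one."""
    for g in gains:
        sample = max(-32768, min(32767, round(sample * g)))
    return sample


gains = [0.01, 100.0]  # attenuate heavily, then restore the level
original = 12345

float_result = apply_chain_float(float(original), gains)
int_result = apply_chain_int16(original, gains)

print("float chain error:", abs(float_result - original))
print("int16 chain result:", int_result)  # 12300, i.e. off by 45
```

The integer chain loses the fractional part after the first step and cannot get it back, while the floating-point chain only needs a single rounding at the very end.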


As an aside: Yes, sampling rates above 48 kHz are "overkill". With 48 kHz it's already possible to keep frequencies of up to 24 kHz, according to Nyquist's theorem. That's enough to keep all frequencies a human can hear, about 20 Hz to 20 kHz. That's why new compressed formats, such as Opus, are fixed at 48 kHz.
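The Nyquist limit mentioned above is simply half the sampling rate. A small sketch, assuming the 20 kHz upper hearing limit quoted above (the constant and function names are made up for illustration):

```python
HUMAN_HEARING_LIMIT_HZ = 20_000  # upper limit of human hearing, as quoted above


def nyquist_frequency(sample_rate_hz: float) -> float:
    """Highest frequency that a given sampling rate can represent."""
    return sample_rate_hz / 2


for rate in (44_100, 48_000, 96_000, 192_000):
    limit = nyquist_frequency(rate)
    covers = limit >= HUMAN_HEARING_LIMIT_HZ
    print(f"{rate} Hz -> up to {limit:.0f} Hz, covers hearing range: {covers}")
```

Every rate in the list already covers the audible range, so the higher rates only add headroom for frequencies that cannot be heard.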