Open hajimehoshi opened 6 years ago
In my opinion, I'm leaning toward not fixing this and keeping it as it is, so that we can keep go-mp3 'dumb' as decoder infrastructure.
On second thought, since ffmpeg handles this problem well (ffmpeg sets 'start skip samples' if the encoder is LAME), go-mp3 should handle this problem too... 🤔
I think it's better to keep the decoder lib itself as simple as frame in -> frame out. But an example/application can address this problem.
I'm still not sure. In the current go-mp3, there is no way to detect the encoder, so it is almost impossible to handle this problem on the example/application side.
The example can accept a command-line param, same as lame's --decode-mp3delay x. That way the user can be aware of this problem.
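A minimal sketch of what such an option could look like on the example side. The `-delay` flag name and both helper functions are hypothetical, not part of go-mp3; the only assumption about the decoder is that its output is 16-bit stereo PCM, i.e. 4 bytes per sample frame.

```go
package main

import (
	"flag"
	"fmt"
)

// bytesPerSample assumes 16-bit stereo PCM (2 bytes x 2 channels).
// Adjust it for any other output layout.
const bytesPerSample = 4

// parseDelay reads a hypothetical -delay option (in samples) from args.
func parseDelay(args []string) (int, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	delay := fs.Int("delay", 0, "number of leading samples to drop")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	if *delay < 0 {
		return 0, fmt.Errorf("delay must not be negative: %d", *delay)
	}
	return *delay, nil
}

// dropLeading returns pcm without its first `samples` sample frames.
// If pcm is shorter than the requested delay, the result is empty.
func dropLeading(pcm []byte, samples int) []byte {
	n := samples * bytesPerSample
	if n > len(pcm) {
		n = len(pcm)
	}
	return pcm[n:]
}

func main() {
	delay, err := parseDelay([]string{"-delay", "2"})
	if err != nil {
		panic(err)
	}
	pcm := make([]byte, 5*bytesPerSample) // five silent sample frames
	fmt.Println(len(dropLeading(pcm, delay)) / bytesPerSample)
}
```

This leaves the decoder untouched and puts the burden of knowing the right delay on the user.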
So you mean specifying delay for each file would work? Well that would work, but I'd want some library to detect the delay amount.
To be exact, there is also a tail part that should be ignored :-/
I think this solution is at least encoder-dependent. If we want to remove leading zeros, that's possible. If we want to align the output with the original input, we must remove an encoder-dependent number of samples from the beginning... (in the range 576 + 481...510 + whatever additional delay the encoder adds). I'm not sure how far the absolute minimum delay of the polyphase filterbank can be optimized: 481 or even lower.
https://github.com/aspt/diff_tools can be used to measure real delay.
Doesn't this require the original sound file?
I use ffmpeg -i for this purpose btw.
Yes, this tool compares two audio files and can compute the time shift using the -align[int] option, where [int] is the max shift in samples; k/m suffixes can be used (-align128k for example).
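To illustrate the idea of measuring a shift against the original, here is a rough brute-force sketch. It is not how diff_tools is implemented (I have not checked its algorithm); it simply tries every shift up to a maximum and keeps the one with the smallest mean squared difference. `estimateDelay` and `noise` are made-up names, and the test signal is synthetic.

```go
package main

import (
	"fmt"
	"math"
)

// estimateDelay returns the shift (in samples, 0..maxShift) at which dec
// best matches ref, judged by mean squared difference over the overlap.
func estimateDelay(ref, dec []int16, maxShift int) int {
	best, bestErr := 0, math.Inf(1)
	for shift := 0; shift <= maxShift; shift++ {
		n := len(ref)
		if len(dec)-shift < n {
			n = len(dec) - shift
		}
		if n <= 0 {
			break // no overlap left to compare
		}
		var sum float64
		for i := 0; i < n; i++ {
			d := float64(ref[i]) - float64(dec[i+shift])
			sum += d * d
		}
		if mean := sum / float64(n); mean < bestErr {
			best, bestErr = shift, mean
		}
	}
	return best
}

// noise produces a deterministic pseudo-random test signal.
func noise(n int) []int16 {
	out := make([]int16, n)
	x := uint32(1)
	for i := range out {
		x = x*1664525 + 1013904223
		out[i] = int16(x >> 16)
	}
	return out
}

func main() {
	ref := noise(256)
	// Simulate a decoder that prepends 7 samples of silence.
	dec := append(make([]int16, 7), ref...)
	fmt.Println(estimateDelay(ref, dec, 32))
}
```

With real MP3 output the match is never exact because the codec is lossy, so the minimum error is small rather than zero, and the search cost grows with the maximum shift.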
I've checked lame 3.99.5: both VBR and CBR show a 2257-sample delay with other decoders, zero delay with the lame --decode decoder, 1728 samples if I corrupt just the first frame's metadata, and 576 samples if I fully corrupt the first frame, which contains the metadata.
So lame just stores information in the first frame's ancillary data and uses it to compensate for the time shift. This data changes across lame versions (note the 1105 delay for CBR lame 3.87 in the table). It's not standard behaviour, so a decoder can't be absolutely universal here.
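One way to read these numbers, sketched as arithmetic. This is an interpretation, not something confirmed by the measurements: it assumes the metadata frame decodes to one full frame of silence (1152 samples for MPEG-1 Layer III) when a decoder does not recognize it, that the encoder delay is 576 samples, and that the decoder-side delay is the 529 samples that lame is said to compensate later in this thread. `residualDelay` is a made-up helper.

```go
package main

import "fmt"

const (
	samplesPerFrame = 1152 // MPEG-1 Layer III frame length in samples
	encoderDelay    = 576  // assumed: what remains when the first frame is fully corrupted
	decoderDelay    = 529  // assumed: the amount lame --decode is said to always compensate
)

// residualDelay adds up the parts of the delay that are NOT removed.
func residualDelay(infoFrameAsAudio, skipEncoderDelay, skipDecoderDelay bool) int {
	d := 0
	if infoFrameAsAudio {
		d += samplesPerFrame // metadata frame decoded as a silent audio frame
	}
	if !skipEncoderDelay {
		d += encoderDelay
	}
	if !skipDecoderDelay {
		d += decoderDelay
	}
	return d
}

func main() {
	fmt.Println(residualDelay(true, false, false))  // other decoders: 2257
	fmt.Println(residualDelay(false, true, true))   // lame --decode, intact tag: 0
	fmt.Println(residualDelay(true, false, true))   // lame --decode, tag corrupted: 1728
	fmt.Println(residualDelay(false, false, true))  // lame --decode, first frame dropped: 576
	fmt.Println(residualDelay(false, false, false)) // no metadata frame at all: 1105
}
```

Under these assumptions the gap between 1105 and 2257 is exactly one frame, which would fit a metadata frame being present in one case and absent in the other.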
OK, so this is the same result as http://mp3decoders.mp3-tech.org/decoders_lame.html, right?
Almost. The Lame 3.87 CBR overall delay is 1105 samples in the table; it seems to have changed in lame 3.99.5.
Lame CBR injects an "Info" block in the first frame, which affects the resulting lame --decode delay: 0 if the Info block exists, 1728 if the Info block is corrupted (I've replaced the "Info" bytes with spaces).
Lame VBR injects a "Xing" block in the first frame, with the same behaviour.
Here the latest lame decoder always compensates for 529 samples: https://github.com/lingfennan/mp3lame/blob/master/lame/frontend/get_audio.c#L553 And the additional encoder delay is encoded in the Info/Xing header: https://github.com/lingfennan/mp3lame/blob/master/lame/libmp3lame/VbrTag.c#L831
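For reference, a hedged sketch of reading that field from the first frame. It assumes the commonly documented tag layout: a 4-byte "Xing"/"Info" id, 4 bytes of flags, optional fields selected by the flags, then the LAME extension, where the 12-bit encoder delay and 12-bit end padding sit 21 bytes in. `lameDelays` and `sampleFrame` are made-up names; the sketch does not verify CRCs or that the extension was actually written by LAME.

```go
package main

import (
	"bytes"
	"fmt"
)

// lameDelays looks for a Xing/Info tag in frame and returns the encoder
// delay and end padding (both in samples) stored in the extension.
func lameDelays(frame []byte) (delay, padding int, ok bool) {
	pos := bytes.Index(frame, []byte("Xing"))
	if pos < 0 {
		pos = bytes.Index(frame, []byte("Info"))
	}
	if pos < 0 || pos+8 > len(frame) {
		return 0, 0, false
	}
	flags := frame[pos+7] // low byte of the big-endian 32-bit flags field
	off := pos + 8
	if flags&0x1 != 0 {
		off += 4 // frame count
	}
	if flags&0x2 != 0 {
		off += 4 // byte count
	}
	if flags&0x4 != 0 {
		off += 100 // seek table
	}
	if flags&0x8 != 0 {
		off += 4 // quality indicator
	}
	off += 21 // version string, revision, lowpass, replay gain, flags, bitrate
	if off+3 > len(frame) {
		return 0, 0, false
	}
	b := frame[off : off+3]
	delay = int(b[0])<<4 | int(b[1])>>4
	padding = int(b[1]&0x0F)<<8 | int(b[2])
	return delay, padding, true
}

// sampleFrame builds a synthetic frame body for demonstration only.
func sampleFrame(delay, padding int) []byte {
	f := make([]byte, 36) // stand-in for frame header and side info
	f = append(f, "Info"...)
	f = append(f, 0, 0, 0, 0x0F)        // all optional fields present
	f = append(f, make([]byte, 112)...) // frames, bytes, seek table, quality
	f = append(f, "LAME3.99r"...)
	f = append(f, make([]byte, 12)...)
	f = append(f, byte(delay>>4), byte(delay<<4)|byte(padding>>8), byte(padding))
	return f
}

func main() {
	fmt.Println(lameDelays(sampleFrame(576, 1000)))
}
```

A caller would presumably add the fixed decoder-side compensation on top of the returned delay before trimming, and use the padding to trim the tail.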
https://web.archive.org/web/20180111173850/http://lame.sourceforge.net/tech-FAQ.txt
It looks like the first 528 samples are always zero in MP3 output decoded by (almost?) all decoders. Then, should go-mp3 skip the first 528 samples?
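If the answer is yes, the skip itself could stay outside the decoder as a thin wrapper over any `io.Reader`. A sketch under the assumption of 16-bit stereo output (4 bytes per sample frame); `skipSamples` and `skipFromBytes` are hypothetical helpers, not existing go-mp3 API.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// bytesPerSample assumes 16-bit stereo PCM output.
const bytesPerSample = 4

// skipSamples discards the first n sample frames from r and returns r
// positioned at the first sample that should be kept.
func skipSamples(r io.Reader, n int) (io.Reader, error) {
	_, err := io.CopyN(io.Discard, r, int64(n)*bytesPerSample)
	if err != nil && err != io.EOF {
		return nil, err // a short stream (io.EOF) is not treated as an error
	}
	return r, nil
}

// skipFromBytes is a convenience wrapper for in-memory PCM data.
func skipFromBytes(pcm []byte, n int) ([]byte, error) {
	r, err := skipSamples(bytes.NewReader(pcm), n)
	if err != nil {
		return nil, err
	}
	return io.ReadAll(r)
}

func main() {
	// 528 silent sample frames followed by one non-silent frame.
	pcm := append(make([]byte, 528*bytesPerSample), 1, 2, 3, 4)
	rest, err := skipFromBytes(pcm, 528)
	if err != nil {
		panic(err)
	}
	fmt.Println(rest)
}
```

Seeking and length reporting would also need to account for the skipped bytes, which this sketch ignores.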