hajimehoshi / go-mp3

An MP3 decoder in pure Go
Apache License 2.0

Remove the first 528 samples (only if the encoder is LAME?) #25

Open hajimehoshi opened 6 years ago

hajimehoshi commented 6 years ago

https://web.archive.org/web/20180111173850/http://lame.sourceforge.net/tech-FAQ.txt

It looks like the first 528 samples are always zero in the output of (almost?) all MP3 decoders. Should go-mp3 then skip the first 528 samples?

hajimehoshi commented 6 years ago

I'm leaning toward not fixing this and keeping it as it is, so that go-mp3 stays 'dumb' as decoder infrastructure.

hajimehoshi commented 6 years ago

On second thought, since ffmpeg handles this problem well (ffmpeg sets 'start skip samples' if the encoder is LAME), go-mp3 should handle it too... 🤔

hajimehoshi commented 6 years ago

http://mp3decoders.mp3-tech.org/decoders_lame.html Ugh

lieff commented 6 years ago

I think it's better to keep the decoder lib itself as simple as frame in -> frame out. An example/application can address this problem.

hajimehoshi commented 6 years ago

I'm still not sure. In the current go-mp3 there is no way to detect the encoder, so it is almost impossible to handle this problem on the example/application side.

lieff commented 6 years ago

The example can accept a command-line parameter, the same as lame's --decode-mp3delay x, so the user is aware of this problem.
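A rough sketch of what such a parameter could look like in a Go example program, using the standard `flag` package. The `-delay` flag name and the `parseDelay` helper are made up for illustration; nothing like this exists in go-mp3.

```go
package main

import (
	"flag"
	"fmt"
)

// parseDelay reads a hypothetical -delay flag: the number of leading
// samples the example should drop from the decoded output.
func parseDelay(args []string) (int, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	delay := fs.Int("delay", 0, "number of leading samples to drop")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	if *delay < 0 {
		return 0, fmt.Errorf("delay must be non-negative, got %d", *delay)
	}
	return *delay, nil
}

func main() {
	d, err := parseDelay([]string{"-delay", "528"})
	fmt.Println(d, err)
}
```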

hajimehoshi commented 6 years ago

So you mean specifying delay for each file would work? Well that would work, but I'd want some library to detect the delay amount.

To be exact, there is also a tail part that should be ignored :-/
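If both numbers were known, trimming a fully decoded buffer at both ends is simple. A minimal sketch, assuming 16-bit stereo PCM held in memory; `trimPCM` is a hypothetical helper, and the delay and padding values in `main` are arbitrary.

```go
package main

import "fmt"

// bytesPerFrame assumes 16-bit stereo PCM (2 bytes x 2 channels).
const bytesPerFrame = 4

// trimPCM (hypothetical helper) drops `delay` sample frames from the start
// and `padding` sample frames from the end of a decoded PCM buffer.
func trimPCM(pcm []byte, delay, padding int) []byte {
	if delay < 0 {
		delay = 0
	}
	if padding < 0 {
		padding = 0
	}
	start := delay * bytesPerFrame
	end := len(pcm) - padding*bytesPerFrame
	if start >= end {
		return nil // nothing left after trimming
	}
	return pcm[start:end]
}

func main() {
	// Stand-in for three decoded frames of 1152 samples each.
	pcm := make([]byte, 3*1152*bytesPerFrame)
	out := trimPCM(pcm, 528, 100) // arbitrary example values
	fmt.Println(len(out) / bytesPerFrame)
}
```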

lieff commented 6 years ago

I think this solution is at least encoder-dependent. If we want to remove the leading zeros, that's possible. If we want to align the output with the original input, we must remove an encoder-dependent number of samples from the beginning (in the range 576 + 481...510 + whatever additional delay the encoder adds). I'm not sure how far the absolute minimum delay of the polyphase filterbank can be optimized: 481 or even lower.

lieff commented 6 years ago

https://github.com/aspt/diff_tools can be used to measure real delay.

hajimehoshi commented 6 years ago

Doesn't this require the original sound file?

I use ffmpeg -i for this purpose btw.

lieff commented 6 years ago

Yes, this tool compares two audio files and can compute the time shift using the -align[int] option, where [int] is the max shift in samples; k/m suffixes can be used (-align128k for example).

lieff commented 6 years ago

I've checked lame 3.99.5: both VBR and CBR show a 2257-sample delay using other decoders, zero delay using the lame --decode decoder, 1728 samples if I corrupt just the first frame's metadata, and 576 samples if I fully corrupt the first frame, which contains the metadata: 1a16-3fc2-a586-50ec

So lame just stores information in the first frame's ancillary data and uses it to compensate for the time shift. This data changes across lame versions (note the 1105 delay for CBR lame 3.87 in the table). It's not standard behaviour; a decoder can't be absolutely universal here.

hajimehoshi commented 6 years ago

OK, so this is the same result as http://mp3decoders.mp3-tech.org/decoders_lame.html, right?

lieff commented 6 years ago

Almost. The Lame 3.87 CBR overall delay is 1105 samples in the table; it seems to have changed in lame 3.99.5. Lame CBR injects an "Info" block in the first frame, which affects the resulting lame --decode delay: 0 if the Info block exists, 1728 if the Info block is corrupted (I replaced the "Info" bytes with spaces). Lame VBR injects a "Xing" block in the first frame with the same behaviour.
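To make the Info/Xing idea concrete, here is a sketch of reading the delay and padding fields out of such a block. The byte layout is an assumption from my memory of the LAME tag format and should be checked against VbrTag.c: a 4-byte "Xing"/"Info" marker, 4 bytes of flags, optional fields selected by those flags, then a LAME extension where 21 bytes precede a 3-byte field packing a 12-bit encoder delay and 12-bit end padding. `lameGapless` is a hypothetical name. Searching the whole frame for the marker is a shortcut; a real parser would compute its position from the frame header.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// lameGapless (hypothetical) extracts encoder delay and end padding, in
// samples, from a Xing/Info block inside the first frame's bytes.
// The field layout below is an assumption; verify it against VbrTag.c.
func lameGapless(frame []byte) (delay, padding int, err error) {
	i := bytes.Index(frame, []byte("Xing"))
	if i < 0 {
		i = bytes.Index(frame, []byte("Info"))
	}
	if i < 0 || len(frame) < i+8 {
		return 0, 0, errors.New("no Xing/Info block found")
	}
	flags := binary.BigEndian.Uint32(frame[i+4:])
	off := i + 8
	if flags&0x1 != 0 {
		off += 4 // frame count
	}
	if flags&0x2 != 0 {
		off += 4 // byte count
	}
	if flags&0x4 != 0 {
		off += 100 // seek table
	}
	if flags&0x8 != 0 {
		off += 4 // quality indicator
	}
	// Assumed: 9-byte encoder string plus 12 bytes of other fields.
	off += 21
	if len(frame) < off+3 {
		return 0, 0, errors.New("block too short for delay/padding field")
	}
	b := frame[off : off+3]
	delay = int(b[0])<<4 | int(b[1])>>4
	padding = int(b[1]&0x0F)<<8 | int(b[2])
	return delay, padding, nil
}

func main() {
	// Synthetic block: all optional fields present, delay 576, padding 1000.
	buf := append([]byte("Info"), 0, 0, 0, 0x0F)
	buf = append(buf, make([]byte, 112)...) // frames, bytes, TOC, quality
	buf = append(buf, []byte("LAME3.99r")...)
	buf = append(buf, make([]byte, 12)...)
	buf = append(buf, 0x24, 0x03, 0xE8)
	fmt.Println(lameGapless(buf))
}
```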

lieff commented 6 years ago

Here the latest lame decoder always compensates 529 samples: https://github.com/lingfennan/mp3lame/blob/master/lame/frontend/get_audio.c#L553 The additional encoder delay is encoded in the Info/Xing header: https://github.com/lingfennan/mp3lame/blob/master/lame/libmp3lame/VbrTag.c#L831
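One way to read the numbers measured in this thread is as sums of three terms. This is an interpretation, not something confirmed by the lame source: it assumes an encoder delay of 576 samples stored in the header, the 529-sample decoder constant mentioned above, and one 1152-sample frame for the Info/Xing frame when a decoder plays it as audio.

```go
package main

import "fmt"

const (
	frameSamples = 1152 // samples per MPEG-1 Layer III frame
	encoderDelay = 576  // assumption: delay recorded in the Info/Xing header
	decoderDelay = 529  // constant compensated by the lame decoder, per the thread
)

func main() {
	// Other decoders: Info frame played as audio, nothing compensated.
	fmt.Println(frameSamples + encoderDelay + decoderDelay) // 2257
	// lame --decode with the "Info" bytes overwritten: only 529 compensated.
	fmt.Println(frameSamples + encoderDelay) // 1728
	// lame --decode with the first frame destroyed: frame dropped, 529 compensated.
	fmt.Println(encoderDelay) // 576
	// Table value for LAME 3.87 CBR, possibly because no Info frame was written.
	fmt.Println(encoderDelay + decoderDelay) // 1105
}
```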