lieff / minimp3

Minimalistic MP3 decoder single header library
Creative Commons Zero v1.0 Universal

Best way to determine length in seconds on vbr mp3 #54

Closed ghost closed 5 years ago

ghost commented 5 years ago

Hi,

First, thank you very much for an excellent mp3 decoder! It is working very well for me.

In my program, I show the length (in time) of the MP3 that is playing. This is easy to calculate when the MP3 has a constant bitrate, but not so easy when the bitrate is variable. My only idea is to average the bitrate of the first 10 or so frames and use that. It would probably be close, but maybe there is a better technique (other than reading the entire file first)?
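
For reference, the constant-bitrate case mentioned above comes down to one division. This is only a minimal sketch: the function name `cbr_duration_seconds` is made up for illustration, and it assumes the byte count excludes any tags (such as ID3) and that the bitrate is the kbit/s value from the frame header, with 1 kbit = 1000 bits.

```c
#include <stdint.h>

/* Estimate the duration of a constant-bitrate stream.
 * audio_bytes:  size of the audio data in bytes (assumed to exclude tags)
 * bitrate_kbps: bitrate from the frame header, in kbit/s (1 kbit = 1000 bits)
 * Returns the duration in seconds, or 0.0 if the bitrate is not positive. */
double cbr_duration_seconds(uint64_t audio_bytes, int bitrate_kbps)
{
    if (bitrate_kbps <= 0)
        return 0.0;
    return (double)audio_bytes * 8.0 / ((double)bitrate_kbps * 1000.0);
}
```

With a variable bitrate there is no single `bitrate_kbps` to plug in, which is the problem being asked about.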

Thanks for any suggestions.

lieff commented 5 years ago

Sorry for the delay, I'm missing notifications somehow, meh. The fastest way is to iterate through all frames and calculate the sample count from the headers, without decoding:

static int frames_iterate_cb(void *user_data, const uint8_t *frame, int frame_size, size_t offset, mp3dec_frame_info_t *info)
{
    (void)offset;
    frames_iterate_data *d = user_data;
    d->channels = info->channels;
    d->hz       = info->hz;
    d->layer    = info->layer;
    d->samples  += hdr_frame_samples(frame);
    /* or  d->samples += mp3dec_decode_frame(d->mp3d, frame, frame_size, 0, info); */
    /* TODO: check for a layer or hz change between frames (treat it as an error); a channels change is possible but rare, handle it optionally */
    return 0;
}
frames_iterate_data d = { 0 };
mp3dec_iterate(input_file_name, frames_iterate_cb, &d);
/* calc length from samples: the counts above are per channel, so seconds = d.samples / d.hz */
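
The "calc length from samples" step can be sketched without the library as follows. This is an illustration only: the three function names are hypothetical stand-ins, not minimp3 API. It assumes the standard 4-byte MPEG audio frame header layout and the format's fixed per-frame sample counts, which are counted per channel.

```c
#include <stdint.h>

/* Samples per channel in one frame, read from a 4-byte MPEG audio header.
 * Layer I = 384, Layer II = 1152, Layer III = 1152 (MPEG-1) or 576 (MPEG-2/2.5).
 * Returns 0 for the reserved layer value. */
int frame_samples_from_header(const uint8_t *h)
{
    int layer_bits = (h[1] >> 1) & 3;        /* 3 = Layer I, 2 = Layer II, 1 = Layer III */
    int is_mpeg1   = ((h[1] >> 3) & 3) == 3; /* version bits: 3 = MPEG-1 */
    if (layer_bits == 3) return 384;
    if (layer_bits == 2) return 1152;
    if (layer_bits == 1) return is_mpeg1 ? 1152 : 576;
    return 0;
}

/* Sample rate in Hz from the header, or 0 for reserved values. */
int sample_rate_from_header(const uint8_t *h)
{
    static const int base[3] = { 44100, 48000, 32000 };
    int idx     = (h[2] >> 2) & 3;
    int version = (h[1] >> 3) & 3; /* 3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5, 1 = reserved */
    if (idx == 3 || version == 1) return 0;
    if (version == 3) return base[idx];
    if (version == 2) return base[idx] / 2;
    return base[idx] / 4;
}

/* Duration in seconds from a per-channel sample total and the sample rate. */
double duration_seconds(uint64_t total_samples, int hz)
{
    return hz > 0 ? (double)total_samples / (double)hz : 0.0;
}
```

Summing `frame_samples_from_header()` over every frame and passing the total to `duration_seconds()` gives the length. The channel count does not enter the calculation, because the per-frame counts are already per channel.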

This can still be slightly wrong, because some frames may not be decodable (usually the first frames of MP3 files that were cut out of a longer stream, since their data can depend on the bit reservoir of frames that are no longer there). To handle such cut files, it is enough to decode only the first frames. To be 100% bulletproof, all frames must be decoded. An optimization is possible there: decode only the frame data, without the actual sample synthesis:

    if (info->layer == 3)
    {
        int main_data_begin = L3_read_side_info(bs_frame, scratch.gr_info, hdr);
        if (main_data_begin < 0 || bs_frame->pos > bs_frame->limit)
        {
            mp3dec_init(dec);
            return 0;
        }
        success = L3_restore_reservoir(dec, bs_frame, &scratch, main_data_begin);
        if (success)
        {
            /* Can be omitted:
            for (igr = 0; igr < (HDR_TEST_MPEG1(hdr) ? 2 : 1); igr++, pcm += 576*info->channels)
            {
                memset(scratch.grbuf[0], 0, 576*2*sizeof(float));
                L3_decode(dec, &scratch, scratch.gr_info + igr*info->channels, info->channels);
                mp3d_synth_granule(dec->qmf_state, scratch.grbuf[0], 18, info->channels, pcm, scratch.syn[0]);
            }*/
        }
        L3_save_reservoir(dec, &scratch);
    } else

For layers 1 and 2:

        for (i = 0, igr = 0; igr < 3; igr++)
        {
            if (12 == (i += L12_dequantize_granule(scratch.grbuf[0] + i, bs_frame, sci, info->layer | 1)))
            {
                i = 0;
                /* Can be omitted:
                L12_apply_scf_384(sci, sci->scf + igr, scratch.grbuf[0]);
                mp3d_synth_granule(dec->qmf_state, scratch.grbuf[0], 12, info->channels, pcm, scratch.syn[0]);
                memset(scratch.grbuf[0], 0, 576*2*sizeof(float));
                pcm += 384*info->channels;
                */
            }
            if (bs_frame->pos > bs_frame->limit)
            {
                mp3dec_init(dec);
                return 0;
            }
        }

ghost commented 5 years ago

My idea of checking a few frames at the beginning did not work too well. There are definitely not enough data points to be remotely accurate. Maybe it would be OK for a sine wave, but not for anything more complex.

Thank you for the sample code, I will use this method.

lieff commented 5 years ago

Yeah, the VBR bitrate can depend heavily on the audio data (silence, for example, can cause a large bitrate drop). So we can't really tell in advance which part of the file would be enough to process.
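
To make the point concrete, here is a small sketch of the "average the first frames" estimate. The function name `estimate_frames_from_prefix` and the frame sizes in the usage note are made up for illustration. It estimates the number of frames in a stream by dividing the total byte count by the mean size of the first `n` frames.

```c
#include <stddef.h>

/* Estimate the number of frames in a stream by assuming every frame is as
 * large as the average of the first n frames.
 * frame_bytes: size of each frame in bytes (illustrative input)
 * count:       real number of frames
 * n:           how many leading frames to average (clamped to count) */
double estimate_frames_from_prefix(const int *frame_bytes, size_t count, size_t n)
{
    double total = 0.0, prefix = 0.0;
    size_t i;
    if (count == 0 || n == 0)
        return 0.0;
    if (n > count)
        n = count;
    for (i = 0; i < count; i++) {
        total += frame_bytes[i];
        if (i < n)
            prefix += frame_bytes[i];
    }
    if (prefix <= 0.0)
        return 0.0;
    return total / (prefix / (double)n);
}
```

For a stream that starts with two 400-byte frames followed by six 100-byte frames (a loud intro, then near-silence), averaging the first two frames gives 1400 / 400 = 3.5 frames instead of the real 8, so the estimated length is less than half the true one.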