phoboslab / pl_mpeg

Single file C library for decoding MPEG1 Video and MP2 Audio

Any chance of a seek API? #7

Closed raysan5 closed 4 years ago

raysan5 commented 4 years ago

I'm using this library with raylib to play MPEG files and it works flawlessly! In under 150 lines of C I got a nice video player!

Just wondering if there is any plan to add a frame/sample seek API. It would be really great.

(I can keep decoding frames/samples to implement a seek mechanism but it seems a bit inefficient)

EDIT: I just created a function to get the video length, but it's extremely inefficient: it decodes every frame and takes up to 10 seconds on my system. I could probably run it in a second thread while the video starts playing in the main thread, but is there a better way to get the video length?

int plm_get_total_frames(const char *filename)
{
    int total_frames = 0;

    plm_t *plm = plm_create_with_filename(filename);

    if (plm != NULL)
    {
        plm_frame_t *frame = plm_decode_video(plm);
        while (frame != NULL)
        {
            total_frames++;
            frame = plm_decode_video(plm);
        }

        plm_destroy(plm);
    }

    return total_frames;
}

raysan5 commented 4 years ago

I took the day to play with this and I got a basic video seek API working. I just implemented a plm_skip_video_frame() that is a copy of plm_decode_video() with the plm_video_decode_slice() call commented out. Now I'm taking a look at audio...

raysan5 commented 4 years ago

Just in case it can be useful for someone, here is my seek code:

if (IsMouseButtonPressed(MOUSE_LEFT_BUTTON) && CheckCollisionPointRec(GetMousePosition(), timeBar))
{
    // Mouse X position relative to the left edge of the time bar
    int timeBarPositionX = GetMouseX() - (int)timeBar.x;

    // Get equivalent audio frame
    currentAudioFrame = (timeBarPositionX*totalAudioFrames)/timeBar.width;

    // Get equivalent video frame (use audio frame to sync)
    currentVideoFrame = (int)((float)currentAudioFrame*(float)totalVideoFrames/(float)totalAudioFrames);

    // Reset video/audio and move to the required frame
    if (plm != NULL)
    {
        plm_rewind(plm);
        frame = NULL;
        samples = NULL;
        baseTime = 0.0;
        timeExcess = 0.0;

        // Move to required video/audio frame (kept inside the NULL check so plm is never used when NULL)
        for (int i = 0; i < currentAudioFrame; i++) samples = plm_decode_audio(plm); // TODO: plm_skip_audio_frame(plm)
        for (int i = 0; i < currentVideoFrame; i++) frame = plm_skip_video_frame(plm);
    }
}

About sync: I'm requesting video/audio frames manually and measuring time to display one new frame every 40 ms (1000 ms / 25 fps). I use a timeExcess counter to discard frames if required (because the time measurement is not perfect). This is the update code:

if ((plm != NULL) && !pause)
{
    // Video should run at 'framerate' fps => One new frame every 1/framerate
    double time = (GetTime() - baseTime);
    double frameTime = (1.0/framerate);

    if (time >= frameTime)
    {
        timeExcess += (time - frameTime);
        baseTime = GetTime();

        // Decode video frame
        frame = plm_decode_video(plm);  // Get frame as 3 planes: Y, Cr, Cb
        currentVideoFrame++;

        if (timeExcess >= frameTime)
        {
            // Discard previous frame and load a new one
            frame = plm_decode_video(plm);
            currentVideoFrame++;
            timeExcess = 0;
        }

        if (frame != NULL)      // We got a video frame!
        {
            plm_frame_to_rgb(frame, imFrame.data);  // Convert (Y, Cr, Cb) to RGB on the CPU (slow)
            UpdateTexture(texture, imFrame.data);   // Update texture with new data for drawing
        }
    }

    // Refill audio stream if required
    while (IsAudioStreamProcessed(stream))
    {
        // Decode audio sample
        samples = plm_decode_audio(plm);
        currentAudioFrame++;

        if (samples != NULL)     // We got an audio frame (1152 samples)!
        {
            // Copy sample to audio stream
            UpdateAudioStream(stream, samples->interleaved, PLM_AUDIO_SAMPLES_PER_FRAME*2);
        }
    }
}

It works really well and it's fast. There is just an acceptable seek delay (~0.5s) caused by the plm_decode_audio() calls, which should be replaced by plm_skip_audio_frame().
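For comparison, here is a minimal sketch of an accumulator-based variant of the frame pacing above. It is a different technique from the timeExcess counter in the update code, and frames_due() is a hypothetical helper, not part of raylib or pl_mpeg. It returns how many frames should be advanced for a given elapsed time and carries the remainder over to the next call:

```c
// Hypothetical helper (not part of raylib or pl_mpeg): add the elapsed time
// to an accumulator and return how many whole frames are now due.
// The leftover time stays in *accumulator for the next call.
static int frames_due(double *accumulator, double elapsed, double frameTime)
{
    int count = 0;

    *accumulator += elapsed;

    while (*accumulator >= frameTime)
    {
        *accumulator -= frameTime;
        count++;
    }

    return count;
}
```

When the result is greater than 1, every frame except the last one could be skipped instead of converted to RGB and uploaded, which is the same idea as discarding a frame when timeExcess grows too large.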

RandyGaul commented 4 years ago

Hmm. I'm inspired to try this out myself! Thanks Ray, looks very useful :)

raysan5 commented 4 years ago

Hi @RandyGaul! Glad you find it useful!

Just in case you or someone else wants to try it, here are the sources: raylib_video_player.zip

I'm not submitting the plm_skip_video_frame() implementation to pl_mpeg because it looks a bit hacky to me and probably needs review and improvement. Also, plm_skip_audio_frame() is not implemented. Audio decoding is fast enough to get by without it, but it adds a 0.5-1 sec. delay to seeking and about 2 sec. to video loading.

r-lyeh commented 4 years ago

Ping @phoboslab as he is not watching the repo :D

RandyGaul commented 4 years ago

@phoboslab why aren’t you following your own repo? XD

phoboslab commented 4 years ago

Well, the simple honest truth is that this is just a hobby project. If you want to get my attention, pay me. Otherwise I'll work on it whenever I feel like it.

On topic: thanks for the seeking code! Looks like a good starting point to put this into pl_mpeg proper.

I believe calculating the total video length could be done by just looking at the PS packet headers, as most of them include a presentation timestamp. Then there'd be no need for demuxing or even decoding to find the length.
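A minimal sketch of the timestamp part of this idea, assuming the usual MPEG-1 system stream encoding of a 33-bit PTS in 5 bytes and a 90 kHz clock. The function names are hypothetical and this is not pl_mpeg code. Locating the packet headers in the file is left out:

```c
#include <stdint.h>

// Decode a 33-bit presentation timestamp from the 5 bytes that carry it.
// Assumed bit layout, most significant bit first:
//   4 prefix bits, PTS[32..30], marker bit,
//   PTS[29..15], marker bit,
//   PTS[14..0], marker bit
static uint64_t pts_decode(const uint8_t b[5])
{
    return ((uint64_t)((b[0] >> 1) & 0x07) << 30) |
           ((uint64_t)b[1] << 22) |
           ((uint64_t)(b[2] >> 1) << 15) |
           ((uint64_t)b[3] << 7) |
           ((uint64_t)(b[4] >> 1));
}

// PTS values count ticks of a 90 kHz clock
static double pts_to_seconds(uint64_t pts)
{
    return (double)pts/90000.0;
}

// Length estimate from the first and last timestamps found in the stream
static double duration_from_pts(uint64_t first_pts, uint64_t last_pts)
{
    if (last_pts < first_pts) return 0.0;
    return pts_to_seconds(last_pts - first_pts);
}
```

With this, the length could be estimated from the first timestamp near the start of the file and the last one near the end, without decoding any frames.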

Ideally seeking would be done by a binary search - estimating the byte offset, finding packet headers and jumping forward/backward until we are at the right spot. Or, probably much cleaner: build an index of packet offsets and timestamps at load time.
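The index variant could look roughly like the sketch below. The packet_index_entry_t type and index_find() are hypothetical names for illustration, and the index is assumed to be sorted by ascending time. A binary search returns the last packet at or before the target time:

```c
// Hypothetical index entry: where a packet starts and when it is presented
typedef struct
{
    long offset;    // Byte offset of the packet in the file
    double time;    // Presentation time in seconds
} packet_index_entry_t;

// Return the position of the last entry with time <= target,
// or -1 if target lies before the first entry.
// Entries must be sorted by ascending time.
static int index_find(const packet_index_entry_t *index, int count, double target)
{
    int lo = 0;
    int hi = count - 1;
    int found = -1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo)/2;

        if (index[mid].time <= target)
        {
            found = mid;        // Candidate, look for a later one
            lo = mid + 1;
        }
        else hi = mid - 1;
    }

    return found;
}
```

The caller would then jump to index[found].offset and continue reading from there. The trade-off is the cost of scanning the file once at load time to build the index.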

RandyGaul commented 4 years ago

Pretty sure everyone assumes these kinds of things are merely hobbies :)

phoboslab commented 4 years ago

int plm_seek(plm_t *self, double time, int seek_exact);

A seek API is now implemented. Seeking should be instant provided your data source can keep up. The seeking is done on the packet layer and no unnecessary video or audio data is decoded in the process. Note that seeking to inter frames will still decode all frames starting at the previous intra frame.

This does not build an index of timestamps up front, but instead guesses the byte offset of the seek time. Writing the function that seeks to the correct PS packet took me longer than I'd like to admit. It's a bit hairy, but heavily documented for anyone interested: https://github.com/phoboslab/pl_mpeg/blob/master/pl_mpeg.h#L1876

From the documentation:

Seek to the specified time, clamped between 0 -- duration. This can only be 
used when the underlying plm_buffer is seekable, i.e. for files, fixed 
memory buffers or _for_appending buffers. 

If seek_exact is TRUE this will seek to the exact time, otherwise it will 
seek to the last intra frame just before the desired time. Exact seeking can 
be slow, because all frames up to the seeked one have to be decoded on top of
the previous intra frame.

If seeking succeeds, this function will call the video_decode_callback 
exactly once with the target frame. If audio is enabled, it will also call
the audio_decode_callback any number of times, until the audio_lead_time is
satisfied.

Returns TRUE if seeking succeeded or FALSE if no frame could be found.

Edit: there's also a new function to get the file's duration:

double plm_get_duration(plm_t *self);
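As a rough illustration of how these two functions could replace the frame-counting seek code earlier in the thread, here is a sketch of the time bar mapping. seek_time_from_click() is a hypothetical helper, not part of pl_mpeg or raylib. It converts a click position into a time in seconds, clamped to the range that plm_seek() accepts:

```c
// Hypothetical helper: convert a click at mouseX on a time bar that starts at
// barX and is barWidth pixels wide into a seek time in seconds,
// clamped to [0, duration].
static double seek_time_from_click(float mouseX, float barX, float barWidth, double duration)
{
    if ((barWidth <= 0.0f) || (duration <= 0.0)) return 0.0;

    double t = ((double)(mouseX - barX)/(double)barWidth)*duration;

    if (t < 0.0) t = 0.0;
    if (t > duration) t = duration;

    return t;
}
```

With duration taken from plm_get_duration(plm), the result could be passed on as plm_seek(plm, t, FALSE) to jump to the intra frame just before the clicked position.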