bplaum / gmerlin-avdecoder

Gmerlin audio/video decoder library
GNU General Public License v2.0
0 stars 1 forks source link

regression: cannot seek backwards #9

Open umlaeute opened 7 months ago

umlaeute commented 7 months ago

in Gem i'm using gmerlin-avdecoder for random-access to film footage.

as such, i'm heavily depending on being able to arbitrarily seek in files (at least: if the file format reasonably allows for that), including backward seeks.

basically i'm doing something like this (the full test program can be found here):

void getFrame(int imgNum) {
  if(bgav_can_seek(m_file)) {
      int64_t seekpos = gavl_frame_table_frame_to_time(m_frametable, imgNum, NULL);
      bgav_seek_video(m_file, m_track, seekpos);
  }
  bgav_read_video(m_file, m_gframe, m_stream);
}

the test films i am using can be found here (anim-1.mov and homer.avi): these films use containers and codecs from the past millenium; modern footage shows the same problems however).

i can once run:

/* m_numFrames is the number of frames in the film, as reported by gavl_frame_table_num_frames() */
for (int i=0; i<m_numFrames; i++)
   getFrame(i);

however, calling getFrame(0) afterwards will yield

Info: Detected EOF 1

and the bgav_read_video() call fails with 0.

Similarly, trying to access the same frame multiple times:

for (int i=0; i<m_numFrames; i++)
   getFrame(0); /* always read the same frame */

... leads to an error (at least with the homer.avi):

[video] Warning: Cannot skip backwards: Stream time: 1 skip time: 0 difference: -1

linking my test-program against the old gmerlin-avdecoder, i don't see any of these issues.

please bring back frame-accurate seeking (I'm aware that for a simple media-player, you are more interested in playing back the video at the correct speed, probably skipping frames if decoding is too slow; but my use-case is different, and I do want to be able to play a scrub a 1 hour video and play it backwards and loop and what not)

bplaum commented 7 months ago

Ok, sample accurate seeking indeed vanished because I didn't need it. On the other hand, I cleaned up and simplified the demuxing stuff a lot in the last years. So in principle it shouldn't be too hard to re-enable. Frame- and sample accurate seeking is defintely a nice feature.

I'll put this onto my TODO list. Let's keep this issue open because I might have some questions.

bplaum commented 7 months ago

First question: Originally I built indices for the files and saved them. Do you think this is still neccesary? Parsing compressed packages for index generation is usually quite fast, I already generate seek indices on the fly for flac files. But I don't know the typical size of your files.

umlaeute commented 7 months ago

i don't really know. my users might want to scrub files that can be held in memory entirely (decoded), but they are equally likely to attempt doing the same thing on feature-film sized footage.

in the first case, it might be possible to educate them to just decodes the film and store the uncompressed frames in memory (we have constructs that allow users to do that), but in the 2nd case we are relying on the decoding library.

bplaum commented 7 months ago

Ok, so I think I'll just measure the time needed for building the index and the decide whether to save it or not.

Another point: In the past, I supported sample/frame accurate seeking for multiple streams in a file. This implied, that e.g. the audio- and video positions can be completely different and that the demuxer must always seek in the file depending on whether an audio- or video packet is needed. I think the in few cases, where this can be really useful (e.g. for editing) it would be smarter to open the same file several times. Doing only global seeks across all enabled streams simplifies things a lot internally. And the overhead of a second demuxer instance is negligible if the unneeded streams are switched off.

It would also mean, that I remove bgav_seek_audio() and bgav_seek_video() because both can be done with bgav_seek_scaled() then. The frametable for translating between frame counts and timestamps will be re-enabled of course.

Would this be ok for your workflows?

umlaeute commented 7 months ago

i do think so.

umlaeute commented 7 months ago

note however, that it would be great if i would not need to change my code :-) (that is, it would be great if the ABI could stay compatible; i know that already a number of symbols have been removed (so don't forget to bump the soname!), but so far i didn't need any of these symbols - thus no harm on my side)

would it be possible to provide bgav_seek_audio() and bgav_seek_video() implementations that just use bgav_seek_scaled() internally?

bplaum commented 7 months ago

Actually the ABI changed already because some public structs got changed :)

I looked at your test program. I'm thinking about the following: What about 2 functions

int bgav_num_video_frames(bgav_t *, int stream);

and

void bgav_seek_to_video_frame(bgav_t *, int stream, int frame);

with the meaning, that all enabled streams are seeked to that position.

Then you need to change your code a bit, but it gets much simpler (no need for the frame table anymore). Would that be an idea?

I really want to remove bgav_seek_video() and the other per stream seek functions, because independent seeking for streams adds too much complexity for a feature, which is never really needed..

bplaum commented 7 months ago

Ok, it somehow works now. I didn't work on the public functions yet, because I'm waiting for your opinion (see my last omment).

But bgav_seek_scaled() should now be frame- or sample accurate if you request it using bgav_options_set_sample_accurate(). The index is build on demand and saved in $XDG_CACHE_HOME/gmerlin-avdecoder/indices/ if index building takes more than 2 seconds. The duration threshold is hardcoded for now, I can make it configurable as well.

You can use the seektest program e.g.

./seektest -vpos 1000 -sa your_file

in the tests directory. I seeks twice, once by decoding all frames from the beginning and with bgav_seek_scaled() and compares the frames for identity.

umlaeute commented 2 months ago

i guess this is OK :-)