FFMS / ffms2

An FFmpeg based source library and Avisynth/VapourSynth plugin for easy frame accurate access

Seeking failures when a B frame depends on a future I frame #403

Closed benrg closed 9 months ago

benrg commented 2 years ago

This video (extracted from the beginning of https://www.libde265.org/hevc-bitstreams/elephants-dream-1080-cfg02.mkv) illustrates the problem, but I think it affects any video that has the property in the title.

If you request frames starting from 0, the video decodes correctly, but if you start from 41 (which is identified as a key frame), all frames from 41 to 48 are copies of frame 48, after which it starts to decode correctly.

Frame 41 in coded order is an I frame, but it's frame 48 in presentation order, and is followed in coded order by B frames that are 44, 42, 41, 43, 46, ... in presentation order.
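A small sketch may make the ordering concrete. It lists only the frames named above (the rest of the sequence is elided in the report), and the tuple layout and function name are illustrative assumptions, not anything taken from FFMS2 or FFmpeg.

```python
# Frames in coded (decode) order starting at coded position 41, as listed above.
# Each entry is (coded_index, presentation_index, frame_type).
# Only the frames named in the report are included; the rest is elided there.
CODED = [
    (41, 48, "I"),
    (42, 44, "B"),
    (43, 42, "B"),
    (44, 41, "B"),
    (45, 43, "B"),
    (46, 46, "B"),
]


def presentation_order(frames):
    """Return the frames sorted by presentation index (display order)."""
    return sorted(frames, key=lambda frame: frame[1])


if __name__ == "__main__":
    for coded_idx, pres_idx, ftype in presentation_order(CODED):
        print(f"display {pres_idx}: {ftype} frame at coded position {coded_idx}")
```

Among these frames the I-frame is first in coded order but last in display order, which is consistent with frame 41 being identified as a key frame when the index is built from coded positions.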

The above is what happens in a build from the current head (ff61bca) against ffmpeg 5.0.1. The official 2.40 binaries behave a bit differently: 41 to 44 are copies of 48, then 45 gives you 49, and all later frames are off by 4. If you start from 0 you get the correct frames.

benrg commented 2 years ago

Here's a simpler test case, H.264 this time. There are five frames, IBIBI in presentation order. If you request frame 4 (0-based) and then frame 2, you get frame 4 twice.

works.mp4 in the same archive doesn't have the problem, even though it contains exactly the same video. broken.h264 and works.mp4 were created by x264-r3095-baee400.exe --open-gop, and broken.mp4 was created by ffmpeg -i works.mp4 -c copy broken.mp4 (ffmpeg 5.0.1).

myrsloik commented 9 months ago

Reproduced. It is however caused by FFmpeg marking all I-frames as IDR-frames on stream copy. Most files in the world could be secretly broken. Have a nice evening.

benrg commented 9 months ago

Given that:

I think this issue should be left open. That way it's easier to find if someone else runs into the problem, and maybe someone will even submit a patch for it, even if you never work on it. If it must be closed, at least close it as wontfix instead of completed.

dwbuiten commented 9 months ago

AFAIK, there is no reasonable way to support or work around this since, for example, the stss box in those MP4s marks non-IDR I-frames as IRAP.

Or well, there is no way to work around it unless FFmpeg decides to expose more than I, P, and B as frame types in its API, since this is the root cause of both the remuxing issue and the inability to work around the bad remuxes. broken.h264 and the mkv are busted for the same reason (they rely on the parser).

There might be an open bug on FFmpeg Trac (as this is an FFmpeg issue), but I didn't find one.

myrsloik commented 9 months ago

I did encounter this bug when googling yesterday: https://trac.ffmpeg.org/ticket/8820 Basically it implies that at some point FFmpeg sometimes implicitly leaked which frames can truly be sought to when skipping. I failed to reproduce it even with older builds and then proceeded to inelegantly poke at FFmpeg's internals and the parsed list of stss keyframes. Which, as we've stated, is indeed different between the clips, and FFmpeg does use

benrg commented 9 months ago

I looked at this a bit and I'm not sure that marking I-frames as key frames is the problem.

According to this answer, FFmpeg only marks IDR-frames and recovery-point I-frames as key frames. The I-frames in broken.* are marked as recovery points with recovery_frame_cnt = 0, which means, I think, that you can seek there and correctly decode every frame that's later in presentation order. So having a key frame there seems like a good thing.
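Here is a minimal sketch of that reading of recovery_frame_cnt = 0, under loud assumptions: the coded order I0 I2 B1 I4 B3 for the five-frame clip is my guess at a typical open-GOP layout rather than something read from the file, and the rule "everything at or after the seek point in presentation order decodes" is the simplified interpretation described above, not decoder behaviour I have verified. The function name is hypothetical.

```python
# Assumed coded (decode) order for the IBIBI clip: (presentation_index, type).
# This layout is an illustrative assumption, not parsed from broken.mp4.
CODED = [(0, "I"), (2, "I"), (1, "B"), (4, "I"), (3, "B")]


def frames_after_seek(coded, start):
    """Split frames fed to the decoder from coded position `start` onward.

    Returns (decodable, dropped) as sorted presentation indices, using the
    simplified rule that a frame decodes correctly only if it is at or after
    the seek point in presentation order.
    """
    key_pts = coded[start][0]
    decodable, dropped = [], []
    for pts, _ftype in coded[start:]:
        (decodable if pts >= key_pts else dropped).append(pts)
    return sorted(decodable), sorted(dropped)


if __name__ == "__main__":
    # Seek to the second I-frame (coded position 1, presentation index 2).
    print(frames_after_seek(CODED, 1))
```

Under this model, seeking to frame 2 leaves only the leading B-frame (frame 1) undecodable, which is why a key frame at that position looks harmless.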

With Elephants Dream, the problem is that FFmpeg returns the packets in decoding order with no information about the correct presentation order. So FilePos, OriginalPos, FrameType, and KeyFrame in FFMS2's index are all wrong, and I'm impressed that it still manages to return the right frame some of the time.

Muxing to mp4 or mkv using FFmpeg doesn't solve the problem because FFmpeg writes the same bogus PTSs to the muxed file, and then trusts them when reading it later, so you end up with the same index. This is definitely a bug in FFmpeg unless I'm missing something.

Muxing to mkv using MKVToolNix does solve the problem. It generates its own seemingly correct timestamps, the FFMS2 index is correct, and I wasn't able to reproduce this issue. Remuxing it with FFmpeg produces another working file, so the problem is just the initial generation of the timestamps. The file generated by MKVToolNix has cue points for all three I-frames (even though only the first is IDR), and all three I frames are marked as key frames in FFMS2's index, but it works anyway.

ffprobe -show_frames lists the frames in presentation order, so it is possible to get that information from the ffmpeg libraries. But it's very slow. It seems to be completely decoding the video even though it only displays a few basic properties of it. I don't know whether it's possible to get the information from FFmpeg without the speed hit.

With broken.*, the index is correct (ignoring the key-frame issue), but for some reason, when attempting to seek to frame 2 and decode it, although the seek succeeds and the appropriate data seems to be going to avcodec_send_packet, avcodec_receive_frame ends up failing with AVERROR_EOF. Frame 2 is an I-frame and a recovery point, and we're only trying to decode that frame, so I can't see any reason why it shouldn't work. With works.mp4, because frame 2 isn't marked as a key frame, decoding starts from frame 0 instead, and that ends up working. So I'm not convinced that marking frame 2 as a key frame is the problem in this case either. It's somehow triggering another problem.

myrsloik commented 9 months ago

> So I'm not convinced that marking frame 2 as a key frame is the problem in this case either. It's somehow triggering another problem.

Internally seeking is (nowadays) simply done with av_seek_frame(FormatContext, VideoTrack, Frames[n].PTS, AVSEEK_FLAG_BACKWARD); meaning that whatever frame at or before the timestamp FFmpeg thinks is usable will be the target. So if FFmpeg internally has a somewhat correct idea (which it does, hidden deeply inside the mov/mp4 parser) it will still pick an earlier frame and work.
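A simplified model of that selection rule, as a sketch only: it is not FFmpeg's code, it uses frame numbers as stand-in timestamps, and the key-frame lists for the two files are assumptions based on the descriptions earlier in the thread (every I-frame marked in broken.mp4, only frame 0 in works.mp4). The function name is hypothetical.

```python
from bisect import bisect_right


def seek_backward(keyframe_pts, target_pts):
    """Return the latest key-frame timestamp at or before target_pts.

    Returns None if no key frame precedes the target. This models only the
    "at or before" choice implied by a backward seek, nothing else.
    """
    keys = sorted(keyframe_pts)
    i = bisect_right(keys, target_pts)
    return keys[i - 1] if i else None


if __name__ == "__main__":
    broken_keys = [0, 2, 4]  # assumed: all I-frames flagged as key frames
    works_keys = [0]         # assumed: only the IDR frame flagged
    print(seek_backward(broken_keys, 2))  # decoding would start at frame 2
    print(seek_backward(works_keys, 2))   # decoding would start at frame 0
```

With the assumed lists, the same request for frame 2 starts decoding at frame 2 in one file and at frame 0 in the other, so the two files exercise different decoder paths even though the video data is identical.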

There's a huge number of bugs related to seeking and timestamps in FFmpeg, that's why I kinda gave up and wrote https://github.com/vapoursynth/bestsource instead which just linearly decodes everything. Snappy results on most things up to 30 min long on modern hardware.