Support Matroska CueDuration and CueRelativePosition for better subtitle handling on seeks

GoogleCodeExporter commented 9 years ago

See Matroska specs for the details on the elements.

The basic idea is to "cherry-pick" subtitle frames that are still valid at the 
target seek point out of the file, so that you immediately get subs after the 
seek.

A few samples are archived locally.

Original issue reported on code.google.com by h.lepp...@gmail.com on 12 Dec 2012 at 7:27

GoogleCodeExporter commented 9 years ago

To add a bit more to the mystery, 

I cannot seem to reproduce this if I remux my sample in #44 which used 
--timecode-scale 10000 (0.01ms precision),
with the default --timecode-scale 1000000 (1.0ms precision).

Original comment by cyber.sp...@gmail.com on 20 Mar 2015 at 3:00

GoogleCodeExporter commented 9 years ago

I can see the problems happening even with the vanilla nightly so I think this 
belongs into a separate bug report.

Original comment by sneaker...@googlemail.com on 20 Mar 2015 at 3:20

GoogleCodeExporter commented 9 years ago

You seem to be correct...

Testing now, I can actually reproduce the problem with the sample in #44 with 
every LAV release on Github (0.60.1, 0.61, 0.61.1, 0.61.2, 0.62, 0.63, 0.64) as 
well as nightly builds. The bug seems to be LAV specific as well, since I 
cannot reproduce it with Haali or Gabest splitters.

Original comment by cyber.sp...@gmail.com on 20 Mar 2015 at 3:59

GoogleCodeExporter commented 9 years ago

Other than the unrelated issue, does the patch seem to work properly for 
everyone now? Then I will give it a final review and merge it soon.

Original comment by h.lepp...@gmail.com on 24 Mar 2015 at 10:21

GoogleCodeExporter commented 9 years ago

I have tested mkvmerge 7.7.0 and ffmpeg files and didn't notice any issues.

Original comment by sneaker...@googlemail.com on 24 Mar 2015 at 10:40

GoogleCodeExporter commented 9 years ago

The attached sample seems to behave strangely when used with madVR. Upon seek, 
sometimes the majority of the subtitles for the frame are missing. After a 
second or so, it recovers and displays correctly.

I'd speculate what's happening here, is that LAV hasn't finished sending all 
subtitle lines for the first video frame at a seek point, prior to sending 
future video frames to madVR's CPU queue. For subtitle rendering to function 
correctly, required subtitles lines need to be received prior or synchronously 
with madVR receiving video frames.

Original comment by cyber.sp...@gmail.com on 24 Mar 2015 at 11:36

Attachments:

[Chihiro]_Tonari_no_Kaibutsu-kun_ED.7z

GoogleCodeExporter commented 9 years ago

There is no guaranteed order of sending things, it'll just demux and send as 
fast as the consumers are able to accept data. There isn't really a way to 
change that, either.

Original comment by h.lepp...@gmail.com on 24 Mar 2015 at 11:44

GoogleCodeExporter commented 9 years ago

Considering video would need to be decoded first, one would hope that the 
subtitle render can receive data faster than the video decoder handle video, 
especially since decoding video has a certain delay, so it has to decode like 
2-3 frames at least usually before the first goes to the video renderer (more 
if multi-threading).

Original comment by h.lepp...@gmail.com on 24 Mar 2015 at 11:46

GoogleCodeExporter commented 9 years ago

Should this issue be reported to the madVR people, then?

Alternatively, we could change lav so that it delays sending video packets 
until all subtitle packets for that timecode have been sent (and maybe only 
activate this behavior right after seeking). However, that solution seems like 
a bit of a hack.

Original comment by johnp...@gmail.com on 24 Mar 2015 at 5:48

GoogleCodeExporter commented 9 years ago

Seems rather risky, it might end up in deadlocks at worst, or glitching, if the 
subtitle renderer is being dumb.

Original comment by h.lepp...@gmail.com on 24 Mar 2015 at 6:14

GoogleCodeExporter commented 9 years ago

This is same patch as #43, I just got rid of a variable never used and fixed 
some minor typo.

https://github.com/onomatopellan/FFmpeg/commit/0cf2e2f

https://github.com/onomatopellan/FFmpeg/commit/0cf2e2f.patch

Original comment by onomatop...@gmail.com on 24 Mar 2015 at 7:31

GoogleCodeExporter commented 9 years ago

>Should this issue be reported to the madVR people, then?

After further testing, I was able to reproduce it in EVR as well, so it doesn't 
seem to be madVR specific.

What I did discover, was that I seem to be unable to reproduce this with MPC-HC 
VSFilter or ISR, yet xy-VSFilter/XySubFilter and Gabest VSFilter 2.39 all have 
the issue.

Nevcairiel seems to be correct, that the issue here seems somehow related to 
parsing performance by the subtitle filter. I think I've tracked it down to the 
following commits by MPC-HC which resolved the issue:

https://github.com/mpc-hc/mpc-hc/commit/1219dd2
https://github.com/mpc-hc/mpc-hc/commit/8cefb4f

Merging in those commits seems to resolve the issue with XySubFilter as well. 
I'll go through some other stuff, and try to release a new build sometime soon.

Original comment by cyber.sp...@gmail.com on 24 Mar 2015 at 7:51

GoogleCodeExporter commented 9 years ago

Attached is an XySubFilter test build which hopefully should no longer produce 
partial output when seeking the sample in comment #56.

Original comment by cyber.sp...@gmail.com on 25 Mar 2015 at 12:39

Attachments:

[XySubFilter Test Build (LAV Issue #302).7z](https://storage.googleapis.com/google-code-attachments/lavfilters/issue-302/comment-63/XySubFilter Test Build %28LAV Issue #302%29.7z)

GoogleCodeExporter commented 9 years ago

This PC is pretty slow for file #56 but yes, I think now subtitles appear 
sooner when seeking.

Original comment by onomatop...@gmail.com on 25 Mar 2015 at 1:14

GoogleCodeExporter commented 9 years ago

> This is same patch as #43, I just got rid of a variable never used and fixed 
some minor typo.

I left a comment on GitHub about a potential problem.

Original comment by h.lepp...@gmail.com on 25 Mar 2015 at 10:57

GoogleCodeExporter commented 9 years ago

The updated patch looks good, it should be in the next nightly build.
Thanks for everyone that helped!

Original comment by h.lepp...@gmail.com on 26 Mar 2015 at 11:18

Changed state: Fixed

GoogleCodeExporter commented 9 years ago

I have a problem with files muxed by ffmpeg when there are 2 subtitle lines at 
the same time. In this case ffmpeg seems to use a different cue structure than 
mkvmerge. Mkvmerge creates one CuePoint for each of the lines while ffmpeg puts 
both into a single CuePoint. You can see the difference in mkvinfo.

I have created a sample with two subtitle lines spanning from 0:0:10 to 0:0:25. 
If you seek into the middle of them using LAV splitter nightly:
- mkvmerge version: both lines get displayed like expected
- ffmpeg version: only one of the lines appears

Original comment by sneaker...@googlemail.com on 15 Apr 2015 at 4:28

Attachments:

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

I can reproduce this behavior with the supplied ffmpeg.mkv file. In the 
original version of the patch I posted above, I modified parseCues so that it 
handled this correctly by creating a separate entry in the cues array for each 
CueTrackPositions element. It looks like those changes got lost when the patch 
was simplified. Adding back some variant of those changes to parseCues should 
probably fix the issue.

Original comment by johnp...@gmail.com on 15 Apr 2015 at 7:33

GoogleCodeExporter commented 9 years ago

I believe ffmpeg's behavior is wrong. From what I understand, it should create 
one Cue for every subtitle line - and mkvmerge is practically the spec, so 
shrug.

Its supposed to be efficient, ie. every block should have a cue that points to 
it, and not needed to scan ahead a lot.

But I'll look into it.

Original comment by h.lepp...@gmail.com on 15 Apr 2015 at 7:39

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

> I believe ffmpeg's behavior is wrong.

The spec seems a bit unclear to me about whether it is permissible to have 
sibling CueTrackPositions elements that are for the same track.

On the one hand, the description of CuePoint says "Contains all information 
relative to a seek point in the segment." which would seem to imply that all 
cue information for a given timecode should be in the same CuePoint; ie., that 
ffmpeg's behavior is not only permissible, but actually required.

On the other hand, the description of CueTrackPositions says "Contain positions 
for different tracks corresponding to the timestamp." which would seem to imply 
that the you should not have multiple CueTrackPositions elements for the same 
track within a given CuePoint. 

> Its supposed to be efficient, ie. every block should have a cue that points 
to it, and not needed to scan ahead a lot.

I don't quite understand what your talking about here. Supporting ffmpeg's 
behavior doesn't require scanning ahead at all. The only difference is that 
when you parse the cues data when you initially read the file, you need to 
create a separate entry in the cues array for each CueTrackPositions element, 
instead of for each CuePoint.

Original comment by johnp...@gmail.com on 15 Apr 2015 at 8:00

GoogleCodeExporter commented 9 years ago

Also, even if ffmpeg's behavior is wrong, the current code will still fail when 
there are CueTrackPositions elements for two different subtitle tracks within 
the same CuePoint. This is because the current parseCues code would only add 
one of them to the cues array, even though both should be added.

Original comment by johnp...@gmail.com on 15 Apr 2015 at 8:41

GoogleCodeExporter commented 9 years ago

I think the efficiency comment was about a comment I got per mail, but 
apparently was deleted after since its no longer here.

I briefly tried to come up with a smart solution, but because there is no 
guarantees that CueTime appears before the CueTrackPositions entry, it got kind 
of annoying. Will probably have to do something evil tomorrow.

Original comment by h.lepp...@gmail.com on 15 Apr 2015 at 10:07

GoogleCodeExporter commented 9 years ago

> there is no guarantees that CueTime appears before the CueTrackPositions entry

I believe the version of parseCues I included in the original patch correctly 
handles this case.

Original comment by johnp...@gmail.com on 15 Apr 2015 at 10:23

GoogleCodeExporter commented 9 years ago

I emailed the mkvmerge author asking him about his thoughts on ffmpeg's 
behavior. He wasn't previously aware that it is common for subtitles to have 
multiple packets (blocks) in the same track with the same timecode, but wrote:

"I don't think that there's anything inherently non-compliant about
having multiple CueTrackPositions elements with the same CueTrack inside
the same CuePoint (the spec language doesn't seem to be strong enough
for disallowing it)."

Original comment by johnp...@gmail.com on 16 Apr 2015 at 3:14

GoogleCodeExporter commented 9 years ago

After further correspondence, he indicated that while ffmpeg's behavior is 
compliant, he'd still prefer that ffmpeg not have that behavior because it 
avoids any confusion.

Original comment by johnp...@gmail.com on 16 Apr 2015 at 7:33

GoogleCodeExporter commented 9 years ago

Latest LAV can deal with that BTW.

Original comment by h.lepp...@gmail.com on 16 Apr 2015 at 7:34

GoogleCodeExporter commented 9 years ago

I don't think the ffmpeg author responsible will change it just because Mosu 
think's it's better. Nev has already pushed a fix and I think it should be left 
at that. I don't really see how ffmpeg's solution is bad, esp. since it 
produces less overhead than mkvmerge.

Original comment by sneaker...@googlemail.com on 16 Apr 2015 at 7:36

GoogleCodeExporter commented 9 years ago

> Nev has already pushed a fix and I think it should be left at that.

I agree.

Original comment by johnp...@gmail.com on 16 Apr 2015 at 7:48

stevenrennermpls / lavfilters

Support Matroska CueDuration and CueRelativePosition for better subtitle handling on seeks #302