Closed rkuroiwa closed 7 years ago
@rkuroiwa thanks for the detailed report!
@theodab @vaage-google if one of you guys wants some practice with our text parsers and mp4 this might be a good bug for it. (There is a program called iso viewer that translates mp4 files into human-readable form, another tool is http://thumb.co.il/).
Otherwise, I'll look at this next week if @TheModMaker doesn't get to it first :)
@ismena I can look into it.
Hello, I have submitted a merge request which I believe will fix this issue. https://github.com/google/shaka-player/pull/721
@kvarro, we have a fix in progress, but we are taking a little extra time to refactor the MP4 parser class. Thanks for the contribution, though!
The fixes related to this will not cherry-pick cleanly to v2.0.x, so this fix will first appear in v2.1.0.
What version of Shaka Player are you using? v2.0.5 on demo page.
Can you reproduce the issue with our latest release version?
Can you reproduce the issue with the latest code from
master
?Are you using the demo app or your own custom app? The demo app on https://shaka-player-demo.appspot.com/demo/ which says "Shaka Player v2.0.5-debug"
What browser and OS are you using? Chrome on Linux. Google Chrome 56.0.2924.87 (Official Build) (64-bit) Revision 0e9a9a6f3676ae439b78cd9b3f62b4193c3ac7d5-refs/branch-heads/2924@{#895} OS Linux
Steps:
Expected result: "It has shed much innocent blood." should show up from 1:51.800 to 1:55.800 and The line with "You're a fool for traveling alone." should show up from 1:58.000 to 2:01.450.
Actual result: Both cues show up at 1:50 (before 1:51) and goes till 2:00.
Additional info: https://storage.googleapis.com/shaka-demo-assets/sintel-mp4-wvtt/s-eng.mp4 tl;dr I think the player is creating a cue per 'mdat' box. For the content above, the 12-th 'mdat' box contains the cues. But the sample timing information is actually in moof > traf > trun box.
Looking at
mp4_vtt_parser.js
,shaka.media.Mp4VttParser
takes start and end timestamps, and all the cues in 'mdat' box are converted to VTTCues using the timestamps. IIUC, these values are coming from the sidx box, hence both cues show up at the same time and for the same duration. The sidx box contains the subsegment information but not the actual cue timings. Instead info in 'trun' box should be used for the corresponding 'mdat' box.I am working on a patch to generate MP4 files with WebVTT, and this is the file that I generated https://storage.googleapis.com/wvtemp/rkuroiwa/vtt_in_mp4_test_delete_me_when_done/sintel-subtitle-eng.mp4 The MPD that uses it is at https://storage.googleapis.com/wvtemp/rkuroiwa/vtt_in_mp4_test_delete_me_when_done/with_subtitle.mpd This mp4 file is not subsegmented and all the cues are in an mdat box so all of them show up from the beginning of the video.