shaka-project / shaka-player

JavaScript player library / DASH & HLS client / MSE-EME player
Apache License 2.0
7.18k stars 1.34k forks source link

Sample information in trun box should be used to get the start time & duration for WebVTT in MP4 #699

Closed rkuroiwa closed 7 years ago

rkuroiwa commented 7 years ago

Steps:

  1. Goto https://shaka-player-demo.appspot.com/demo/
  2. Select "Sintel 4k (multicodec, VTT in MP4)"
  3. Play.
  4. Seek to around 1:50.

Expected result: "It has shed much innocent blood." should show up from 1:51.800 to 1:55.800 and The line with "You're a fool for traveling alone." should show up from 1:58.000 to 2:01.450.

Actual result: Both cues show up at 1:50 (before 1:51) and goes till 2:00.

Additional info: https://storage.googleapis.com/shaka-demo-assets/sintel-mp4-wvtt/s-eng.mp4 tl;dr I think the player is creating a cue per 'mdat' box. For the content above, the 12-th 'mdat' box contains the cues. But the sample timing information is actually in moof > traf > trun box.

Looking at mp4_vtt_parser.js, shaka.media.Mp4VttParser takes start and end timestamps, and all the cues in 'mdat' box are converted to VTTCues using the timestamps. IIUC, these values are coming from the sidx box, hence both cues show up at the same time and for the same duration. The sidx box contains the subsegment information but not the actual cue timings. Instead info in 'trun' box should be used for the corresponding 'mdat' box.

I am working on a patch to generate MP4 files with WebVTT, and this is the file that I generated https://storage.googleapis.com/wvtemp/rkuroiwa/vtt_in_mp4_test_delete_me_when_done/sintel-subtitle-eng.mp4 The MPD that uses it is at https://storage.googleapis.com/wvtemp/rkuroiwa/vtt_in_mp4_test_delete_me_when_done/with_subtitle.mpd This mp4 file is not subsegmented and all the cues are in an mdat box so all of them show up from the beginning of the video.

ismena commented 7 years ago

@rkuroiwa thanks for the detailed report!

@theodab @vaage-google if one of you guys wants some practice with our text parsers and mp4 this might be a good bug for it. (There is a program called iso viewer that translates mp4 files into human-readable form, another tool is http://thumb.co.il/).

Otherwise, I'll look at this next week if @TheModMaker doesn't get to it first :)

vaage commented 7 years ago

@ismena I can look into it.

kvarro commented 7 years ago

Hello, I have submitted a merge request which I believe will fix this issue. https://github.com/google/shaka-player/pull/721

joeyparrish commented 7 years ago

@kvarro, we have a fix in progress, but we are taking a little extra time to refactor the MP4 parser class. Thanks for the contribution, though!

joeyparrish commented 7 years ago

The fixes related to this will not cherry-pick cleanly to v2.0.x, so this fix will first appear in v2.1.0.