video-dev / hls.js

HLS.js is a JavaScript library that plays HLS in browsers with support for MSE.
https://hlsjs.video-dev.org/demo

WebVTT Caption Alignment #5775

Closed streamingsystems closed 12 months ago

streamingsystems commented 1 year ago

What do you want to do with Hls.js?

Hi All,

I searched Google and GitHub without any luck. If there is already an answer, please share :)

I am creating my own master and variant playlists including WebVTT captioning. I am having a little trouble shifting/delaying the captions to line up exactly where I need them to be.

I have a DISCONTINUITY in my variant playlist which causes the PTS in the TS file to reset to 0 (or close to 0) right before the captions start.

Per the Apple specification it seems like I can use:

X-TIMESTAMP-MAP

To map between the PTS in the TS chunks and the cue times in the subtitles.

However, I read a posting that hls.js does not seem to use PTS from the TS. In that same post it seems to imply that hls.js uses EXT-X-PROGRAM-DATE-TIME, but I could not understand if that was definitive.

Could you please explain the process that hls.js uses to match the cue times (e.g. 00:00:02.060 --> 00:00:05.600) to the TS files that are coming down (which contain PTS timestamps)?

(I tried to read over the source, but I am not as proficient in JavaScript as I am in other languages.)

Thanks!

What have you tried so far?

I did some searches on GitHub and Google.

robwalch commented 1 year ago

hls.js uses PTS when mapping with WebVTT headers X-TIMESTAMP-MAP (it has to). Just make sure you have the same discontinuities in your subtitle playlists that you have in your video/audio playlists. If you have a DISCONTINUITY at 60 seconds in your media playlists, you need to have one in subtitles as well so that all subtitle segments can be aligned with audio and video appropriately.
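
As a rough illustration of the point above, the following sketch checks whether two media playlists place their discontinuities at the same cumulative time. It is not hls.js code: `discontinuityTimes`, `discontinuitiesAligned`, the 0.5 second tolerance, and the sample playlists are all hypothetical and exist only for this example.

```typescript
// Hypothetical helper (not part of hls.js): returns the cumulative time, in
// seconds, at which each EXT-X-DISCONTINUITY tag appears in a media playlist.
function discontinuityTimes(playlist: string): number[] {
  const times: number[] = [];
  let elapsed = 0;
  for (const rawLine of playlist.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("#EXTINF:")) {
      // "#EXTINF:<duration>,[<title>]" -> parseFloat stops at the comma.
      elapsed += parseFloat(line.slice("#EXTINF:".length));
    } else if (line === "#EXT-X-DISCONTINUITY") {
      // The tag applies to the next segment, which starts at `elapsed`.
      times.push(elapsed);
    }
  }
  return times;
}

// Hypothetical check: same number of discontinuities, at matching times.
function discontinuitiesAligned(a: string, b: string, toleranceSec = 0.5): boolean {
  const ta = discontinuityTimes(a);
  const tb = discontinuityTimes(b);
  return ta.length === tb.length && ta.every((t, i) => Math.abs(t - tb[i]) <= toleranceSec);
}

// Made-up playlists with a discontinuity after 60 seconds of media.
const video = [
  "#EXTM3U", "#EXT-X-TARGETDURATION:30",
  "#EXTINF:30.0,", "v0.ts",
  "#EXTINF:30.0,", "v1.ts",
  "#EXT-X-DISCONTINUITY",
  "#EXTINF:30.0,", "v2.ts",
].join("\n");

const subtitles = [
  "#EXTM3U", "#EXT-X-TARGETDURATION:30",
  "#EXTINF:30.0,", "s0.vtt",
  "#EXTINF:30.0,", "s1.vtt",
  "#EXT-X-DISCONTINUITY",
  "#EXTINF:30.0,", "s2.vtt",
].join("\n");

// Same subtitle playlist with the discontinuity tag left out.
const subtitlesMissingTag = subtitles.replace("#EXT-X-DISCONTINUITY\n", "");

console.log(discontinuitiesAligned(video, subtitles));
console.log(discontinuitiesAligned(video, subtitlesMissingTag));
```

The first check passes and the second fails, because the second subtitle playlist has no discontinuity to pair with the one in the video playlist.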

streamingsystems commented 1 year ago

Hi Rob,

Thanks for getting back to me, I appreciate you taking your time to reply.

As I understand HLS, there is a “PTS” in the TS file itself, which is based on a 90 kHz clock. This PTS is set to “0” when the stream starts or when there is a discontinuity, and all subsequent frames in that TS file (and the following TS files) are relative to “0”.

There is also,

EXT-X-PROGRAM-DATE-TIME

Which seems to be the “start time” of the TS that follows it in the playlist. Does hls.js use this at all for anything?

There is also:

X-TIMESTAMP-MAP

Which sets the relationship between the PTS in the TS file and the times for the cues.
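
For illustration, a header such as `X-TIMESTAMP-MAP=MPEGTS:900000,LOCAL:00:00:00.000` says that cue time LOCAL corresponds to the 90 kHz timestamp MPEGTS. A minimal sketch of reading those two values is shown below. The names `parseVttTime`, `parseTimestampMap`, and `TimestampMap` are made up for this example and are not the hls.js API.

```typescript
// Hypothetical result type: MPEGTS in 90 kHz ticks, LOCAL in seconds.
interface TimestampMap {
  mpegts: number;
  localSec: number;
}

// Parses a WebVTT timestamp ("hh:mm:ss.mmm" or "mm:ss.mmm") into seconds.
function parseVttTime(ts: string): number {
  const parts = ts.trim().split(":");
  if (parts.length < 2 || parts.length > 3) {
    throw new Error(`Unrecognized timestamp: ${ts}`);
  }
  const nums = parts.map((p) => Number(p));
  if (nums.some((n) => Number.isNaN(n))) {
    throw new Error(`Unrecognized timestamp: ${ts}`);
  }
  const seconds = nums[nums.length - 1];
  const minutes = nums[nums.length - 2];
  const hours = nums.length === 3 ? nums[0] : 0;
  return hours * 3600 + minutes * 60 + seconds;
}

// Parses one X-TIMESTAMP-MAP header line; the two fields may be in either order.
// Returns null when the line is not a complete X-TIMESTAMP-MAP header.
function parseTimestampMap(headerLine: string): TimestampMap | null {
  const prefix = "X-TIMESTAMP-MAP=";
  const line = headerLine.trim();
  if (!line.startsWith(prefix)) return null;

  let mpegts: number | undefined;
  let localSec: number | undefined;
  for (const field of line.slice(prefix.length).split(",")) {
    const sep = field.indexOf(":");
    if (sep < 0) continue;
    const key = field.slice(0, sep).trim();
    const value = field.slice(sep + 1).trim();
    if (key === "MPEGTS") mpegts = parseInt(value, 10);
    else if (key === "LOCAL") localSec = parseVttTime(value);
  }
  if (mpegts === undefined || Number.isNaN(mpegts) || localSec === undefined) return null;
  return { mpegts, localSec };
}

const map = parseTimestampMap("X-TIMESTAMP-MAP=MPEGTS:900000,LOCAL:00:00:00.000");
if (map !== null) {
  // 900000 ticks at 90 kHz is 10 seconds of PTS for cue time 0.
  console.log(map.mpegts / 90000, map.localSec);
}
```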

In this post you mention:

https://github.com/video-dev/hls.js/issues/3987

“Your VTT times should be relative to MPEGTS 0. Hls.js uses initPTS as the starting PTS time of the stream, and then plots VTT cues based on this on the content of your WebVTT files.”

“HLS.js does not align VTT tracks based on PTS. It depends on ProgramDateTime to align subtitle tracks. MediaSequenceNumber is not used because the spec does not require alignment based on SN.”

I am not sure exactly what you meant by those comments. I was thinking that some alignment depends on “EXT-X-PROGRAM-DATE-TIME”.

Could you please let me (and others who might read this in the future) know what hls.js looks at when it first loads, and how it plots the cues? And as the video plays and new TS and VTT files are downloaded, what is the flow?

Once again today I tried to trace through the code, but as mentioned, the languages used are not my forte, so it is harder for me than it would be in languages I am familiar with.

I have a few follow-up questions but will wait, as you might answer them in your reply.

Thanks!

-Rob

(Sent by email in reply to Rob Walch's comment of Thursday, August 31, 2023 at 3:51 PM on [video-dev/hls.js] WebVTT Caption Alignment, Issue #5775.)


robwalch commented 1 year ago

initPTS, for each discontinuity sequence, maps the difference between the HTMLMediaElement timeline and the PTS or presentation timestamps. If the first segment has a PTS of 0, then the first initPTS will be 0. An initPTS is established for each discontinuity sequence, referred to as cc index.
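
One way to picture this is a small table keyed by cc. The sketch below is not the hls.js implementation. It assumes that initPTS is the first PTS seen in a discontinuity sequence minus that sequence's start position on the media timeline (in 90 kHz ticks), and it ignores PTS wraparound. `InitPtsTable` and its methods are hypothetical names.

```typescript
const MPEG_TS_CLOCK_HZ = 90000;

// Hypothetical table of initPTS values, one per discontinuity sequence (cc).
class InitPtsTable {
  private readonly byCc = new Map<number, number>();

  // Establishes initPTS for a cc the first time it is seen. Assumption:
  // initPTS = first PTS - (timeline start of this cc, in 90 kHz ticks).
  establish(cc: number, firstPts: number, timelineStartSec: number): number {
    const existing = this.byCc.get(cc);
    if (existing !== undefined) return existing;
    const initPts = firstPts - Math.round(timelineStartSec * MPEG_TS_CLOCK_HZ);
    this.byCc.set(cc, initPts);
    return initPts;
  }

  // Converts a PTS within a given cc to seconds on the media timeline.
  toTimelineSeconds(cc: number, pts: number): number {
    const initPts = this.byCc.get(cc);
    if (initPts === undefined) {
      throw new Error(`No initPTS established for cc ${cc}`);
    }
    return (pts - initPts) / MPEG_TS_CLOCK_HZ;
  }
}

const table = new InitPtsTable();
// First segment has PTS 0 at timeline 0, so the first initPTS is 0.
table.establish(0, 0, 0);
// After a DISCONTINUITY at 60 s the PTS resets to 0, so initPTS is negative.
table.establish(1, 0, 60);
// A frame 2 s into the second discontinuity sequence maps to 62 s.
console.log(table.toTimelineSeconds(1, 2 * MPEG_TS_CLOCK_HZ));
```

Under these assumptions, a PTS that resets to 0 at a discontinuity still lands at the right place on the timeline, because each cc carries its own offset.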

Playlist alignment may be performed based on PDT (see alignMediaPlaylistByPDT), but not the actual subtitle cue timing, which is handled in parseWebVTT. Playlist alignment does not depend on PDT, since it is not required by HLS, but providing inconsistent date values across playlists and across discontinuity sequences could certainly impact playlist alignment.

Playlist alignment ultimately impacts which segments are loaded (not subtitle plotting). Just as DISCONTINUITY sequences should be aligned across all playlists so should PROGRAM-DATE-TIME. Having different PDTs or discontinuities across playlists can create ambiguity that clients cannot resolve.

The actual plotting of subtitle cues loaded from subtitle segments is handled in the parsing of the VTT. This is where the media PTS offset (initPTS) and the X-TIMESTAMP-MAP MPEGTS and LOCAL values are used to plot subtitle start and end times:

https://github.com/video-dev/hls.js/blob/622781602bd3b209f00001a669a578a1c76802bc/src/utils/webvtt-parser.ts#L119-L152
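
The arithmetic can be sketched roughly as follows. This is a simplified illustration, not the code at the link above. It assumes a cue at local time LOCAL has presentation timestamp MPEGTS, that initPTS is the PTS corresponding to media timeline 0 for the current discontinuity sequence, and that PTS wraparound can be ignored. `plotCue`, `TimestampMap`, and `PlottedCue` are hypothetical names.

```typescript
const MPEG_TS_CLOCK_HZ = 90000;

// Hypothetical types for this sketch.
interface TimestampMap {
  mpegts: number;   // X-TIMESTAMP-MAP MPEGTS value, in 90 kHz ticks
  localSec: number; // X-TIMESTAMP-MAP LOCAL value, in seconds
}

interface PlottedCue {
  start: number; // seconds on the media timeline
  end: number;   // seconds on the media timeline
}

// Shifts a cue from WebVTT local time onto the media timeline.
function plotCue(
  cueStartSec: number,
  cueEndSec: number,
  map: TimestampMap,
  initPts: number
): PlottedCue {
  // Timeline position of cue time LOCAL, minus LOCAL itself, gives the shift.
  const offsetSec = (map.mpegts - initPts) / MPEG_TS_CLOCK_HZ - map.localSec;
  const duration = cueEndSec - cueStartSec;
  const start = Math.max(cueStartSec + offsetSec, 0);
  return { start, end: Math.max(start + duration, 0) };
}

// Cue 00:00:02.060 --> 00:00:05.600 in a segment that follows a
// discontinuity at 60 s, where the PTS restarted at 0 (initPTS = -60 s in ticks).
const plotted = plotCue(
  2.06,
  5.6,
  { mpegts: 0, localSec: 0 },
  -60 * MPEG_TS_CLOCK_HZ
);
console.log(plotted); // approximately { start: 62.06, end: 65.6 }
```

In this sketch, delaying every cue in a segment by a fixed amount comes down to changing MPEGTS (or LOCAL) in that segment's header rather than rewriting each cue time.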

robwalch commented 1 year ago

“I am creating my own master and variant playlists”

Have you tried validating your work with the HLS Tools mediastreamvalidator?