Dash-Industry-Forum / dash.js

A reference client implementation for the playback of MPEG DASH via Javascript and compliant browsers.
http://reference.dashif.org/dash.js/nightly/samples/dash-if-reference-player/index.html

TTML captions not showing #640

Closed jmacfl closed 9 years ago

jmacfl commented 9 years ago

Hi Dash Team,

I am having trouble getting TTML closed captions to show for a live stream, and I could use some help understanding how best to get them to show up.

First off, the content provider for the stream is using a MIME type/Content-Type of video/mp4, not application/ttml+xml as the dash.js player seems to expect. When I asked the stream's content provider to change that, their response indicated that application/ttml+xml is no longer the correct MIME type for TTML and was only used in an outdated spec version (response below):

The url which has xml+ttml is 1.7.4, the older version (and has encryption on the textstream which was also removed). In 1.7.7 and on this was changed to be mp4 (so ttml wrapped as isobmff, codec type ‘stpp’) which the UVU/DVB/DASH spec now mandate. In short, the 1.7.7 (and on) output is the output to look at.

So it seems they are not willing to use the older 1.7.4 content-type (plus that version has encryption applied, which it doesn't look like the dash.js player supports). I tried running the content through Charles and changing the Content-Type header, and I do see that the player is parsing the text fragments and that they are valid per TTMLParser.js, but they are never rendered to the screen. I am currently assuming it is a timeline issue of some kind, but I am really just guessing. The stream is:

http://s91.acdn.quickplay.com/live/ss/4579/s/livech91028/livech91028.isml/livech91028.mpd

I tested on both Chrome (Windows 7) and IE11 (Windows 8.1) with the same results.

Note: not all content on this stream provides closed captions (there are text fragments throughout, but they may not contain any speaker information if no CC is provided for that show).

Note2: you will need to disable time sync per issue https://github.com/Dash-Industry-Forum/dash.js/issues/601 to allow playback of the stream for anything more than a few seconds.

Note3: I tried to find details on which specific spec version is supported by the dash.js player but couldn't find anything other than a reference to the SDP US profile, which I believe we comply with.

Thanks, Jeff

dsparacio commented 9 years ago

With that MPD, the text adaptation set is set to the correct mimeType of application/mp4 and codec stpp. This should invoke the fragmented text track in dash.js.

contentType="text" mimeType="application/mp4"
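A minimal sketch of the check described above, assuming the adaptation set is available as a plain object. The function name `isFragmentedTextAdaptation` and the object shape are hypothetical and are not the dash.js API.

```javascript
// Hypothetical helper, not part of dash.js: decides whether an adaptation set
// describes TTML wrapped in ISOBMFF (mimeType application/mp4, codec stpp).
function isFragmentedTextAdaptation(adaptation) {
    const mimeType = (adaptation.mimeType || "").toLowerCase();
    const codecs = (adaptation.codecs || "").toLowerCase();
    return mimeType === "application/mp4" && codecs.startsWith("stpp");
}

// Values taken from the MPD attributes quoted above.
console.log(isFragmentedTextAdaptation({
    contentType: "text",
    mimeType: "application/mp4",
    codecs: "stpp"
})); // true

// The older sidecar-style MIME type would not take the fragmented path.
console.log(isFragmentedTextAdaptation({
    contentType: "text",
    mimeType: "application/ttml+xml"
})); // false
```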

I'll test with this MPD, if available, in a bit.

Thanks, Dan

dsparacio commented 9 years ago

dash.js is seeing the CC content, creating the text track, and fetching the segmented text. The problem is either the text content itself or the TTML parser. When parsing the fragment, this is the error the parser throws:
errorMsg = "TTML document does not contain any cues";

This is why you see a CC button on the control bar (the track was created) but no CC content: the cues are NOT created due to this error. I'll see if I can tell you more.

jmacfl commented 9 years ago

Thank you, Dan. It seems that the TTMLParser xml2json converter doesn't like the whitespace in the XML. If I add this:

ttml = converter.xml_str2json(data.replace(/(\r\n|\n|\r)/gm,"")); in place of ttml = converter.xml_str2json(data);

it does find the cues (otherwise it fails with an empty "div" array). Unfortunately, it still doesn't show the captions, even though I don't see the TTML parser throwing any errors. Any thoughts? Also, I notice the xml2json library is really old; I'm not sure whether a newer version resolves this whitespace issue.
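For reference, here is the line-break stripping from the workaround above in isolation. This is only a sketch: `stripLineBreaks` is a hypothetical name and the sample string is made up for illustration.

```javascript
// Same regular expression as in the workaround: removes CRLF, LF and CR.
function stripLineBreaks(xml) {
    return xml.replace(/(\r\n|\n|\r)/gm, "");
}

// Made-up fragment body, only to show the effect on the markup.
const sample = "<div>\r\n<p begin=\"0.434\" end=\"1.435\">Hello</p>\n</div>";
console.log(stripLineBreaks(sample));
// <div><p begin="0.434" end="1.435">Hello</p></div>
```

Note that this also removes line breaks inside the cue text itself, so words split across lines would be joined without a space.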

Thanks for the help.

Jeff

dsparacio commented 9 years ago

The regex whitespace fix cleared the first cue or two, but then I get that error again. I see it create a cue and add it to the text track as a VTT cue with valid start and end times. I have to check whether that cue is added to the video before the playhead's currentTime has passed it, as it will not render otherwise. Thus, I wonder if there is a fragmented text timing issue here. We have a few fragmented text streams that play fine, so it does work, just not in this case. Most likely it is something with that content, but I'm not sure. I also notice this content does not have a lang defined for the text track.
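The playhead check mentioned above could be sketched like this. It assumes cue objects with `startTime` and `endTime` in seconds (as on a VTTCue); `isCueStillRenderable` is a hypothetical helper, not dash.js code.

```javascript
// Hypothetical check: a cue can only become active while the playback
// position is inside [startTime, endTime), so a cue that already ended
// before currentTime will never be shown during forward playback.
function isCueStillRenderable(cue, currentTime) {
    return cue.endTime > currentTime;
}

const exampleCue = { startTime: 0.434, endTime: 1.435 };
console.log(isCueStillRenderable(exampleCue, 0));    // true
console.log(isCueStillRenderable(exampleCue, 3600)); // false
```

On a live stream whose currentTime is far from zero, cues with times this small would always fail the check.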

jmacfl commented 9 years ago

Hey Dan,

Maybe I am misunderstanding (in fact it is likely), but the error is normal for the live stream. If the current fragment has no captioning, that error will be thrown, and on a live stream there will definitely be cases where there is no captioning (for example, commercials seldom have captions). So I would not expect "no cues" for a single text fragment to cancel all previous cues. Am I misunderstanding?
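To illustrate what I mean, here is a rough sketch of treating an empty fragment as "zero cues" instead of a failure. It assumes the parser throws either a string or an Error carrying the message quoted earlier; `parseFragmentTolerant` and the `parse` callback are hypothetical stand-ins, not the actual dash.js code path.

```javascript
const NO_CUES_MESSAGE = "TTML document does not contain any cues";

// parse: any function that returns an array of cues or throws.
function parseFragmentTolerant(parse, data) {
    try {
        return parse(data);
    } catch (err) {
        // Assumption: the thrown value is a string or an Error with .message.
        const message = typeof err === "string" ? err : err && err.message;
        if (message === NO_CUES_MESSAGE) {
            return []; // a fragment without captions is normal on a live stream
        }
        throw err; // anything else is still a real parsing problem
    }
}

// Stand-in parser that always reports an empty fragment.
const alwaysEmpty = () => { throw new Error(NO_CUES_MESSAGE); };
console.log(parseFragmentTolerant(alwaysEmpty, "<tt></tt>").length); // 0
```

Cues collected from earlier fragments stay untouched because the empty fragment just contributes nothing.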

Jeff

jmacfl commented 9 years ago

Hey Dan,

Looking at the point where the cue is created, it appears that the start and end times in the "currentItem" are relative to the start of the requested segment (for example "0.434" and "1.435") instead of adjusted for the overall timeline. It seems like those start/end values should have been adjusted to the absolute time for the stream instead of being relative to the specific fragment requested, but I may be incorrect. Does the spec require the content to provide absolute or relative start/end timing for the cue points? If this is off base, please just let me know.
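Just to show the kind of adjustment I have in mind, here is a sketch that shifts fragment-relative cue times by the fragment's start time. The cue shape (`start`, `end`, `text`) and `offsetCues` are hypothetical, and whether a player should do this at all depends on what the spec says.

```javascript
// Hypothetical adjustment: move cues that are timed relative to their own
// fragment onto the overall timeline by adding the fragment start time.
function offsetCues(cues, segmentStartSeconds) {
    return cues.map((cue) => ({
        ...cue,
        start: cue.start + segmentStartSeconds,
        end: cue.end + segmentStartSeconds
    }));
}

// Made-up values: a cue half a second into a fragment that starts at 100 s.
console.log(offsetCues([{ start: 0.5, end: 1.5, text: "caption" }], 100));
// [ { start: 100.5, end: 101.5, text: 'caption' } ]
```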

Thanks, Jeff

dsparacio commented 9 years ago

@jmacfl Your first comment is most likely accurate and makes sense. To be honest, I am new to fragmented text captions; I normally work with external XML and embedded captions. I'll review the spec and learn more about empty fragments and how to handle them.

Regarding the second topic, I think you are also right. The start and end times are relative. Here is a fragmented text MPD that does render: http://vm2.dashif.org/dash/vod/testpic_2s/multi_subs.mpd I'll take a look at the times in a bit, but feel free to check it out.

dsparacio commented 9 years ago

Looking at your file again, I notice that the start time on the first few captions is around 0.3 and the fourth caption has a start time of 0. So start and end time is absolutely the issue here. I need to read the spec more, and we need to ask some others, as I am not sure what it should be. Need to dig more.

jmacfl commented 9 years ago

Thanks Dan.

TobbeMobiTV commented 9 years ago

Hi Jeff and others,

The timing for the subtitles follows the media time, as for audio and video. Thus the times are relative to the AST (availabilityStartTime), or to the Period@start if that is later. In your MPD, the AST and period start are at 1970-01-01, so the start and end times should be on the order of 400 000 hours, as in http://vm2.dashif.org/dash/vod/testpic_2s/multi_subs.mpd
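As a rough check of the "400 000 hours" figure, the elapsed hours between an availabilityStartTime at the epoch and a date in July 2015 can be computed directly. The helper name and the exact wall-clock instant are illustrative assumptions.

```javascript
// Hours elapsed between the availability start time and a wall-clock instant.
function hoursSinceAvailabilityStart(wallClockIso, availabilityStartIso) {
    const elapsedMs = Date.parse(wallClockIso) - Date.parse(availabilityStartIso);
    return elapsedMs / 3600000;
}

console.log(hoursSinceAvailabilityStart("2015-07-15T00:00:00Z", "1970-01-01T00:00:00Z"));
// 399144
```

So TTML begin and end values for content produced that day would carry an hours field just under 400 000.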

BR, Torbjörn

NOTICE: This email message and any attachments are for the sole use of the intended recipient(s) and may contain confidential and/or legally privileged information. If you are not an intended recipient or his or her representative, you are hereby notified that any unauthorized review, use, disclosure or distribution is prohibited. If you have received this communication in error please immediately contact the sender by reply email and destroy all copies of the original message and attachments.

jmacfl commented 9 years ago

Torbjörn,

Thanks. I have reached out to the content provider to see if it is possible for them to configure the text fragments to return a time based on the availabilityStartTime in the MPD. Based on the requests I am seeing against the sample MPD http://s91.acdn.quickplay.com/live/ss/4579/s/livech91028/livech91028.isml/livech91028.mpd it is unclear to me what the dash.js player is expecting. The request URLs look like:

http://s91.acdn.quickplay.com/live/ss/4579/s/livech91028/livech91028.isml/dash/livech91028-text_track_0=1000-185871686.dash
http://s91.acdn.quickplay.com/live/ss/4579/s/livech91028/livech91028.isml/dash/livech91028-text_track_0=1000-185873688.dash
etc.

and the time component of that URL is a number based on the start of 1-1-1970. So is the dash.js player expecting the time in the cue point to be an ever-increasing relative value from 1-1-1970 such as:

<p region="speaker" begin="00:00:00.768" end="00:00:02.036">SCREAMING, I THOUGHT IT 1</p>
<p region="speaker" begin="00:00:02.036" end="00:00:04.036">SCREAMING, I THOUGHT IT 2</p>
<p region="speaker" begin="00:00:04.036" end="00:00:06.036">SCREAMING, I THOUGHT IT 3</p>

etc (where those occur in one or more text fragments) or would it be expecting something like:

<p region="speaker" begin="399153:00:00.768" end="399153:00:02.036">SCREAMING, I THOUGHT IT 1</p>
<p region="speaker" begin="399153:00:02.036" end="399153:00:04.036">SCREAMING, I THOUGHT IT 2</p>
<p region="speaker" begin="399153:00:04.036" end="399153:00:06.036">SCREAMING, I THOUGHT IT 3</p>

or something else?
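To make the difference between the two forms concrete, here is a small sketch that converts a TTML clock time of the form hours:minutes:seconds(.fraction) into seconds. `parseClockTime` is a hypothetical helper and deliberately ignores the frame-based and offset-time variants that TTML also allows.

```javascript
// Converts "HH:MM:SS.fff" (hours may have more than two digits) to seconds.
function parseClockTime(value) {
    const match = /^(\d{2,}):(\d{2}):(\d{2}(?:\.\d+)?)$/.exec(value);
    if (!match) {
        throw new Error("Unsupported TTML clock time: " + value);
    }
    const [, hours, minutes, seconds] = match;
    return Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds);
}

// First form: well under one second on the timeline.
console.log(parseClockTime("00:00:00.768"));
// Second form: roughly 1.4 billion seconds, i.e. counted from 1-1-1970.
console.log(parseClockTime("399153:00:00.768"));
```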

Thanks, Jeff

TobbeMobiTV commented 9 years ago

Hi Jeff,

I don't see the difference between your examples, but for all TTML content the timeline increases linearly from the start inside each period and does not restart at the segment boundary. For live, the start time is availabilityStartTime. You can see what it should look like in the DASH-IF live simulator, e.g. a segment like

http://vm2.dashif.org/livesim-dev/all_1/testpic_2s/S1/718495844.m4s (all_1 is an option to disable any access time limitations).

The TTML timing for this live source element is the same as its media time (given by the tfdt box + the startTime).

UTC time is 2015-07-15T20:21:28Z

UTC time is 2015-07-15T20:21:29Z

The next segment (718495845) has 2s later start and end times (as well as media times).

Your content uses the time template, but the principle is the same.
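The relation between the segment number and the UTC times in the cue text can be checked with a short sketch. It assumes 2-second segments (as the testpic_2s name suggests), segment numbering that starts at 0, and an availabilityStartTime at the epoch; the function names are hypothetical.

```javascript
// Media start time of a numbered segment, in seconds since availabilityStartTime.
function segmentStartSeconds(segmentNumber, segmentDurationSeconds) {
    return segmentNumber * segmentDurationSeconds;
}

// With availabilityStartTime at 1970-01-01, media time equals Unix time.
function toUtcIso(secondsSinceEpoch) {
    return new Date(secondsSinceEpoch * 1000).toISOString();
}

console.log(toUtcIso(segmentStartSeconds(718495844, 2)));
// 2015-07-15T20:21:28.000Z
console.log(toUtcIso(segmentStartSeconds(718495845, 2)));
// 2015-07-15T20:21:30.000Z
```

The first result matches the "UTC time is 2015-07-15T20:21:28Z" cue text in segment 718495844.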

//Torbjörn

jmacfl commented 9 years ago

Thanks, Torbjörn. It looks like the email had the XML formatting stripped out, which is why you didn't see the difference in my post; if you look at it on the GitHub issue directly, you can see the hidden text. In any case, your example is clear. I will try to get the content provider to provide it this way.

Thanks again, Jeff

jmacfl commented 9 years ago

@TobbeMobiTV, @AkamaiDASH - It sounds like there may be some confusion, between the TTML captioning vendor and the DASH implementation, about the correct way to represent timing for TTML on live channels. Here is their response to my request, based on the information and examples you provided:

In fragmented captions, the TTML begin & end times are relative to that fragment. We've noticed that this is a source of confusion with other customers as well. (For instance, Apple's HLS docs are vague about the interpretation of webvtt's time stamps.)

The reason why payload timestamps are offset by the fragment's start becomes obvious when you consider a 24/7 live broadcast where the start of presentation may be unclear, undefined or irrelevant.

While most players handle TTML offsets correctly, some have issues with these type of fragments.

So I am unsure how best to proceed. In an ideal world, having the dash.js player support the offset to the fragment start time would be best. If that is not something the dash.js team feels is appropriate, though, I will need to convince the vendor that their interpretation of the spec is incorrect.

Is there any chance of adding this support to the player?

Thanks, Jeff

dsparacio commented 9 years ago

Jeff, if I have time today I'll search the spec to see if I can find something on this topic. I think we can handle this use case in dash.js as long as it does not explicitly defy the spec. I think it is just a matter of understanding the use cases that need to be handled here...

jmacfl commented 9 years ago

@AkamaiDASH - Thank you, I really appreciate you taking a look.

Jeff

TobbeMobiTV commented 9 years ago

Hi Jeff and others,

MPEG-4 Part 30 (ISO/IEC 14496-30, Timed Text and other visual overlays in ISO Base Media File Format) defines how TTML is handled inside ISOBMFF boxes and says:

5.3 Timing

The top-level internal timing values in the timed text samples based on TTML express times on the track presentation timeline - that is, the track media time as optionally modified by the edit list. For example, the begin and end attributes of the element, if used, are relative to the start of the track, not relative to the start of the sample. This is shown in the figure below, using W3C TTML syntax

In this case, the start time for the track is the start of the period, unless there is a presentationTimeOffset present.

As a side remark: the timing for HLS and WebVTT is a totally unrelated matter. For WebVTT inside ISOBMFF, the same document says that the timing is handled entirely by the ordinary media sample timing mechanisms, and there is no timing sent in clear text at all. That is described in section 6.3 of the same document.
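A small sketch of how a presentationTimeOffset would enter the picture, assuming the usual DASH convention that the offset is expressed in units of the timescale. `toPeriodTime` and the numbers are illustrative only.

```javascript
// Maps a TTML cue time on the track (media) timeline to a time relative to
// the start of the period. presentationTimeOffset is given in timescale units.
function toPeriodTime(cueTimeSeconds, presentationTimeOffset, timescale) {
    return cueTimeSeconds - presentationTimeOffset / timescale;
}

// Without an offset, the track starts at the start of the period.
console.log(toPeriodTime(100, 0, 1000));        // 100
// With an offset worth 40 s, a cue at media time 100 s lands 60 s into the period.
console.log(toPeriodTime(100, 3600000, 90000)); // 60
```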

The real authority on this matter is Dave Singer at Apple, so we may ask him if the standard is not clear enough.

BR,

Tobbe

jmacfl commented 9 years ago

Thank you @TobbeMobiTV ! This does seem pretty clear to me as well. Let me circle back to the content provider.

Jeff

jmacfl commented 9 years ago

As an update to this thread: the content provider has accepted the spec above as the appropriate way to move forward, so I believe this can be closed. I will revisit this if, after the changes, there are any additional issues with the closed captioning. Thank you for the help.

dsparacio commented 9 years ago

@jmacfl Great, thanks for the follow-up.