WICG / datacue

A TextTrackCue based interface for arbitrary timed metadata, synchronized with audio or video media playback
https://wicg.github.io/datacue/

DASH inband event processing using MSE data model #30

Open irajs opened 3 years ago

irajs commented 3 years ago

This document provides an extended W3C Media Source Extension (MSE) model for the processing of DASH inband events.

DASH-EventProcessing4W3C.docx

chrisn commented 3 years ago

Thank you!

chrisn commented 3 years ago

Thanks again @irajs for sharing this document. It's really helpful. Would it be OK for us to edit this to define the proposed browser processing steps for event messages?

I have a few specific comments and questions:

1 Process@append rule

For on_receive dispatch mode, what should happen if the player seeks to a time before the event start time and then plays through the event start time? Should the event be dispatched again when the playback position reaches the event start time?

For on_start dispatch mode, should this event be dispatched immediately, since event_start <= playback_position <= event_end ?

2 Dispatch buffer timing model

The dispatch buffer described in the document seems equivalent to the HTML text track model, which maintains a list of cues.

Is the dispatch buffer timing model affected by the MSE 'segments' and 'sequence' append mode?

Broadly, we need to define how events are sourced (parsed) from the media container and placed onto the media timeline. From there, the HTML "time marches on" algorithm defines how the events are dispatched to the web application.

3 Implementation (Figure 2)

I think that the HTML "time marches on" steps will correctly handle dispatching start/end events to the web application at the right time, without needing to divide the event into subranges. This also relates to section 4.2 step 3(e)(ii).

3 Implementation (Figure 3)

As far as I know, the MSE mechanism for overwriting a segment is to call SourceBuffer.remove(startTime, endTime), followed by SourceBuffer.appendBuffer(bufferSource).

Figure 3 shows that event E2 is unchanged following overwriting part of the segment. Are there any cases where the events would be removed from the dispatch buffer?

4.1 Initialization

To support specifying the dispatch mode, we would need to add subscribe(type, dispatchMode) and unsubscribe(type) methods somewhere. But note that some browsers may not support the on_receive dispatch mode (see this comment).

Step 2, Event buffer initialization, maps directly to TextTrack initialization.

4.2 Append

The "already-dispatched" table would be a new browser feature. We need to figure out how / where to add this.

How long would the "already-dispatched" table look back? This is related to the equivalency lifetime question in https://github.com/WICG/datacue/issues/28

4.3 Dispatch

As already mentioned, I think that dispatch is handled already by the time marches on steps. Dispatch in the context of the TextTrack APIs would mean firing TextTrackCue 'enter' and 'exit' events, and TextTrack 'cuechange' events to the web page.
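To make the mapping onto the text track model concrete, here is a rough sketch of how cue 'enter' and 'exit' transitions could be derived between two playback positions. It is a simplified model of the "time marches on" idea for forward playback only (no seeking), not the HTML algorithm itself; the `Cue` shape and the `cueTransitions` name are made up for illustration.

```typescript
// Hypothetical, simplified cue shape (times in seconds on the media timeline).
interface Cue {
  id: string;
  startTime: number;
  endTime: number;
}

interface CueTransitions {
  enter: string[];
  exit: string[];
}

// A cue is treated as active when startTime <= t < endTime.
function isActive(cue: Cue, t: number): boolean {
  return cue.startTime <= t && t < cue.endTime;
}

// Compare the previous and current playback positions and report which cues
// became active ('enter') or stopped being active ('exit').
// Assumes playback moved forward normally from `previous` to `current`.
function cueTransitions(cues: Cue[], previous: number, current: number): CueTransitions {
  const enter: string[] = [];
  const exit: string[] = [];
  for (const cue of cues) {
    const was = isActive(cue, previous);
    const now = isActive(cue, current);
    // A short cue that started and ended between the two positions is
    // "missed": it should still get both an enter and an exit.
    const missed = !was && !now && previous <= cue.startTime && cue.endTime <= current;
    if ((!was && now) || missed) enter.push(cue.id);
    if ((was && !now) || missed) exit.push(cue.id);
  }
  return { enter, exit };
}
```

With this model an event never needs to be split into subranges: a single cue with a start and end time is enough to produce both notifications.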

4.4 Purge

Should this be the website’s responsibility, or do we expect the browser to do this? Purging is implemented by calling SourceBuffer.remove(start, end) and the website could remove / update the corresponding cues itself.

irajs commented 3 years ago

Thanks again @irajs for sharing this document. It's really helpful. Would it be OK for us to edit this to define the proposed browser processing steps for event messages?

Yes. Of course. This is just the basic timing model. It has to be translated to the browser processing model.

I have a few specific comments and questions:

1 Process@append rule

For on_receive dispatch mode, what should happen if the player seeks to a time before the event start time and then plays through the event start time? Should the event be dispatched again when the playback position reaches the event start time?

It depends:

  1. If the seek is to a point that exists in the buffer, no. The event has already been dispatched (at the time of parsing the segment).
  2. If the seek is to a point outside the buffer, the segment is fetched again. If the event id does not exist in the table (that is, the event was not dispatched before), the event will be dispatched.

For on_start dispatch mode, should this event be dispatched immediately, since event_start <= playback_position <= event_end ?

Yes, if it has not been dispatched before.
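The two answers above can be sketched as a single decision made when a segment is parsed at append time. This is only an illustration of the rule as described in this thread; the `InbandEvent` shape, the `AlreadyDispatched` class and the function name are assumptions, not part of the document or of any browser API.

```typescript
type DispatchMode = "on_receive" | "on_start";

// Hypothetical event record; `id` stands for whatever key identifies
// equivalent events (for example scheme, value and id combined).
interface InbandEvent {
  id: string;
  start: number; // seconds on the media timeline
  end: number;
  mode: DispatchMode;
}

// Minimal stand-in for the "already-dispatched" table.
class AlreadyDispatched {
  private readonly ids = new Set<string>();
  has(id: string): boolean {
    return this.ids.has(id);
  }
  add(id: string): void {
    this.ids.add(id);
  }
}

// Called when a segment carrying `ev` is parsed. Returns true if the event
// should be dispatched right away, and records it in the table if so.
function shouldDispatchAtAppend(
  ev: InbandEvent,
  playbackPosition: number,
  table: AlreadyDispatched
): boolean {
  if (table.has(ev.id)) return false; // dispatched before: never repeat
  const active = ev.start <= playbackPosition && playbackPosition <= ev.end;
  if (ev.mode === "on_receive" || active) {
    table.add(ev.id);
    return true;
  }
  return false; // on_start event that has not started yet: dispatch later
}
```

Seeking back inside the buffer does not re-parse the segment, so nothing is dispatched again. Seeking outside the buffer re-fetches the segment, and the table lookup decides whether the event is treated as new.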

2 Dispatch buffer timing model

The dispatch buffer described in the document seems equivalent to the HTML text track model, which maintains a list of cues.

Is the dispatch buffer timing model affected by the MSE 'segments' and 'sequence' append mode?

No. Segments mode and sequence mode differ only in whether the Time Stamp Offset (TSO) is adjusted explicitly or implicitly. For both v0 and v1 events, the position of an event is relative to the position of the segment in the buffer after applying the TSO. So the impact of the TSO is already considered in the process.
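As a rough illustration of why the append mode does not matter, the event start time can be computed from the emsg timing fields plus whatever TSO is in effect for the segment. The sketch assumes the usual emsg fields (`presentation_time_delta` for v0, `presentation_time` for v1, both in `timescale` units) and assumes the same offset is applied to events as to the media samples; the type and function names are made up.

```typescript
// Hypothetical subset of the emsg box fields needed for timing.
interface EmsgTiming {
  version: 0 | 1;
  timescale: number;
  presentationTimeDelta?: number; // v0: relative to the segment's earliest presentation time
  presentationTime?: number;      // v1: absolute time in the container timeline
}

// Returns the event start time in seconds on the MSE media timeline.
// `segmentEarliestPresentationTime` is in container time (seconds), before
// the offset; `timestampOffset` is the TSO in effect when the segment is
// appended, whether it was set explicitly or derived in 'sequence' mode.
function eventStartOnTimeline(
  emsg: EmsgTiming,
  segmentEarliestPresentationTime: number,
  timestampOffset: number
): number {
  const containerTime =
    emsg.version === 0
      ? segmentEarliestPresentationTime + (emsg.presentationTimeDelta ?? 0) / emsg.timescale
      : (emsg.presentationTime ?? 0) / emsg.timescale;
  return containerTime + timestampOffset;
}
```

Either way the event ends up at the same position relative to its segment in the buffer, which is the point made above.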

Broadly, we need to define how events are sourced (parsed) from the media container and placed onto the media timeline. From there, the HTML "time marches on" algorithm defines how the events are dispatched to the web application.

3 Implementation (Figure 2)

I think that the HTML "time marches on" steps will correctly handle dispatching start/end events to the web application at the right time, without needing to divide the event into subranges. This also relates to section 4.2 step 3(e)(ii).

That's great. It simplifies the process.

3 Implementation (Figure 3)

As far as I know, the MSE mechanism for overwriting a segment is to call SourceBuffer.remove(startTime, endTime), followed by SourceBuffer.appendBuffer(bufferSource).

Figure 3 shows that event E2 is unchanged following overwriting part of the segment. Are there any cases where the events would be removed from the dispatch buffer?

Yes, if the entire segment is overwritten. This is mentioned in the purge process.

4.1 Initialization

To support specifying the dispatch mode, we would need to add subscribe(type, dispatchMode) and unsubscribe(type) methods somewhere. But note that some browsers may not support the on_receive dispatch mode (see this comment).

Then I suppose such a browser treats all event schemes as schemes with the on_start dispatch mode and that would be a limiting factor for interoperability. For instance, SCTE 35 events are already designed for on_receive mode and they won't work properly with browsers not supporting this mode.
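For discussion, the subscribe/unsubscribe idea and the fallback described here could look roughly like the sketch below. No such API exists today: `EventSubscriptions`, its methods and the `supportsOnReceive` flag are all hypothetical, and the fallback to on_start is just the behaviour supposed in this comment.

```typescript
type DispatchMode = "on_receive" | "on_start";

// Hypothetical registry of subscribed event schemes and their dispatch modes.
class EventSubscriptions {
  private readonly modes = new Map<string, DispatchMode>();
  private readonly supportsOnReceive: boolean;

  constructor(supportsOnReceive: boolean) {
    this.supportsOnReceive = supportsOnReceive;
  }

  // Registers interest in an event scheme and returns the mode actually used.
  // An implementation without on_receive support falls back to on_start.
  subscribe(type: string, dispatchMode: DispatchMode): DispatchMode {
    const effective: DispatchMode =
      dispatchMode === "on_receive" && !this.supportsOnReceive ? "on_start" : dispatchMode;
    this.modes.set(type, effective);
    return effective;
  }

  unsubscribe(type: string): void {
    this.modes.delete(type);
  }

  // Returns undefined for schemes the application has not subscribed to.
  modeFor(type: string): DispatchMode | undefined {
    return this.modes.get(type);
  }
}
```

Returning the effective mode from `subscribe` would at least let an application detect that its on_receive request was downgraded, which matters for schemes designed around on_receive.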

Step 2, Event buffer initialization, maps directly to TextTrack initialization.

4.2 Append

The "already-dispatched" table would be a new browser feature. We need to figure out how / where to add this.

How long would the "already-dispatched" table look back? This is related to the equivalency lifetime question in #28

I think it should be left to implementation as I mentioned in #28.
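One possible implementation choice, purely as an example of what "left to implementation" might mean, is a table that forgets entries once the event ended more than some look-back window before the current playback position. The class name, the window parameter and the pruning policy are assumptions and are not proposed by the document.

```typescript
// Hypothetical "already-dispatched" table with a bounded look-back window.
class BoundedDispatchTable {
  // Maps event id to the event's end time (seconds on the media timeline).
  private readonly seen = new Map<string, number>();
  private readonly lookBackSeconds: number;

  constructor(lookBackSeconds: number) {
    this.lookBackSeconds = lookBackSeconds;
  }

  record(id: string, eventEnd: number): void {
    this.seen.set(id, eventEnd);
  }

  has(id: string): boolean {
    return this.seen.has(id);
  }

  // Drops entries whose event ended before (playbackPosition - lookBackSeconds).
  prune(playbackPosition: number): void {
    const cutoff = playbackPosition - this.lookBackSeconds;
    this.seen.forEach((eventEnd, id) => {
      if (eventEnd < cutoff) this.seen.delete(id);
    });
  }
}
```

A shorter window saves memory but means an event re-appended after a long seek back could be dispatched a second time.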

4.3 Dispatch

As already mentioned, I think that dispatch is handled already by the time marches on steps. Dispatch in the context of the TextTrack APIs would mean firing TextTrackCue 'enter' and 'exit' events, and TextTrack 'cuechange' events to the web page.

Great!

4.4 Purge

Should this be the website’s responsibility, or do we expect the browser to do this? Purging is implemented by calling SourceBuffer.remove(start, end) and the website could remove / update the corresponding cues itself.

A process similar to media buffer purging can be used for purging the event dispatch buffer. But I think it is the browser's responsibility. At least, I think this is the case in MSE for the media buffer, but I might be wrong.
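Whoever ends up responsible, the purge rule itself is simple to state in code. The sketch below assumes each buffered event remembers the time range of the segment that carried it, and that an event is dropped only when that whole segment falls inside the removed range (matching the earlier answer about Figure 3). The names are illustrative only.

```typescript
// Hypothetical record of an event in the dispatch buffer, tagged with the
// time range (in seconds) of the media segment that carried it.
interface BufferedEvent {
  id: string;
  segmentStart: number;
  segmentEnd: number;
}

// Returns the events that survive removing [removeStart, removeEnd] from the
// media buffer. An event is purged only if its carrying segment is entirely
// covered by the removed range; partial overlap leaves the event in place.
function purgeEvents(
  events: BufferedEvent[],
  removeStart: number,
  removeEnd: number
): BufferedEvent[] {
  return events.filter(
    (ev) => !(removeStart <= ev.segmentStart && ev.segmentEnd <= removeEnd)
  );
}
```

The same function could be run by the browser as part of SourceBuffer.remove(start, end), or by the website on its own list of cues.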

chrisn commented 3 years ago

I have uploaded a markdown version of the document here.