Open chrisn opened 5 years ago
On yesterday's DataCue API call (minutes), we discussed that the API should use a single metadata TextTrack for all cue schemes, rather than one TextTrack per cue scheme. The scheme (type) of each cue would then be exposed through the DataCue.type attribute.
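To make the single-track model concrete, here is a minimal sketch of how an application might route cues from one metadata track to per-scheme handlers, keyed on the cue's type. The names DataCueLike, CueHandler and CueDispatcher are hypothetical and used only for illustration; the cue shape is a simplified stand-in for the proposed DataCue, not its actual definition.

```typescript
// Simplified stand-in for the proposed DataCue (assumption: only these fields matter here).
interface DataCueLike {
  type: string; // cue scheme identifier; the exact format is still under discussion
  startTime: number;
  endTime: number;
  value: unknown;
}

type CueHandler = (cue: DataCueLike) => void;

// Hypothetical helper: routes cues from a single metadata track by cue.type.
class CueDispatcher {
  // Prototype-free object so scheme strings never collide with built-in keys.
  private handlers: Record<string, CueHandler[]> = Object.create(null);

  // Register a handler for one cue scheme.
  on(type: string, handler: CueHandler): void {
    const list = this.handlers[type] || [];
    list.push(handler);
    this.handlers[type] = list;
  }

  // Invoke every handler registered for the cue's scheme.
  // Returns the number of handlers that were called.
  dispatch(cue: DataCueLike): number {
    const list = this.handlers[cue.type] || [];
    for (const handler of list) {
      handler(cue);
    }
    return list.length;
  }
}
```

In a browser, dispatch() would presumably be called from the metadata track's cuechange handler, once per entry in activeCues.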
In addition, we discussed possibly deprecating the TextTrack.inBandMetadataTrackDispatchType attribute. The stated purpose of this attribute isn't something that's used in practice:
"This is a string extracted from the media resource specifically for in-band metadata tracks to enable such tracks to be dispatched to different scripts in the document."
Note on existing implementations: In WebKit, inBandMetadataTrackDispatchType always returns the same value, "com.apple.streaming". In HbbTV (which uses the TextTrack per cue scheme model), inBandMetadataTrackDispatchType contains the MPEG-DASH scheme_id_uri and value values.
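For comparison, the per-scheme track model could be consumed roughly as follows. This is a sketch under one loudly stated assumption: that the dispatch type string is the scheme URI, a single space, then the value. The text above only says the attribute "contains" both, so the exact separator should be checked against the HbbTV specification. TrackLike, DashEventScheme, parseDispatchType and findMetadataTracks are hypothetical names.

```typescript
// Structural subset of TextTrack; only the attributes this sketch reads.
interface TrackLike {
  kind: string;
  inBandMetadataTrackDispatchType: string;
}

interface DashEventScheme {
  schemeIdUri: string;
  value: string;
}

// ASSUMPTION: format is "<schemeIdUri> <value>", split at the first space.
// A string with no space is treated as a scheme with an empty value.
function parseDispatchType(dispatchType: string): DashEventScheme {
  const i = dispatchType.indexOf(" ");
  return i === -1
    ? { schemeIdUri: dispatchType, value: "" }
    : { schemeIdUri: dispatchType.slice(0, i), value: dispatchType.slice(i + 1) };
}

// Select the metadata tracks whose dispatch type carries the given scheme URI.
function findMetadataTracks<T extends TrackLike>(tracks: ArrayLike<T>, schemeIdUri: string): T[] {
  const matches: T[] = [];
  for (let i = 0; i < tracks.length; i++) {
    const track = tracks[i];
    if (
      track.kind === "metadata" &&
      parseDispatchType(track.inBandMetadataTrackDispatchType).schemeIdUri === schemeIdUri
    ) {
      matches.push(track);
    }
  }
  return matches;
}
```

Because findMetadataTracks accepts any array-like collection, a media element's textTracks list could be passed to it directly. Note that with the WebKit behaviour described above, every in-band metadata track would parse to the same scheme, so this kind of filtering would not distinguish between them.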
Following #11, a question arises as to how a web application should identify the schema of timed metadata cues. This is needed to allow the application to subscribe to receive events related to cues of a particular schema.
There are two parts of the HTML spec related to this:
TextTrack ids
For in-band tracks, the TextTrack's id is described as:
The Media Fragments URI 1.0 (basic) spec does not describe how track URIs are constructed; this is deferred to the draft Protocol for Media Fragments 1.0 Resolution in HTTP (referred to as "Media Fragments URI 1.0 (advanced)"), where track media fragment URIs are only mentioned in informative text in the context of RTSP.
So, one option for identifying in-band timed metadata tracks could be to use the TextTrack.id field, and define a suitable media fragment URI format. A <track> element where the id field is not set by the page author seems strange, though.
inBandMetadataTrackDispatchType
TextTrack objects also have an inBandMetadataTrackDispatchType attribute:
Examples of how this value is set are given in HTML for different media formats (Ogg, WebM, MPEG-2, MPEG-4). This could be extended for in-band cues such as DASH emsg events.
The stated purpose of this field seems to achieve what we want, i.e., to provide a way to dispatch metadata tracks to application code, and to be able to identify the cue schema. (I note that the "different scripts" terminology used here doesn't seem quite right, though.)
Terminology
A note on terminology:
Much of the above description of TextTrack.id and inBandMetadataTrackDispatchType could apply to UA-generated cues, not only in-band cues, for those browsers that feature native DASH or HLS players.
Questions
Do these two mechanisms achieve the same goal? Or if not, how do they differ?
What browser support currently exists for both, for in-band timed metadata tracks?
Should we use TextTrack.id with a suitable media fragment URI to identify in-band timed metadata cues? For example, the URI could say "give me the DASH emsg events of a given scheme_id and value in this media stream", or "give me the ID3 cues from this audio stream".
Or is inBandMetadataTrackDispatchType preferred? If so, what should the format of this string be, for metadata cue formats not currently supported in HTML?
I suspect I'm going down a path already visited by a previous group. Pointers to relevant discussions there are welcome!
Proposal (TBD, input needed)
We want to allow web applications to signal to the UA the timed metadata cue schemes that they want to receive.
If we do this using the <track> element, we should add an inBandMetadataTrackDispatchType attribute to this element to allow selection of the appropriate cues in the media content.
If we do this by providing APIs for application script to use, we should either:
extend HTMLMediaElement.addTextTrack to allow the application to set the track's inBandMetadataTrackDispatchType, or
add a method to HTMLMediaElement to allow creation of timed metadata cue tracks, e.g., addTimedMetadataTrack(), with inBandMetadataTrackDispatchType as a parameter.
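To illustrate the second scripting option, the sketch below models what an addTimedMetadataTrack() call might look like from the application's side. Nothing here exists in any browser: MediaElementSketch and MetadataTrackSketch are hypothetical stand-ins, and the parameter order and optional label are assumptions made purely for illustration.

```typescript
// Hypothetical track shape: a metadata track tagged with the scheme it should receive.
interface MetadataTrackSketch {
  kind: "metadata";
  label: string;
  inBandMetadataTrackDispatchType: string;
}

// Hypothetical stand-in for HTMLMediaElement, modelling only the proposed method.
class MediaElementSketch {
  readonly textTracks: MetadataTrackSketch[] = [];

  // Proposed (not standardized): create a timed metadata track and signal to the UA
  // which cue scheme the application wants to receive on it.
  addTimedMetadataTrack(dispatchType: string, label: string = ""): MetadataTrackSketch {
    const track: MetadataTrackSketch = {
      kind: "metadata",
      label: label,
      inBandMetadataTrackDispatchType: dispatchType,
    };
    this.textTracks.push(track);
    return track;
  }
}
```

The first option would look similar from the caller's perspective, except that the dispatch type would be passed as an additional argument to the existing addTextTrack() method rather than to a new one.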