w3c / media-source

Media Source Extensions
https://w3c.github.io/media-source/

Allow non-ISO/IEC14496-12 top-level boxes in ISOBMFF Byte Streams #174

Open davemevans opened 7 years ago

davemevans commented 7 years ago

The Segment Parser Loop states the following:

"If the input buffer contains bytes that violate the SourceBuffer byte stream format specification, then run the append error algorithm and abort this algorithm."

The ISO BMFF Byte Stream Format states: "Valid top-level boxes defined in ISO/IEC 14496-12 other than ftyp, moov, styp, moof, and mdat are allowed to appear between the end of an initialization segment or media segment and before the beginning of a new media segment. These boxes MUST be accepted and ignored by the user agent and are not considered part of the media segment in this specification."

This appears to imply only valid top-level boxes defined in 14496-12 are allowed to appear in a stream, and that strictly compliant implementations should reject input buffers containing any other top-level boxes.
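To make the strict reading concrete, here is a minimal Python sketch of a top-level box walker that flags any box type outside an allowlist. It is not taken from any user agent: the names `KNOWN_14496_12`, `top_level_boxes`, `unknown_top_level_types` and `make_box` are hypothetical, and the allowlist is only an illustrative subset of the top-level boxes defined in ISO/IEC 14496-12.

```python
import struct

# Assumption: illustrative, non-exhaustive subset of the top-level box
# types defined in ISO/IEC 14496-12.
KNOWN_14496_12 = {b"ftyp", b"styp", b"moov", b"moof", b"mdat",
                  b"sidx", b"free", b"skip"}


def top_level_boxes(data):
    """Yield (box_type, offset, size) for each complete top-level box."""
    offset = 0
    while offset + 8 <= len(data):
        # Box header: 32-bit big-endian size followed by a 4-byte type.
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit largesize follows the type field.
            if offset + 16 > len(data):
                return
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            return  # malformed or incomplete box: stop walking
        yield box_type, offset, size
        offset += size


def unknown_top_level_types(data):
    """Return the top-level box types that are not in the allowlist."""
    return [t for t, _, _ in top_level_boxes(data) if t not in KNOWN_14496_12]


def make_box(box_type, payload=b""):
    """Build a box with a 32-bit size field (helper for the example)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A media segment with an emsg box placed before the moof, as in DASH.
segment = (make_box(b"styp", b"msdh") + make_box(b"emsg", b"\x00" * 4)
           + make_box(b"moof") + make_box(b"mdat", b"xx"))
print(unknown_top_level_types(segment))
```

Under the strict reading, a user agent behaving like this sketch would run the append error algorithm as soon as it reached the emsg box.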

MPEG-DASH (ISO/IEC 23009-1:2014) specifies a new box (emsg) as a mechanism for signalling both generic in-band metadata related to the media and DASH-specific operations. It further constrains this new box to be placed before the moof - i.e. at the top level.

It seems that the byte stream format specification needs either to be less strict in general in terms of extensions, or include references other than 14496-12 in the list.

Note: at least one implementation has already included emsg in its list of valid top-level boxes [1].

See also:

  1. https://bugs.chromium.org/p/chromium/issues/detail?id=276303
  2. https://bugzilla.mozilla.org/show_bug.cgi?id=1322587

jyavenard commented 7 years ago

The issue with not defining a set list is that it becomes impossible to distinguish rubbish content from genuinely acceptable content, especially when appending partial boxes needs to be supported.
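A rough Python sketch of this concern, assuming a parser that only sees the first bytes of an append. The function `classify_partial` and the `ALLOWED` set are hypothetical names for illustration, and the set is not the full list from the spec. Without an allowlist, arbitrary bytes still look like a plausible box header with a large size, so the parser can only wait for more data. With one, it can reject as soon as the 8-byte header is available.

```python
import struct

# Assumption: illustrative allowlist, not the complete set from the spec.
ALLOWED = {b"ftyp", b"styp", b"moov", b"moof", b"mdat",
           b"sidx", b"free", b"skip"}


def classify_partial(buffer, allowed=None):
    """Classify the first box in buffer.

    Returns 'need_more_data', 'complete', or 'reject'.
    If allowed is None, any 4-byte type is accepted.
    """
    if len(buffer) < 8:
        return "need_more_data"
    size, box_type = struct.unpack_from(">I4s", buffer, 0)
    if allowed is not None and box_type not in allowed:
        return "reject"
    header = 8
    if size == 1:
        # 64-bit largesize follows the type field.
        if len(buffer) < 16:
            return "need_more_data"
        (size,) = struct.unpack_from(">Q", buffer, 8)
        header = 16
    elif size == 0:
        # Box runs to the end of the stream, so its end is not yet known.
        return "need_more_data"
    if size < header:
        return "reject"
    return "complete" if len(buffer) >= size else "need_more_data"


# Arbitrary bytes that happen to parse as a huge box of type 'junk'.
garbage = b"\x7f\xff\xff\xffjunk" + b"\x00" * 32
print(classify_partial(garbage))           # no allowlist: keeps waiting
print(classify_partial(garbage, ALLOWED))  # allowlist: rejected at once
```

The trade-off is the one raised in this issue: the same allowlist that lets the parser fail fast on rubbish also rejects legitimate boxes such as emsg unless they are added to it.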

wolenetz commented 3 years ago

Discussing this for at least emsg in V2 seems good to me.

chrisn commented 3 years ago

The DataCue API proposes exposing emsg data to web applications (requirements document, explainer). If this issue is strictly about parsing, perhaps we can track exposing to web applications in https://github.com/w3c/media-source/issues/189?

davemevans commented 3 years ago

This issue is really about which boxes are permissible in the ISO BMFF byte stream, using one particular example seen in the wild, rather than about parsing those boxes and exposing their content.