irajs / CMAF

Common Media Application Format Specification

m38228/v2 timed metadata #78

Open irajs opened 8 years ago

irajs commented 8 years ago

For MLB Advanced Media, timed metadata is an important part of what we do (and, even more importantly, it is part of how we get paid).

So it is mission-critical to our streams. For the most part, this metadata is sparse, appearing at most once or twice per segment, and it is a minimal amount of data (on the order of tens of bytes). At most, we have a 24 KB image for our radio-only broadcasts.

We feel that CMAF has not sufficiently dealt with this subject; the only mention of it is in section 7.3.3, on CMAF headers.

We feel that the solution of carrying timed metadata as a separate media stream is not optimal; it requires yet another stream that has to be downloaded (increased player complexity, increased traffic over the CDN, additional complexity in the manifest, etc.).

We feel that there are two different solutions that would work (both in the audio stream): either an additional trak, or emsg.

While we understand the desire to keep audio/video and subtitle streams separate, we feel that the sparse nature of most timed metadata will not hinder this effort. We would find a restriction such as requiring a separate mdat for the metadata to be more than sufficient to preserve the existing constraints.

Even if that approach were taken, timed metadata must be introduced more formally than with a throwaway line. Either way, it should be given a trak, a sample entry description, and a reference in at least Annex G.

——

We're also concerned about what to do with sparse tracks that have no data for a given period (either subtitle or timed metadata, if the separate-track approach is taken). We feel that CMAF should adopt our suggestion of an "empty" sample with a duration for this.
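As a rough illustration of what we mean (helper names and the JSON payload are hypothetical; the normative box layout would be whatever MPEG File Systems defines), a fragment of a sparse metadata track would either carry the real samples or pad the gaps with "empty" samples whose durations cover the rest of the fragment, so the track timeline never has holes:

```python
# Sketch only: pads a sparse metadata track fragment with placeholder
# "empty" samples. Field names are illustrative, not normative ISOBMFF syntax.
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MetaSample:
    duration: int              # in track timescale units
    payload: Optional[bytes]   # None => "empty" sample, no bytes in mdat


def fragment_samples(events: List[Tuple[int, int, bytes]],
                     fragment_duration: int) -> List[MetaSample]:
    """events: (offset_in_fragment, duration, payload) tuples."""
    if not events:
        # No metadata this fragment: one empty sample spanning its whole duration.
        return [MetaSample(fragment_duration, None)]

    samples: List[MetaSample] = []
    cursor = 0
    for offset, duration, payload in sorted(events):
        if offset > cursor:
            samples.append(MetaSample(offset - cursor, None))  # pad the gap
        samples.append(MetaSample(duration, payload))
        cursor = offset + duration
    if cursor < fragment_duration:
        samples.append(MetaSample(fragment_duration - cursor, None))
    return samples


# Example: a 6 s fragment (timescale 1000) with one 500 ms event at 2 s.
print(fragment_samples([(2000, 500, b'{"event":"home_run"}')], 6000))
```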

ghost commented 8 years ago

The proposal to define "empty" CMAF Fragments, or "empty" Segments in particular delivery protocols, may be a good solution for metadata tracks, because metadata doesn't require continuous presentation like audio and video (or any presentation at all). This needs to be defined in MPEG File Systems first, and can then be referenced by CMAF. Testing should be done to determine the effect on deployed file parsers and decoders. Proposals for additional specification of 'emsg' boxes in CMAF Fragments are welcome. However, player, server, system, and application behavior, such as advertising systems, is beyond the scope of the CMAF media format. This issue is postponed pending MPEG File Systems action on empty Fragments, and proposals on additional specification of sparse metadata in event messages in 'emsg' boxes and manifests.

jpiesing commented 7 years ago

Since this is on the agenda for the f2f, is it possible to explain the current situation? The CD already includes the emsg box and provision for it to be included in audio tracks. The use cases I've seen people discuss for that solution seem no different from 'sparse metadata'. What's missing?

ghost commented 7 years ago

Currently, Event Messages are defined in DASH, and the same message can be packaged as an XML EventMessage in the MPD and, optionally, in ‘emsg’ boxes in Segments. ‘emsg’ boxes provide real-time notification for live streaming, while an EventStream of EventMessage elements in the MPD provides the complete event history when an MPD is fetched at the start of a presentation, and a complete event timeline for players randomly accessing buffered content (e.g. PVR playback).

DASH primarily defines the format of the messages, not their content or processing (except for MPD updating). Other organizations, such as SCTE, have defined scheme_id_uri values and payloads for useful message types, such as SCTE-35 ad insertion and segmentation descriptors.
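As a concrete illustration (the scheme URI below is the one SCTE 214 registers for binary SCTE-35 carriage; the payload bytes are a placeholder, not a real splice_info_section), DASH only standardizes the envelope, and the externally defined scheme tells the player how to interpret the payload:

```python
# Hypothetical event description: DASH defines the envelope (scheme, value,
# timing, opaque payload); SCTE defines what the payload means.
scte35_event = {
    "scheme_id_uri": "urn:scte:scte35:2013:bin",   # binary SCTE-35 (per SCTE 214)
    "value": "",                                    # sub-scheme selector, often empty
    "timescale": 90000,
    "presentation_time_delta": 135000,              # 1.5 s after the Segment start (emsg v0 semantics)
    "event_duration": 0xFFFFFFFF,                   # duration unknown
    "id": 1001,
    "message_data": b"<binary splice_info_section bytes>",  # placeholder payload
}
```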

The current ‘emsg’ box uses timing relative to DASH Segments, and there is a proposal to add a new version that references the CMAF Track timeline directly, making it independent of the packaging used for delivery (different numbers of Fragments per Segment, Chunks, etc.).
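A minimal sketch of the two timing models, assuming the ‘emsg’ field order from DASH (ISO/IEC 23009-1): version 0 expresses the event time as a delta from the containing Segment's earliest presentation time, while a version 1 box carries an absolute 64-bit presentation time on the track's media timeline, so the box no longer depends on how the Track is cut into Segments, Fragments, or Chunks:

```python
import struct


def _cstr(s: str) -> bytes:
    return s.encode("utf-8") + b"\x00"


def _box(box_type: bytes, payload: bytes) -> bytes:
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def emsg_v0(scheme, value, timescale, presentation_time_delta,
            event_duration, event_id, message_data):
    # version 0: time relative to the Segment's earliest presentation time
    body = struct.pack(">I", 0)              # version = 0, flags = 0
    body += _cstr(scheme) + _cstr(value)
    body += struct.pack(">IIII", timescale, presentation_time_delta,
                        event_duration, event_id)
    return _box(b"emsg", body + message_data)


def emsg_v1(scheme, value, timescale, presentation_time,
            event_duration, event_id, message_data):
    # version 1: absolute presentation time on the media timeline,
    # independent of the Segment/Fragment/Chunk packaging
    body = struct.pack(">I", 1 << 24)        # version = 1, flags = 0
    body += struct.pack(">IQ", timescale, presentation_time)
    body += struct.pack(">II", event_duration, event_id)
    body += _cstr(scheme) + _cstr(value)
    return _box(b"emsg", body + message_data)


# Same (hypothetical) event, two packagings: v0 needs the Segment start to be
# known when reading the box; v1 does not.
box0 = emsg_v0("urn:example:metadata", "", 90000, 135000, 0, 42, b"payload")
box1 = emsg_v1("urn:example:metadata", "", 90000, 123_456_789, 0, 42, b"payload")
```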

There is also discussion of the existing ISOBMFF timed metadata track tools for sparse use cases. It might be possible to define new “empty” movie fragments and samples that would fill in sparse metadata tracks, providing an alternative to ‘emsg’. Timed metadata tracks are already a good solution for continuous metadata, e.g. GPS or speed information sampled once a second.
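A rough sketch of that continuous case (the JSON payload encoding is made up for illustration; a real track would use a registered metadata sample entry): the track simply carries one fixed-duration sample per reading, which is exactly what the existing sample tables and track runs are designed for:

```python
import json

TIMESCALE = 1000          # track timescale: milliseconds
SAMPLE_DURATION = 1000    # one metadata sample per second


def gps_samples(fixes):
    """fixes: (lat, lon, speed_mps) readings taken once per second.
    Returns the sample durations and payloads as they would land in trun/mdat."""
    durations = [SAMPLE_DURATION] * len(fixes)
    payloads = [json.dumps({"lat": lat, "lon": lon, "speed": spd}).encode()
                for lat, lon, spd in fixes]
    return durations, payloads


durations, payloads = gps_samples([(47.64, -122.13, 12.5), (47.64, -122.13, 12.7)])
```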

Kilroy Hughes | Senior Digital Media Architect | Windows Azure Media Services | Microsoft Corporation | http://www.windowsazure.com/media
