New Proposal: Patterns in Segment Timeline

haudiobe commented 1 year ago

Kyle from AWS presented this just now in the IOP WG call.

New topics
- Kyle presents: IOP23011.PatternTemplateManifest_DASH-IF_Reintroduction_2023.pptx
  - We will add this to the DASH-IF Live TF on Friday, March 3, 2023

We will discuss this in more detail in the Live TF on Friday.

haudiobe commented 1 year ago

AHG 2023/03/03

Different options discussed:
1. Add a SegmentTimeline extension that entirely changes the addressing scheme. Can be used with $Time$. Needs MPEG involvement. Has player impact.
2. Use $Number$ for audio and provide auxiliary information on the exact segment duration using a new element. This new element may be combined with @duration or with SegmentTimeline. This can be done quickly in DASH-IF and would not even have a player impact.
3. Do nothing but document the best practices.

Other options? Please comment

koceskik commented 1 year ago

From my own notes of the discussion, with listing of some pros/cons

Backwards Compatibility Concerns
- Option: Use a new element (under SegmentTemplate) to be backwards-compatible: fallback to simple-addressing mode if client is not knowledgeable
  - "PatternSegmentTimeline"?
  - Requires adding SegmentTemplate@duration (and SegmentTemplate@duration shall be ignored for NewSegmentTimeline)
- NOTE: Each DASH revision isn't strictly backwards compatible today; for example, 5th edition added Referencing ContentProtections which clearly aren't backwards compatible.
Is there an alternative:
- Rely on simple-addressing mode instead, especially if you're already using $Number$ URL addressing?
  - Pros: existing standard, smaller than with any kind of timeline
  - Cons:
    - segment position inaccuracy: requires the DASH client to either
      1. re-request the correct segment (incurring time-to-start/seek increases),
         - May require the DASH client to parse the segment for actual positioning (independent of the demuxer)
         - Consider, for example, MSE-based players where the DASH client doesn't strictly need to demux segments itself
      2. adjust playback position (potentially infeasible depending on the dependency to start on a video I-frame boundary),
         - NOTE: DASH allows deviation up to 50% of average segment duration, although configured Audio deviation is generally smaller (ex: aac: +5.3ms, +10.6ms, +15.9ms, 0ms; ec3: +16ms, 0)
         - Practically, I have seen the duration of the first Audio segment of a Period fluctuate by [-763ms, +816ms] ([-38.1%, +40.8%]) compared to average segment duration (ie there is a practical concern of deviation up to the allowable threshold)
    - No ability to reference segment gaps
      - NOTE: DASH-IF restricted timing model explicitly forbids this anyways...but it can be a practical concern for the less-restrictive MPEG-DASH standard.
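The deviation figures quoted in the list above can be checked against the 50% allowance with a few lines of arithmetic. This is only a sketch: the 2000 ms average segment duration is an assumption inferred from the quoted percentages, and both function names are made up for illustration.

```python
def deviation_percent(deviation_ms: float, average_duration_ms: float) -> float:
    """Express a duration deviation as a percentage of the average segment duration."""
    return 100.0 * deviation_ms / average_duration_ms


def within_allowance(deviation_ms: float, average_duration_ms: float,
                     allowance_percent: float = 50.0) -> bool:
    """True if the deviation stays inside the +/- allowance (50% per the note above)."""
    return abs(deviation_percent(deviation_ms, average_duration_ms)) <= allowance_percent


# Assumed average segment duration of 2000 ms (not stated explicitly in the thread).
AVERAGE_MS = 2000
for observed_ms in (-763, 816):
    print(observed_ms, within_allowance(observed_ms, AVERAGE_MS))
```

Both observed values pass the check, which is the point being made: such deviations are legal, yet large enough to matter for seeking.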

I think at a high level, if a content producer is using explicit addressing mode, there are performance advantages to providing a mechanism for reducing the manifest size (primarily on the initial request, as patch syntax allows for reduced update sizes).
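To illustrate the manifest-size concern, here is a rough sketch of how an explicit SegmentTimeline grows when durations follow a repeating pattern. It run-length encodes durations into `<S>` elements using the existing `S@r` repeat count; the function name and the sample durations (audio segments of whole 1024-tick frames at an assumed timescale of 48000) are illustrative assumptions.

```python
def run_length_s_elements(durations, start=0):
    """Encode segment durations as <S> elements, merging equal neighbours via @r."""
    elements = []
    t = start
    i = 0
    while i < len(durations):
        d = durations[i]
        run = 1
        while i + run < len(durations) and durations[i + run] == d:
            run += 1
        # Only the first entry carries @t; later entries continue contiguously.
        attrs = f't="{t}" d="{d}"' if not elements else f'd="{d}"'
        if run > 1:
            attrs += f' r="{run - 1}"'
        elements.append(f"<S {attrs}/>")
        t += d * run
        i += run
    return elements


pattern = [95232, 96256, 96256, 96256]      # assumed audio pattern, about 8 s
constant = run_length_s_elements([96000] * 8)
patterned = run_length_s_elements(pattern * 2)
print(len(constant), len(patterned))
```

With constant durations a single `<S>` covers everything, while the patterned case needs two `<S>` elements per pattern repetition, so the timeline grows linearly with the time-shift window.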

haudiobe commented 1 year ago

AHG Call 2023/03/24

Proposal

We can add this as a new element, but keep the old element. This should be backward-compatible. Minimize difference to the proposal
Then we implement this in DASH.js and test it
If this is beneficial (measure to be defined, such as parsing time, size of data), then we create new profile and bring it to MPEG.

@koceskik @haudiobe check on this.

koceskik commented 1 year ago

I'll write up a revised proposal for a more backwards-compatible PatternTimeline syntax (mainly the same as indicated, but simply not embedding in the existing SegmentTimeline).

Of note, I was reviewing documentation and implementations. In particular, one case for having an explicit addressing mode (ie an explicit timeline) is when an Ad Insertion MPD Manipulator proxies directly from an IF-2 or IF-3 input. (see https://dashif.org/docs/CR-Ad-Insertion-r7.pdf section 8.1.3 Architectures for guidance on this scenario)

In particular, there exist cases where ad insertion cannot fill an ad break, and must return to main content, or otherwise adjust the Period@start and prescribed segments in a Period. With simple addressing mode, adjusting Period@start is limited by the DASH-IF timing constraints: https://dashif-documents.azurewebsites.net/Guidelines-TimingModel/master/Guidelines-TimingModel.html#addressing-simple-startpoint

In particular:

The rest of this chapter assumes that the nominal timing of media segments matches the real timing. If you cannot satisfy this constraint but still wish to move the period start point, convert to explicit addressing. See § 18.4.3 Converting simple addressing to explicit addressing.

As such, in an ad-insertion architecture, simple-addressing mode either:

- requires the encoder to always enforce a constant duration equal to the nominal duration, or
- necessitates that an Ad Insertion MPD Manipulator download media segments to accurately adjust Period@start

With a PatternTimeline, even if an end DASH Client does not consume the PatternTimeline (and instead consumes the simple-addressing based on SegmentTemplate@duration), an ad insertion MPD manipulator proxy could rely on PatternTimeline to adjust Periods accurately.

bbert commented 1 year ago

I was also thinking about the dynamic ad-insertion scenario, for which I think it will be very tricky to manage the matching between the segment numbers and the PatternTimeline.

Also, consider the case you currently deliver manifests with SegmentTimeline, primarily to get around issues introduced by simple-addressing mode, especially the ones Kyle listed in previous comment. If you want to reduce the manifest size using this PatternTimeline, you will have to rely on numbers for legacy players, and then introduce issues (seeking, gaps, dynamic ad-insertion, splicing) for these players you would not encounter when using SegmentTimeline.

In short, you would resolve the manifest size issue for up to date players compatible with the PatternTimeline, while introducing some other issues for legacy players who will have to use the numbers.

Accordingly, if one wants to reduce the size of manifests using SegmentTimeline, there may be no alternative to introducing this new Pattern syntax under the existing SegmentTimeline and breaking backward compatibility with legacy players.

koceskik commented 1 year ago

Notes from meeting today:

Potential options:

PatternTimeline as sibling for accurate timings

<PatternTimeline>
  <Pattern t="0" r="1">
    <P d="95232"/>
    <P d="96256" r="2"/>
  </Pattern>
  <S t="768000" d="44307"/>
</PatternTimeline>
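A sketch of how a client or MPD manipulator might expand such a timeline into explicit segments follows. The semantics are assumptions drawn from this discussion rather than from any published schema: `Pattern@r` repeats the whole pattern, `P@r` repeats a single entry, and the `Pattern` element encloses its `P` children. The `expand` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

PATTERN_TIMELINE = """
<PatternTimeline>
  <Pattern t="0" r="1">
    <P d="95232"/>
    <P d="96256" r="2"/>
  </Pattern>
  <S t="768000" d="44307"/>
</PatternTimeline>
"""


def expand(xml_text):
    """Return a list of (start, duration) tuples for every segment in the timeline."""
    root = ET.fromstring(xml_text.strip())
    segments = []
    t = 0
    for child in root:
        t = int(child.get("t", t))          # @t resets the running start time
        if child.tag == "Pattern":
            for _ in range(int(child.get("r", 0)) + 1):      # repeat whole pattern
                for p in child:
                    for _ in range(int(p.get("r", 0)) + 1):  # repeat single entry
                        d = int(p.get("d"))
                        segments.append((t, d))
                        t += d
        else:                                # plain <S> entry
            d = int(child.get("d"))
            for _ in range(int(child.get("r", 0)) + 1):
                segments.append((t, d))
                t += d
    return segments


for start, duration in expand(PATTERN_TIMELINE):
    print(start, duration)
```

Under these assumptions the two pattern repetitions cover exactly 768000 ticks, so the trailing `<S t="768000">` entry is contiguous with the pattern.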

In conjunction:

SegmentTimeline MAY be absent completely, in which case a legacy DASH client will fall back to simple-addressing mode via SegmentTemplate@duration using average segment durations
- Any MPD manipulator (such as an Ad Insertion MPD Manipulator), would be expected to understand PatternTimeline as a means of enforcing accurate Period@start manipulation
Loosen restriction on time-accuracy for SegmentTimeline repetition (with signaling of accuracy), and require $Number$ -based referencing if using an inaccurate timeline:
```
<SegmentTimeline accurate="false">
<S t="0" d="96000" r="7"/>
<S t="768000" d="44307"/>
</SegmentTimeline>
```
where "loosen" would mean being off by no more than a particular threshold (I'm not sure by how much, however).

For a DASH client which relies on time-accuracy, the PatternTimeline should be read and interpreted. For a DASH client which relies on SegmentTimeline today, but is not restricted by inaccuracies in the SegmentTimeline, the SegmentTimeline with $Number$ should be backwards-compatible.

koceskik commented 1 year ago

@bbert Is your assertion that using a PatternTimeline would necessitate $Number$ addressing? And in doing so, it may result in incompatibility with non-updating DASH clients?

bbert commented 1 year ago

@koceskik yes, my understanding of @haudiobe's initial proposal is that if you want to signal a PatternTimeline to reduce the manifest size, you would need $Number$ addressing to be compatible with legacy DASH clients. Thus you lose time accuracy for these legacy clients, and indeed it may result in incompatibility issues, especially with ad-insertion manifest manipulators.

Now thinking back on last proposal, with this example:

<SegmentTimeline accurate="false">
  <S t="178577070976" d="96000" r="7"/>
</SegmentTimeline>
<SegmentTimelineNew>
  <Pattern t="178577070976" r="1">
    <S d="95232"/>
    <S d="96256" r="2"/>
  </Pattern>
  <S t="178577838976" d="95232"/>
</SegmentTimelineNew>

and for which the segment starting for example at timestamp 178577550208 would be addressed at time 178577550976. Is that right?
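The arithmetic behind that question can be reproduced as follows. This is a sketch under one reading of the example: the inaccurate SegmentTimeline advertises `T0 + index * 96000`, while the accurate start comes from walking the pattern. The constant and function names are illustrative only.

```python
T0 = 178577070976                      # shared start time of both timelines
PATTERN = [95232, 96256, 96256, 96256] # accurate durations from the Pattern element
NOMINAL = 96000                        # duration advertised by the inaccurate timeline


def accurate_to_nominal(accurate_start):
    """Map an accurate segment start to the time the inaccurate timeline advertises."""
    t = T0
    index = 0
    while t < accurate_start:
        t += PATTERN[index % len(PATTERN)]
        index += 1
    if t != accurate_start:
        raise ValueError("not a segment boundary in the accurate timeline")
    return T0 + index * NOMINAL


print(accurate_to_nominal(178577550208))
```

Under that reading, the segment that really starts at 178577550208 is the sixth segment and is addressed as 178577550976, a difference of 768 ticks.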

Finally the result is the same except that, compared to previous solution, you can signal for example gaps in both timelines since you are not forced to address segments using $Number$ .

However, that solution still requires some tricky processing for clients that read and interpret the new pattern timeline, since you need to keep the segments from both timelines matched, especially for content replacement. That will be a source of errors and interoperability issues.

Questions:

From DASH spec (section 5.3.9.6): "If a Segment Index ('sidx') box is present, then the values of the SegmentTimeline shall describe accurate timing of each Media Segment, Specifically, these values shall reflect the information provided in the Segment index ('sidx') box,..." => how to maintain compliance with inaccurate SegmentTimeline ?
What about legacy DASH clients that may assume that timestamps from SegmentTimeline are accurate?
What about an up-to-date client that reads the SegmentTimelineNew of a manifest that has meanwhile been manipulated by a legacy manipulator that is not aware of this new element?

haudiobe commented 1 year ago

AHG 2023/04/14

@agiladi suggests to also indicate the tolerance - a concrete proposal would be welcome.

tobbee commented 1 year ago

@agiladi mentioned milliseconds, but the most natural tolerance would be to use the same units as the rest of the timestamps. Maybe his suggestion would be different, but I think the following example would make sense:

For 48kHz audio with timescale 48000, an AAC frame's duration is 1024 ticks. It is therefore possible to always achieve a duration of an audio segment that is within 1024 ticks from the average. A bigger variation would require a new <S> element. A possible syntax should look like:

  <SegmentTimeline tolerance="1024">

rather than

  <SegmentTimeline accurate="false">

ZmGorynych commented 1 year ago

[edited for @tobbee 's comment above] Instead of

    <SegmentTimeline accurate="false">
        <S t="178577070976" d="96000" r="7"/>
    </SegmentTimeline>

we can explicitly specify tolerance

    <SegmentTimeline tolerance="1024">
        <S t="178577070976" d="96000" r="7"/>
    </SegmentTimeline>

The semantics will be that for any segment the actual value in tfdt will be within 1024 clock ticks (in units of timescale) from the value calculated using S@t and S@d. A possibly better syntax may be introducing SegmentTemplate.tolerance in which case this applies to cases where we do not use SegmentTimeline at all (i.e., we rely on the @duration value)

Assuming audio frame duration of 1024, this lets us do a pattern where we add an extra audio frame to every Nth segment to maintain a/v segment alignment.
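A small sketch of that idea, assuming timescale 48000, a 1024-tick audio frame and a nominal 2 s (96000-tick) segment: each audio segment is cut at the last frame boundary not later than the nominal boundary, and the resulting start times are compared against the nominal ones. The function names are hypothetical, and the check is agnostic to whether "actual time" means tfdt or presentation time.

```python
FRAME = 1024       # AAC frame duration in ticks (timescale 48000 assumed)
NOMINAL = 96000    # nominal segment duration in ticks (2 s assumed)


def frame_aligned_durations(count):
    """Cut each segment at the last frame boundary not after its nominal end."""
    durations = []
    start = 0
    for i in range(1, count + 1):
        end = (i * NOMINAL // FRAME) * FRAME   # floor to a whole number of frames
        durations.append(end - start)
        start = end
    return durations


def max_start_deviation(durations):
    """Largest absolute gap between actual and nominal segment start times."""
    worst = 0
    start = 0
    for i, d in enumerate(durations):
        worst = max(worst, abs(start - i * NOMINAL))
        start += d
    return worst


durations = frame_aligned_durations(8)
print(durations)
print(max_start_deviation(durations) <= FRAME)
```

With these numbers the durations repeat as one 93-frame segment followed by three 94-frame segments, and no start time drifts by a full frame, which is what a tolerance of 1024 would express.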

ZmGorynych commented 1 year ago

@agiladi mentioned milliseconds, but the most natural tolerance would be to use the same units as the rest of the timestamps. Maybe his suggestion would be different, but I think the following example would make sense:

For 48kHz audio with timescale 48000, an AAC frame's duration is 1024 ticks. It is therefore possible to always achieve a duration of an audio segment that is within 1024 ticks from the average. A bigger variation would require a new <S> element. A possible syntax should look like:
  <SegmentTimeline tolerance="1024">
rather than
  <SegmentTimeline accurate="false">

Completely agree with the above, it should be in units of the same timescale as used in SegmentTemplate. Updated previous comment to account for this

haudiobe commented 1 year ago

It should not be tfdt, but presentation time.

koceskik commented 1 year ago

I'm of the opinion that adding @accurate or @tolerance to the SegmentTimeline and then asserting that the segments within aren't accurate reflections of the segment timings is actually not backwards-compatible. If a DASH client needs to understand these new attributes to behave correctly, it would need to be updated, and if the DASH client already handles simple addressing mode, why add a SegmentTimeline at all? If it's purely to add PatternSegmentTimeline for accurate timings (such as for a manifest manipulator), I still don't see the value in adding the complexity of a new syntax to the existing SegmentTimeline.

At which point, why not go with the original proposal of adding a Pattern element within the SegmentTimeline and explicitly not being backwards-compatible?

In general, I think it could make sense to include PatternSegmentTimeline as a new child element of SegmentTemplate, which would produce backwards-compatible manifests where simple addressing mode is used and supported by the DASH client. A service offering could, theoretically, support accurate manifest manipulations by updating a manifest manipulation service to support explicit pattern addressing, even if an on-device DASH client couldn't be updated (by falling back to simple-addressing mode since it would ignore the new PatternSegmentTimeline tag completely).

However, if explicit addressing mode is necessary, either because the DASH client can't handle the variability in fragment timings correctly, or the manifest manipulator (ex: for ads insertion) needs accurate timings for handling period bounds manipulation, you're going to need to update your DASH client/manifest manipulator anyways.

So, I'm unconvinced that the backwards-compatibility argument is a major point in making a decision, because service offerings that will rely on explicit addressing mode will need to update anyways.

At that point, does it make sense to add the complexity of a new tag (PatternSegmentTimeline) and the ruleset for deciding that 1) if PatternSegmentTimeline exists, it is an accurate patternized timeline, but SegmentTemplate@duration is still required for backwards-compatible support of cases where PatternSegmentTimeline isn't understood, 2) if PatternSegmentTimeline doesn't exist but SegmentTimeline does, then SegmentTimeline is an accurate non-patternized timeline, and 3) if neither exists, it's simple-addressing mode?

(it doesn't sound too complex in that simplified paragraph, but I think there's a few other cases to author explicit clarity on)
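For concreteness, the three-way ruleset could be sketched roughly as below. `PatternSegmentTimeline` is the hypothetical element name from this thread, the mode labels are made up, XML namespaces are ignored, and the `understands_pattern` flag models a legacy client that simply skips the unknown element.

```python
import xml.etree.ElementTree as ET


def addressing_mode(segment_template_xml, understands_pattern=True):
    """Pick an addressing mode for a SegmentTemplate under the assumed ruleset."""
    template = ET.fromstring(segment_template_xml.strip())
    has_duration = template.get("duration") is not None
    has_pattern = template.find("PatternSegmentTimeline") is not None
    if has_pattern and not has_duration:
        # Rule 1 requires @duration so that legacy clients still have a fallback.
        raise ValueError("PatternSegmentTimeline requires SegmentTemplate@duration")
    if has_pattern and understands_pattern:
        return "explicit-pattern"       # rule 1: accurate patternized timeline
    if template.find("SegmentTimeline") is not None:
        return "explicit"               # rule 2: accurate non-patternized timeline
    if has_duration:
        return "simple"                 # rule 3: simple addressing via @duration
    raise ValueError("no usable addressing information")


manifest = '<SegmentTemplate duration="96000"><PatternSegmentTimeline/></SegmentTemplate>'
print(addressing_mode(manifest))
print(addressing_mode(manifest, understands_pattern=False))
```

Even this small sketch leaves cases open, for example what to do when both timeline elements are present, which is the kind of clarity the ruleset would still need.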

haudiobe commented 1 year ago

Live TF 2023/05/05

@koceskik summarizes
@ZmGorynych tolerance would be orthogonal
Decision for next steps:
- create an MPEG input contribution to create an alternative segment addressing and timeline pattern based on the proposal above
- requires an update to the existing Amd.1 text (5th edition + Amd.1)
- prioritize the new segment timeline signaling with pattern. Once completed, we can test the impact for backward-compatibility
- invitation for additional potential requirements in case we do a new Segment Timeline signaling (e.g. gaps, segment sequences, etc.)
- we continue to discuss this in the Live TF
- @koceskik summarizes the solution @bbert @ZmGorynych @technogeek00 please help as needed.

haudiobe commented 1 year ago

@koceskik @bbert @ZmGorynych @technogeek00 any updates on the google docs that we can check tomorrow?

koceskik commented 1 year ago

I've created a doc here:

https://docs.google.com/document/d/1O4diz48Lr3LJozloy2MdLO79_Ylj6yjg5f-Jhi0UjPQ

haudiobe commented 1 year ago

f2f June

thanks @koceskik for providing the document - very good.

Decision:

we attempt
- to create a backward-compatible solution for $Number$ with pattern as auxiliary information for clients to be used for seek
- to make sure that it is clear in our DASH-IF guidelines and the MPEG standards that $Number$ with SegmentTimeline does NOT require accuracy
- to possibly add a backward-compatible signaling on the maximum tolerance of the MPD information and the media time
- we encourage people to review the document according to what is discussed
- we continue to discuss the doc in Live TF with the ambition to submit a proposal for MPEG#143 and to create a dash.js implementation that can make use of the proposal
- we continue to collect additional potential requirements and motivating use cases for a completely new addressing/timing system that would warrant breaking compatibility, for example also in the context of lower latency.

technogeek00 commented 1 year ago

Did a first pass read of the document, very nicely done @koceskik clearly articulated. I'll think on some descriptive aspects, should we provide comments directly back in the doc?

haudiobe commented 1 year ago

Did a first pass read of the document, very nicely done @koceskik clearly articulated. I'll think on some descriptive aspects, should we provide comments directly back in the doc?

Yes, please comment directly in the doc. Until next week, in order to complete submission to MPEG.

Dash-Industry-Forum / Live

New Proposal: Patterns in Segment Timeline #75