cta-wave / device-playback-task-force


testing playback at non-integer speeds >1.0 #71

Open jpiesing opened 4 years ago

jpiesing commented 4 years ago

Playback at non-integer speeds >1.0 is important in low latency situations for catching up with the live edge. The DPC spec currently has 8.16 "Playback Other than Real Time" which 1) has no test defined and 2) is a single-track requirement and not a requirement for WAVE presentations.

8.16 mentions 3 mechanisms for non-realtime play, as follows:

There are three ways to do this:
• Play at X times as long as it is in the decoder profile/level constraint (app forwards all content to the device and device plays with X times).
• Use special content (e.g., I-frame only content, to address any of these issues – and use the above).
• Download key frames from regular content and feed those into the MSE source buffer.

We should add a test for at least the first of these. In the case of catch-up to live, a typical playback speed might be in the range 1.1 to 1.5. At these speeds, audio should still be present (perhaps with pitch correction). If we add only one test, it should be for CMAF Presentations. IMHO adding single-track tests for this would be a lower priority.

jpiesing commented 4 years ago

Duplicate of #79

haudiobe commented 4 years ago

keep this one as high priority for >1. Specific values: 1.1, 2, (4, 8). (make sure that the content is within the profile level constraints for at least 2). Typically this is doable.

Observation:

This is for a single track/media profile

needs implementation.

go for < 1 in #49 and #79 merge

Make sure that we add the max value to the content spec (we may add this content model).

dsilhavy commented 3 years ago

dash.js currently limits playback rate adjustment to +/- 50%, resulting in values between [0.5,1.5]. See also here: https://github.com/Dash-Industry-Forum/dash.js/wiki/Low-Latency-streaming Section "Calculating the new playback rate"
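To make the +/- 50% limit concrete, here is a minimal sketch of clamping a proposed catch-up rate to [0.5, 1.5]. It is not dash.js code; the function name `clampCatchupRate` and the parameter `maxDeviation` are hypothetical and only illustrate the range described above.

```typescript
// Hypothetical helper, not part of dash.js: clamps a proposed playback rate
// so that it deviates from 1.0 by at most maxDeviation (0.5 gives [0.5, 1.5]).
function clampCatchupRate(proposedRate: number, maxDeviation: number = 0.5): number {
  const lower = 1 - maxDeviation;
  const upper = 1 + maxDeviation;
  return Math.min(upper, Math.max(lower, proposedRate));
}
```

With this limit, the rates 1.1 and 1.5 would be reachable by such a player, while 2, 4 and 8 would only be exercised by a test that sets `playbackRate` directly.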

andyburras commented 3 years ago

What is the expected device behaviour for audio in a Presentation test? Sometimes devices seem to mute the audio when playing back at <> 1.0

jpiesing commented 3 years ago

> What is the expected device behaviour for audio in a Presentation test? Sometimes devices seem to mute the audio when playing back at <> 1.0

Some would blank it. Some might adjust audio pitch - see https://html.spec.whatwg.org/multipage/media.html#dom-media-preservespitch-dev
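As a rough sketch of how a test page could request pitch correction when changing speed: `playbackRate` and `preservesPitch` are media element attributes from the HTML specification linked above, while the `RateControllable` interface and the `applyRate` helper are hypothetical names used here so the example does not need a DOM. Support for `preservesPitch` varies by device, so the sketch only sets it where the attribute is present.

```typescript
// Hypothetical minimal view of an HTMLMediaElement for this sketch.
interface RateControllable {
  playbackRate: number;
  preservesPitch?: boolean;
}

// Sets the playback rate and, where the attribute exists, asks the user agent
// to preserve pitch. Returns true if pitch preservation was requested.
function applyRate(media: RateControllable, rate: number): boolean {
  const supported = "preservesPitch" in media;
  if (supported) {
    media.preservesPitch = true;
  }
  media.playbackRate = rate;
  return supported;
}
```

On a real page, the `<video>` element itself could be passed as `media`. Whether the device then mutes, blanks or pitch-corrects the audio is exactly the open question raised here.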

gitwjr commented 1 year ago

Defer to future. This does not appear to be covered in the spec or by any current tests. It is in DPCAT 22.

jpiesing commented 2 months ago

proposal-playback-non-integer-speeds.docx

Here is proposed text for an additional test for speeds other than 1.0. This is an edited version of sequential track playback with the additions highlighted. I'm not sure what to do about audio here. It's unclear if the audio PNR extraction would work at speeds other than 1.0. If not then it may be better to remove audio and limit this to video. We should also consider "Playback of WAVE presentations at speeds other than 1.0" even if audio PNR extraction does not work at speeds other than 1.0. Probably no observations are possible except for "does not crash" and "audio continues to be audible without silence or worse noise".

jpiesing commented 2 months ago

proposal-playback-non-integer-rates-2024-04-18.docx

Here is a cleaner proposal with more consistent terminology and a better explanation.

jpiesing commented 2 months ago

Included in Draft-CTA-5003-Ae-v2.05.

haudiobe commented 2 months ago

The determination of the exact duration and also the playback time is complex. In the updated v2.07 some suggestions are added to comments, but we need to check further.

haudiobe commented 5 days ago

I added some proposed updates to v2.09 along the following lines

General

If the above algorithm is carried out, the following observations are expected:

    1)       Once the playback is started observe

    a.       the `currentTime` of the media element as `currentTime[0]`.

    b.       The media time stamp of the displayed sample as `mediaTime[0]`.

    2)       Then, for every rate step `i = 1, …, N` when changing the playback rate after `rate_step/rates[i-1]`,

    a.       Observe the `currentTime` of the media element as `currentTime[i]`. The difference of `currentTime[i] – currentTime[i-1]` shall match the value of `rate_step` with some tolerance to be defined for each media type.

    b.       Observe the media timestamp of the displayed sample as `mediaTime[i]`. The difference of `mediaTime[i] – mediaTime[i-1]` shall match the value of `rate_step` with some tolerance to be defined for each media type.

    3)       If samples with an assigned media time `T` are displayed, they shall be displayed relative to the first displayed time `mediaTime[i]` at the display time `(T - mediaTime[i])/rates[i]` with some tolerance to be defined for each media type.

    4)       The end of the playback shall be announced.

    5)       Each time the `playbackRate` property is changed, a `ratechange` event is sent.
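The general observations 2) and 3) can be sketched as follows. This is one reading of the proposed text, not the test implementation: the `StepObservation` type, the function names and the single `tolerance` parameter are assumptions for illustration, and the actual tolerances are still to be defined per media type.

```typescript
// Hypothetical record of what is observed at each rate change (times in seconds).
interface StepObservation {
  currentTime: number; // currentTime[i] of the media element
  mediaTime: number;   // mediaTime[i] of the displayed sample
}

// Wall-clock time spent at each rate: rate_step / rates[i].
function wallClockDurations(rates: number[], rateStep: number): number[] {
  return rates.map((rate) => rateStep / rate);
}

// Observation 2: consecutive differences of currentTime and mediaTime
// shall each match rate_step within the tolerance. Returns failure messages.
function checkRateSteps(obs: StepObservation[], rateStep: number, tolerance: number): string[] {
  const failures: string[] = [];
  for (let i = 1; i < obs.length; i++) {
    const dCurrent = obs[i].currentTime - obs[i - 1].currentTime;
    const dMedia = obs[i].mediaTime - obs[i - 1].mediaTime;
    if (Math.abs(dCurrent - rateStep) > tolerance) {
      failures.push(`step ${i}: currentTime advanced by ${dCurrent}`);
    }
    if (Math.abs(dMedia - rateStep) > tolerance) {
      failures.push(`step ${i}: mediaTime advanced by ${dMedia}`);
    }
  }
  return failures;
}

// Observation 3: expected display time of a sample with media time T,
// relative to the first displayed sample of step i.
function expectedDisplayOffset(T: number, mediaTimeI: number, rateI: number): number {
  return (T - mediaTimeI) / rateI;
}
```

For example, with `rates = [1, 1.5, 2]` and `rate_step = 3`, the wall-clock durations are 3 s, 2 s and 1.5 s, while media time is expected to advance by 3 s in each step.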

1.1.1.2 Video

If the track is a video track, then the following additional observations are expected:

1)      Every video frame S[k,s] shall be rendered such that it fills the entire video output window following the properties in clause 5.2.2.

2)       The presented sample shall match the one reported by the currentTime value within the tolerance of +/- (1/framerate + 150ms).

3)      Every video frame S[k,s] shall be rendered and the video frames shall be rendered in increasing presentation time order.

4)       Video start-up delay: The start-up delay should be sufficiently low. As user agents may pre-load the first frame, the time to first frame is not relevant, but what is relevant is that once hitting play, the second frame is rendered within the considered start-up delay. In addition, there may be missing frames at start up. Hence, TR [k, x] – Ti < TSMax where x is the first detected frame change after the play() event.

5)       The general observations apply with any tolerance being at most `frame_presented_tolerance`

6)       If rates[i] <= 1, every frame shall be displayed.
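Video observations 2) and 3) lend themselves to a small check. The following is a sketch under assumptions: times are in seconds, the function names are hypothetical, and the tolerance is taken literally from observation 2) as `1/framerate + 150ms`.

```typescript
// Fixed part of the tolerance in observation 2, in seconds.
const FIXED_TOLERANCE_S = 0.15;

// Tolerance of +/- (1/framerate + 150ms), expressed in seconds.
function presentedSampleTolerance(frameRate: number): number {
  return 1 / frameRate + FIXED_TOLERANCE_S;
}

// Observation 2: the presented sample's media time matches the reported
// currentTime within the tolerance.
function matchesCurrentTime(presentedMediaTime: number, currentTime: number, frameRate: number): boolean {
  return Math.abs(presentedMediaTime - currentTime) <= presentedSampleTolerance(frameRate);
}

// Observation 3 (ordering part): rendered frames appear in strictly
// increasing presentation time order.
function isStrictlyIncreasing(presentationTimes: number[]): boolean {
  for (let i = 1; i < presentationTimes.length; i++) {
    if (presentationTimes[i] <= presentationTimes[i - 1]) {
      return false;
    }
  }
  return true;
}
```

At 50 fps the tolerance works out to 0.02 s + 0.15 s = 0.17 s, so a frame with media time 10.0 s presented while `currentTime` reports 10.16 s would pass, and one at 10.2 s would not.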

1.1.1.3 Audio

If the track is an audio track, then the general observations as well as the additional observations as per clause 8.2.5.3 are expected.

No specific requirements on audio display apply; for example, at playback rates other than 1.0, audio may be muted.

haudiobe commented 4 days ago

discussed in DPCTF 2024/07/03

good as a first baseline, please check, last call for comments. We are not expecting to have everything absolutely correct, but would like to start testing the feature and learn what implementations can support. Likely to be refined later.