w3c / media-capabilities

Media Capabilities API
https://w3c.github.io/media-capabilities/

Define the meaning for CBR and VBR more precisely. #79

Open PauKerr opened 6 years ago

PauKerr commented 6 years ago

I have proposed tightening up the description around VBR, stating more clearly that the bit rate attribute should represent the maximum throughput necessary for the prospective stream.

chcunningham commented 6 years ago

Hey Paul, sorry for the long wait. I agree that the original language is a little confusing. Some questions about the PR:

From PR

In the case of a video stream encoded at a constant bit rate (CBR) this shall represent the average bitrate of the video track.

Should we use the word "constant" rather than "average" here? This isn't my expertise, so if the community at large is used to referring to CBR as an "average", then let's go that route. But for me it's weird to describe an average if the samples are all the same (constant).

From PR

For the case of variable bit rate (VBR) encoding, this value shall allocate any necessary buffering and throughput capability to the maximum bitrate of the stream.

I think your language clarifies that we mean "maximum", but I'd like to revisit whether that is what we should be asking for. I raised this point earlier but it got lost in the thread.

My comment from Issue 9

What do you think about relaxing the advice here to something like: "should therefore query capabilities based on the bitrate that best characterizes their stream."

I realize that's vague - can definitely be improved. My point is that I could see other content providers being less concerned with how it performs at peak vs how it performs overall. For streams containing some bitrate spikes it might be undesirable to say "not smooth" if in reality we could smoothly play back 99% of the stream. Still, for folks that want perfectly flawless, smooth playback, using maximum is probably better, so I aim to find some language that gives devs that flexibility.

PauKerr commented 6 years ago

Chris, I do agree that the provider, in many cases, will have a good idea of the characterization of the stream to be decoded and displayed. But, to help them choose the correct value for the bitrate attribute in the query, they need to have a clear idea of what that value will mean to the user agent that will respond to the query. Then the content provider can make the trade-off based on this understanding.

So perhaps a maximum over a bounded period measured in seconds would work. For example, 30 seconds?
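One way to read "a maximum over a bounded period" is as the peak of a sliding-window average. The sketch below is only an illustration of that reading: it assumes the provider knows each segment's encoded size and duration, and the `Segment` shape and the `peakWindowedBitrate` name are hypothetical, not part of the PR or the spec.

```typescript
// Hypothetical description of one media segment (not a spec type).
interface Segment {
  bytes: number;       // encoded size of the segment
  durationSec: number; // playback duration of the segment
}

// Returns the highest average bitrate (bits per second) over any run of
// consecutive segments that spans at least `windowSec` seconds.
// If the whole stream is shorter than the window, the overall average
// is returned; an empty stream yields 0.
function peakWindowedBitrate(segments: Segment[], windowSec: number): number {
  let peak = 0;
  let end = 0;  // index one past the last segment in the window
  let bits = 0; // total bits currently inside the window
  let dur = 0;  // total seconds currently inside the window

  for (let start = 0; start < segments.length; start++) {
    // Grow the window until it covers at least windowSec seconds.
    while (end < segments.length && dur < windowSec) {
      bits += segments[end].bytes * 8;
      dur += segments[end].durationSec;
      end++;
    }
    if (dur < windowSec) {
      break; // not enough media left to fill another window
    }
    peak = Math.max(peak, bits / dur);

    // Slide the window forward by dropping its first segment.
    bits -= segments[start].bytes * 8;
    dur -= segments[start].durationSec;
  }

  // Stream shorter than one window: fall back to the overall average.
  if (peak === 0 && dur > 0) {
    return bits / dur;
  }
  return peak;
}
```

With a 30 second window, a provider would pass `windowSec = 30` and use the result as the `bitrate` value in the query.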

chcunningham commented 6 years ago

I can give the Chrome perspective.

First, the 1000 ft view of how we've implemented things so far. At present, the smooth/power efficient answers come from a small local DB containing aggregate stats (just numbers, no URLs) on the performance of past playbacks. We index the stats based on the observed stream properties (resolution, codec, framerate), which allows us to do a lookup when the media capabilities queries are made.
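To make the lookup idea concrete, here is a minimal sketch of a stats store keyed by stream properties. It is not Chrome's code: the `PlaybackStatsStore` class, the key format, and the `maxDropRatio` threshold are all assumptions made up for illustration.

```typescript
// Hypothetical key: the observed stream properties used for indexing.
interface StreamKey {
  codec: string;
  width: number;
  height: number;
  framerate: number;
}

// Aggregate counters only (just numbers, no URLs).
interface PlaybackStats {
  framesDecoded: number;
  framesDropped: number;
}

class PlaybackStatsStore {
  private stats: Record<string, PlaybackStats> = {};

  // Flatten the stream properties into a single lookup key.
  private static keyOf(k: StreamKey): string {
    return [k.codec, k.width + "x" + k.height, String(k.framerate)].join("|");
  }

  // Fold the counters from one finished playback into the aggregate.
  record(key: StreamKey, decoded: number, dropped: number): void {
    const id = PlaybackStatsStore.keyOf(key);
    if (!Object.prototype.hasOwnProperty.call(this.stats, id)) {
      this.stats[id] = { framesDecoded: 0, framesDropped: 0 };
    }
    this.stats[id].framesDecoded += decoded;
    this.stats[id].framesDropped += dropped;
  }

  // Answers "was this kind of stream smooth in the past?".
  // Returns undefined when there is no history for the key.
  // maxDropRatio is an illustrative threshold, not a real Chrome value.
  isSmooth(key: StreamKey, maxDropRatio: number): boolean | undefined {
    const id = PlaybackStatsStore.keyOf(key);
    if (!Object.prototype.hasOwnProperty.call(this.stats, id)) {
      return undefined;
    }
    const s = this.stats[id];
    if (s.framesDecoded === 0) {
      return undefined;
    }
    return s.framesDropped / s.framesDecoded <= maxDropRatio;
  }
}
```

Adding bitrate would presumably mean adding one more field to the key, which is why its bucketing matters for how queries are answered.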

I haven't implemented bitrate yet (tracked here), but my plan is similar to how we determine framerate. Like bitrate, framerate isn't usually in the container metadata, and it changes over the playback. We keep a moving average of the durations of the last 8 frames. We then round that average into a coarse bucket (e.g. 23.3333 -> 24 fps). We check in on this value every couple of seconds. As long as the bucketed framerate holds steady, we use it to continue accumulating stats for the playback (e.g. dropped frames). If it changes to a new bucket, we save off the stats accumulated so far and start fresh for the new framerate value.
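The framerate scheme described above can be sketched roughly as follows. This is an illustration, not the Chrome implementation: the bucket list, class names, and the nearest-bucket rounding rule are assumptions; only the 8-frame window and the "save and start fresh on bucket change" behavior come from the description.

```typescript
const WINDOW = 8; // number of recent frame durations averaged

// Hypothetical coarse buckets; the actual bucket boundaries are not
// stated in the thread.
const FPS_BUCKETS = [24, 25, 30, 50, 60];

// Round a measured framerate to the nearest bucket.
function bucketFramerate(fps: number): number {
  let best = FPS_BUCKETS[0];
  for (const b of FPS_BUCKETS) {
    if (Math.abs(b - fps) < Math.abs(best - fps)) {
      best = b;
    }
  }
  return best;
}

// Keeps a moving average over the last WINDOW frame durations.
class FramerateTracker {
  private durations: number[] = [];

  addFrameDuration(seconds: number): void {
    this.durations.push(seconds);
    if (this.durations.length > WINDOW) {
      this.durations.shift(); // drop the oldest sample
    }
  }

  // Bucketed framerate, or undefined before any frame was seen.
  bucketedFps(): number | undefined {
    if (this.durations.length === 0) {
      return undefined;
    }
    let total = 0;
    for (const d of this.durations) {
      total += d;
    }
    const average = total / this.durations.length;
    return bucketFramerate(1 / average);
  }
}

// Accumulates stats while the bucket holds steady; on a bucket change
// it saves what was accumulated and starts fresh.
class PlaybackRecorder {
  private currentBucket: number | undefined = undefined;
  private dropped = 0;
  readonly saved: Array<{ fps: number; dropped: number }> = [];

  countDroppedFrame(): void {
    this.dropped++;
  }

  // Intended to be called every couple of seconds with the latest bucket.
  checkIn(bucket: number): void {
    if (this.currentBucket !== undefined && bucket !== this.currentBucket) {
      this.saved.push({ fps: this.currentBucket, dropped: this.dropped });
      this.dropped = 0;
    }
    this.currentBucket = bucket;
  }
}
```

A bitrate tracker could follow the same shape, averaging bytes per second instead of frame durations and rounding into bitrate buckets.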

So, for bitrate, we will keep some moving average, round it to a rough bucket, and check in every couple of seconds to see if it's significantly changed.

For API callers, this means total flexibility. Here are some examples.

Preface: Assume the caller intends to use the result to establish a cap for quality adaptation. Say you have a stream with average bitrate A, but maximum bitrate M. Assume past performance for this machine has been smooth at A, but choppy at M.

Example 1: Let's say the stream being considered spends a significant amount of time (e.g. 20%) at bitrate M. It would be bad for 20% of the playback to be potentially choppy. Use M to query the API, learn that it will not be smooth, and set some lower bound for your adaptation limit.

Example 2: Let's say the stream is 99% described by bitrate A, with a 1% outlier of bitrate M. Here you're probably better off querying media capabilities with A. Because A is smooth, the app may allow adaptation up to this higher level, accepting that 1% of the playback may stutter.
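The two examples boil down to a policy choice on the caller's side, which might look something like the sketch below. The `BitrateProfile` shape, the `tolerableFraction` knob, and the helper names are hypothetical, and the content type, resolution, and framerate are placeholder values; only the overall `{ type, video: { ... } }` configuration shape follows the Media Capabilities API.

```typescript
// Hypothetical summary of a stream, as known by the content provider.
interface BitrateProfile {
  averageBps: number;      // bitrate A
  maxBps: number;          // bitrate M
  fractionNearMax: number; // share of playback time spent near M (0..1)
}

// Query with M when the stream spends more time near its peak than the
// app is willing to risk being choppy; otherwise query with A.
function pickQueryBitrate(profile: BitrateProfile, tolerableFraction: number): number {
  return profile.fractionNearMax > tolerableFraction
    ? profile.maxBps
    : profile.averageBps;
}

// Builds a decoding configuration for the chosen bitrate.
// contentType, width, height, and framerate are placeholder values.
function buildDecodingConfig(bitrate: number) {
  return {
    type: "media-source",
    video: {
      contentType: 'video/webm; codecs="vp09.00.10.08"',
      width: 1920,
      height: 1080,
      bitrate: bitrate,
      framerate: 30,
    },
  };
}
```

In a browser, the resulting object would be passed to `navigator.mediaCapabilities.decodingInfo()` and the app would read `smooth` and `powerEfficient` from the result. An app that wants flawless playback can set `tolerableFraction` to 0 so that any time spent near M causes it to query with M.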

Of course, different apps might have different priorities ;)

chcunningham commented 5 years ago

@PauKerr @mwatson2 does the above work for you?

mwatson2 commented 4 years ago

IIUC, what you're saying is that the capabilities response describes the expected device behavior (smooth, powerEfficient) for a stream (or part of a stream) with constant bitrate at the rate specified. This makes sense to me.