w3c / webmediaporting

Web Media porting spec

guidelines on integration of web media APIs with hardware video & audio decoders #30

Open jpiesing opened 4 years ago

jpiesing commented 4 years ago

A number of issues that seem to cause interoperability pain with web media apps relate to the integration of the web media APIs with hardware video & audio decoders.

We could address them in WAVE but addressing them somewhere else might be better. I think it's unrealistic to try & get the WHATWG HTML spec changed, but perhaps a W3C Note with some guidelines on the subject might be both useful to the industry and achievable.

Below is a list of issues I've encountered that might be suitable to cover in some W3C guidelines.

Where there are multiple different answers in real world media devices, I'd hope it's possible to at least document them & perhaps recommend one for new implementations.


"hardware decoder", "claim" and "released" are shorthand and not intended to be precise or restrictive. Some hardware platforms may have a single video decoder block that can decode more than one stream at one time under some circumstances.

jpiesing commented 4 years ago

@johnluther @JohnRiv @tidoust @chrisn @haudiobe

tidoust commented 4 years ago

Documenting usual behavior in real world media devices and listing recommendations for developers seems a good thing. If a W3C Note seems useful, the Media and Entertainment Interest Group could perhaps host the effort, provided IG participants support it and are willing to work on the document. Should I raise this within the IG right away or should we wait until WAVE gets a chance to discuss it?

For what it's worth, the IG charter already mentions the possible publication of "Primer or Best Practice documents to support web developers when designing applications", so no need to recharter the group for it to publish a note as suggested.

jpiesing commented 4 years ago

> Documenting usual behavior in real world media devices and listing recommendations for developers seems a good thing.

What I had in mind were recommendations aimed at organisations porting / integrating an existing HTML5 UA onto a media device's hardware + operating system - not at developers. I'm sure there are issues for developers too - e.g. feature-detecting how the integration has been done where this makes a difference.
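As a hedged illustration of that feature-detection gap, the surfaces available to apps today report whether a format is supported, not how the decoder integration behaves; the codec string below is only an example:

```ts
// Illustrative only: existing detection APIs say whether a format is supported,
// not how many decoders exist or how the UA integration was done.
function probeFormat(mime: string): { maybePlayable: string; mseSupported: boolean } {
  const maybePlayable = document.createElement('video').canPlayType(mime); // '', 'maybe' or 'probably'
  const mseSupported =
    'MediaSource' in window && MediaSource.isTypeSupported(mime);
  return { maybePlayable, mseSupported };
}

// e.g. probeFormat('video/mp4; codecs="avc1.640028"');
```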

> If a W3C Note seems useful, the Media and Entertainment Interest Group could perhaps host the effort, provided IG participants support it and are willing to work on the document. Should I raise this within the IG right away or should we wait until WAVE gets a chance to discuss it?

I'd like to have one round of discussion in WAVE first - at least that way I have the possibility of participating.

> For what it's worth, the IG charter already mentions the possible publication of "Primer or Best Practice documents to support web developers when designing applications", so no need to recharter the group for it to publish a note as suggested.

That could be seen as not including documents aimed at browser integrators :(

chrisn commented 4 years ago

I agree this would be useful, and welcome. I wouldn't rule out proposing clarifications to HTML though, and perhaps creating a guidelines document is a good step towards that.

> That could be seen as not including documents aimed at browser integrators :(

This is under "Other non-normative documents" as a "such as", so it's not an exclusive list. I agree with @tidoust that this could be done under the current charter.

It's possible someone at my organisation would want to contribute to these guidelines.

tidoust commented 4 years ago

Ah, I had misunderstood the intended audience, indeed, and so my reference is misleading at best... I don't think the shift makes much difference though. The scope section does mention clients and devices (and thus by extension those who implement these clients, directly or indirectly).

jpiesing commented 4 years ago

Another possible subject for the list is the usage of hardware media decoders by the Web Audio API and how that interacts with media APIs.
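For context, a hedged sketch of the two places where the Web Audio API touches audio decoding; the URL and element lookup are placeholders:

```ts
// Sketch of Web Audio's two decoding paths.
const ctx = new AudioContext();

// 1. decodeAudioData() decodes a whole resource into PCM in memory; whether a
//    hardware decoder is used for this is entirely up to the implementation.
async function loadClip(url: string): Promise<AudioBuffer> {
  const response = await fetch(url);
  return ctx.decodeAudioData(await response.arrayBuffer());
}

// 2. Routing a media element through the graph leaves the media pipeline (and
//    any decoder it has claimed) in charge of decoding; Web Audio only
//    post-processes the decoded output.
const mediaElement = document.querySelector('video')!;
const source = ctx.createMediaElementSource(mediaElement);
source.connect(ctx.destination);
```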

jpiesing commented 4 years ago

I've just updated the problem description to expand concerns around the load method and add concerns about seeking.

sandersdan commented 4 years ago

Here is how I see these questions from the point of view of Chrome's media stack (primarily the video element) and a future that includes WebCodecs.

Current situation:

Resource availability:

Specific questions from above:

chrisn commented 4 years ago

Many thanks for sharing this, @sandersdan.

jpiesing commented 4 years ago

> Hardware codecs are not our only constrained resource. CPU memory, CPU cycles, DMA bandwidth, GPU memory, and network bandwidth are all impacted.
>
> Preloading: we are able to decode at least two videos concurrently on all of our browser platforms.

Media devices may only be able to decode one video at the same time or may only be able to decode two with limitations. Many media devices would not have the memory capacity or bandwidth to decode two UHD streams. In some silicon, only one decoder is connected to the decryption block.
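As a hedged aside, Media Capabilities can probe whether a single UHD stream is likely to decode smoothly and power-efficiently, but it does not express concurrent-stream limits or which decoder is wired to the decryption block; the codec string and parameters below are examples only:

```ts
// Illustrative probe for a single UHD stream; says nothing about concurrency.
async function canDecodeUhd(): Promise<boolean> {
  const info = await navigator.mediaCapabilities.decodingInfo({
    type: 'media-source',
    video: {
      contentType: 'video/mp4; codecs="hvc1.1.6.L153.B0"', // example codec string
      width: 3840,
      height: 2160,
      bitrate: 20_000_000,
      framerate: 60,
    },
  });
  return info.supported && info.smooth && info.powerEfficient;
}
```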

sandersdan commented 4 years ago

> Media devices may only be able to decode one video at the same time or may only be able to decode two with limitations.

Indeed. I specified "browser platforms" above primarily because Chromecast is different.

jpiesing commented 4 years ago

> > Media devices may only be able to decode one video at the same time or may only be able to decode two with limitations.
>
> Indeed. I specified "browser platforms" above primarily because Chromecast is different.

I wasn't sure of your definition of "browser platforms". Thanks for the clarification and for the original details.

chrisn commented 4 years ago

@sandersdan Do you think some of these issues could be addressed via clarifications in the HTML spec, or are there cross-vendor implementation differences which would make specifying expected behaviour not practical for these resource contention scenarios?

Some issues seem more straightforward to clarify in the spec, e.g., definition of currentTime.
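To illustrate the currentTime ambiguity, a sketch (assuming a browser that implements requestVideoFrameCallback()) comparing the element's reported playback position with the media time of the frame actually presented:

```ts
// The playback position reported by currentTime may differ from the media time
// of the frame on screen, depending on how the integration defines it.
const videoEl = document.querySelector('video')!;

if ('requestVideoFrameCallback' in videoEl) {
  videoEl.requestVideoFrameCallback((_now, metadata) => {
    console.log(
      'currentTime:', videoEl.currentTime,
      'presented frame mediaTime:', metadata.mediaTime,
    );
  });
}
```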

I'd be interested to explore further the idea of adding explicit mechanisms.

sandersdan commented 4 years ago

The spec as written assumes infinite resources, and web pages do not typically implement any recovery mechanisms to handle cases where that turns out not to be true. Our workarounds have to be careful to maintain the illusion (basically always eventually emit the expected events) or there will be breakage.

Note that we are not always successful at maintaining the illusion. For example, we can't seek to the previous keyframe in a realtime stream, and therefore can't use suspend/resume transparently in that case.

We change our workarounds somewhat routinely, but there may be a core set of principles we can document. (Approximately 'always make forward progress eventually, fake events if you have to'.)

Comparatively recent changes like MediaSession (media controls in notifications) and the play() promise have helped, in that new players are more flexible when the browser asserts control over video elements. In the past a majority of player implementations would call play() in a loop if we didn't enter a playing state, but that's not a problem we have with modern players.
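For illustration, the pattern modern players follow, sketched here rather than taken from any particular player: call play() once and react to the returned promise instead of retrying in a loop.

```ts
// Call play() once; react to the promise rather than looping.
async function tryPlay(video: HTMLVideoElement): Promise<void> {
  try {
    await video.play();
  } catch (e) {
    // NotAllowedError: autoplay blocked, wait for a user gesture.
    // AbortError: the UA interrupted playback (e.g. a new load() or resource
    // contention); wait for a later event rather than calling play() again.
    console.warn('play() rejected:', (e as DOMException).name);
  }
}

// MediaSession (where supported) keeps the page consistent when the browser or
// system UI drives playback.
function wireMediaSession(video: HTMLVideoElement): void {
  if ('mediaSession' in navigator) {
    navigator.mediaSession.setActionHandler('play', () => { void tryPlay(video); });
    navigator.mediaSession.setActionHandler('pause', () => video.pause());
  }
}
```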

I am hopeful that with the right mechanisms now, there will come a future time where we can drop the illusion and just fail playback when resources are constrained. When using software decode or the Page Lifecycle API we are nearly there now.
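As a hedged sketch of that direction, a page using the Page Lifecycle API (currently Chromium-only) could save and restore playback state itself rather than relying on the UA to maintain the illusion; the recovery logic below is illustrative:

```ts
// Save playback state on 'freeze' and restore it on 'resume', so the page
// copes if the UA reclaims the decoder while the page is frozen.
const player = document.querySelector('video')!;
let frozenAt = 0;

document.addEventListener('freeze', () => {
  frozenAt = player.currentTime;
});

document.addEventListener('resume', () => {
  player.currentTime = frozenAt;
  player.play().catch(() => {
    // Playback may need a user gesture, or for resources to become available again.
  });
});
```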

JohnRiv commented 4 years ago

FYI, as you can see from the flurry of references above, I've migrated the issues listed here to specific, individual issues in the new me-media-integration-guidelines repo.