w3ctag / design-reviews

W3C specs and API reviews

WebCodecs support for AV1 screen content coding tools #912

Closed fippo closed 10 months ago

fippo commented 11 months ago

Moin moin TAG!

I'm requesting a TAG review of a WebCodecs extension to add support for AV1 screen content coding tools (asked here)

This adds AV1EncoderConfig, a dictionary containing a boolean forceScreenContentTools (a term from the AV1 bitstream spec), to VideoEncoderConfig along these lines:

encoder.configure({
  codec: 'av01.0.04M.08',
  width: 1920, height: 1080, bitrate: 2_000_000, framerate: 5, 
  av1: { forceScreenContentTools: true},
}); 

This allows an application to encode “screen content”, in particular presentation slides, more efficiently using tools the AV1 codec supports. This material is typically static and often includes text, a limited set of colors, and lots of repetitive content (e.g. straight lines, shapes) for which the encoder can optimize.

See the explainer for a lot of visual examples. This AV1 feature is already supported by WebRTC and enabled for screen sharing MediaStreamTracks, so this increases platform consistency.

You should also know that...

How extensibility is handled is probably the more interesting thing to review!
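As a rough sketch of what the extensibility story could look like from the application side: WebCodecs echoes back only the dictionary members the user agent recognized from VideoEncoder.isConfigSupported(), so an application could probe for the new member before relying on it. The helper names below are hypothetical and not part of the proposal.

```javascript
// Hypothetical helper: true if the config echoed back by the user agent still
// carries a boolean av1.forceScreenContentTools member.
function echoesScreenContentTools(echoedConfig) {
  return Boolean(
    echoedConfig &&
    echoedConfig.av1 &&
    typeof echoedConfig.av1.forceScreenContentTools === 'boolean'
  );
}

// Hypothetical wrapper around VideoEncoder.isConfigSupported(). Only call this
// where WebCodecs is available (for example in a browser window or worker).
async function supportsScreenContentTools(config) {
  const { supported, config: echoed } =
    await VideoEncoder.isConfigSupported(config);
  return Boolean(supported) && echoesScreenContentTools(echoed);
}
```

An application might call supportsScreenContentTools() with the config shown above and fall back to a plain AV1 config when it resolves to false.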

We'd prefer the TAG provide feedback as (please delete all but the desired option): 💬 leave review feedback as a comment in this issue and @-notify @fippo

fippo commented 10 months ago

While you are in the area, could you please also review VideoEncoderConfig.contentHint by @Djuffin? The explainer can be found here, the chromestatus entry here, and the tl;dr is:

encoder.configure({
  codec: 'vp8',
  width: 1920, height: 1080, bitrate: 2_000_000, framerate: 5, 
  contentHint: 'detail'
}); 

Visually, this should be similar to this older WebRTC sample.

The relationship between the two is a bit complicated, bear with me.

In WebRTC, setting a MediaStreamTrack's contentHint causes the track to be treated as "comes from screensharing", which implies screen content coding for AV1. Full details here.

cynthia commented 10 months ago

forceScreenContentTools is AV1 specific and does not interact with resolution or framerate.

Doesn't HEVC have this too? Wondering if it should really be constrained just to AV1 scope (as the name suggests), or consider the possibility of HEVC (or other future codecs) to be able to use this. I suppose while ugly, the other codecs could inherit from AV1EncoderConfig.

contentHint: 'detail'

What's the expectation when contentHint preference conflicts with what the codec-specific config asks for? For example:

encoder.configure({
  codec: 'av01.0.04M.08',
  width: 1920, height: 1080, bitrate: 2_000_000, framerate: 5, 
  contentHint: 'detail',
  av1: { forceScreenContentTools: false }, // for example.
});

Djuffin commented 10 months ago

What's the expectation when contentHint preference conflicts with what the codec-specific config asks for?

As the spec says, codec-specific knobs always take precedence.
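A minimal sketch of that precedence rule, for illustration only. The set of contentHint values that map to screen content coding is an assumption made for this example, and resolveScreenContentTools is a hypothetical name, not something the spec defines.

```javascript
// Assumption for illustration only: which contentHint values a user agent
// maps to screen content coding is not stated in this thread.
const SCREEN_CONTENT_HINTS = new Set(['detail', 'text']);

// Returns whether AV1 screen content coding tools end up enabled for a config,
// under the rule that an explicit codec-specific value beats the generic hint.
function resolveScreenContentTools(config) {
  const explicit = config.av1 ? config.av1.forceScreenContentTools : undefined;
  if (typeof explicit === 'boolean') {
    return explicit; // codec-specific knob wins, even when it is false
  }
  return SCREEN_CONTENT_HINTS.has(config.contentHint);
}

// Hint only: the hint decides.
resolveScreenContentTools({ contentHint: 'detail' }); // true

// Hint plus explicit knob: the knob decides.
resolveScreenContentTools({
  contentHint: 'detail',
  av1: { forceScreenContentTools: false },
}); // false
```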

Djuffin commented 10 months ago

Doesn't HEVC have this too?

We have contentHint for generic use cases. For web developers who want to use specific codec features, we have codec-specific option dictionaries. I don't think that we need to try to combine codec-specific options even if they have similarities in different codecs.

fippo commented 10 months ago

Doesn't HEVC have this too?

Implementation status seems to be somewhat hard to describe. I just stumbled over a message in the video-dev Slack which suggests it may be around on some Intel drivers?

Speaking more generally, we could have a more generic name that covers both AV1 and HEVC without developer pain. On the other hand, we have per-frame QP for some codecs without inheritance, which suggests that codec-specific options without inheritance are a thing already? Tough decision... which is why we ask! :-)

What's the expectation when contentHint preference conflicts with what the codec-specific config asks for? For example:

Tough one! As Eugene says, codec-specific knobs take precedence, so this would give you AV1 without screen content coding tools. Not sure what the practical difference would be, TBH. degradationPreference would be my preferred knob, but since WebCodecs is per-frame I am not sure this applies.

Djuffin commented 10 months ago

Can we have a green light for the contentHint part? It is generic and makes sense for most encoders.

cynthia commented 10 months ago

Can we have a green light for the contentHint part? It is generic and makes sense for most encoders.

We don't necessarily "green light" proposals (we aren't an approval body) - but if you are asking whether the feature makes sense, the contentHint feature seems like a well-intended, non-controversial feature so we're happy with that.

I don't think that we need to try to combine codec specific options even if they have similarities in different codecs.

That wasn't really what I was suggesting when I asked that question; it was a counterargument to the "this is AV1 only" remark. At least if there is an equivalent feature, it would make sense to have it interoperable, e.g., hevc: { forceScreenContentTools: true }.

As the spec says codec specific knobs always take precedence.

Tough one! As Eugene says, codec specific knobs take precedence so this would give you AV1 without screen content coding tools.

So the reason I asked this very question is that it smells of counterintuitive behavior. If contentHint: 'detail', for example, would enable screen content coding, there is likely an expectation that forceScreenContentTools: false should keep it enabled. The name implies a boolean flag that forces the feature on, so false would imply not forcing it on, and having it disable screen content coding would be fairly counterintuitive. The other risk is cascaded configs and copypasta code canceling each other's behavior, a potential footgun...
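To make the concern concrete, here is a small sketch contrasting the two readings on a cascaded config. Everything in it is illustrative: the function names are hypothetical, and treating 'detail' as a hint that enables screen content coding is an assumption taken from the example above.

```javascript
// Assumption for illustration: 'detail' is a hint that turns on screen
// content coding.
const hintEnablesScreenContent = (hint) => hint === 'detail';

// Reading 1 (described in this thread): the codec-specific value always wins.
function knobWins(config) {
  const knob = config.av1?.forceScreenContentTools;
  return typeof knob === 'boolean'
    ? knob
    : hintEnablesScreenContent(config.contentHint);
}

// Reading 2 (what the name "force" may suggest): false means "do not force",
// so the hint still decides.
function forceOnly(config) {
  return (
    config.av1?.forceScreenContentTools === true ||
    hintEnablesScreenContent(config.contentHint)
  );
}

// Cascaded configs: app-wide defaults overlaid with a copy-pasted snippet.
const appDefaults = { codec: 'av01.0.04M.08', contentHint: 'detail' };
const pastedSnippet = { av1: { forceScreenContentTools: false } };
const merged = { ...appDefaults, ...pastedSnippet };

knobWins(merged);  // false: the pasted snippet silently cancels the hint
forceOnly(merged); // true: the hint still applies
```

The two readings agree whenever the knob is absent or true; they only diverge when a false value meets a hint that asks for screen content coding, which is the case a non-expert is least likely to anticipate.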

We would like to see an alternative consideration (the explainer doesn't have any "alternatives considered") for better developer ergonomics if possible.

The naming is not straightforward either, but that's a feature of the codec so we don't have strong feelings about that detail. Overall, I think having a way to do per-codec config overrides is necessary and useful, but we think it would be helpful to foolproof the API a bit more for people who are not codec experts.

fippo commented 10 months ago

Merging this into contentHint ended up being the more pragmatic approach - which also avoids the naming issue ;-)

Thank you for the feedback!