w3c / webrtc-extensions

A repository for experimental additions to the WebRTC API
https://w3c.github.io/webrtc-extensions/
Other
58 stars 19 forks source link

RTCRtpEncodingParameters: scaleResolutionDownTo #159

Closed henbos closed 1 month ago

henbos commented 1 year ago

We all know and love scaleResolutionDownBy... it let's you do this:

Capture 720p and apply expensive video effects on the track. Send {active, scaleResolutionDownBy:1 (720p)} + {active, scaleResolutionDownBy:2 (360p)} simulcast.

But then server tells us 720p is not needed, so we send {inactive, active:360p}. Now you might wonder, why are we applying expensive video effects on a 720p track if we're only sending it in 360p?

So we track.applyConstraints() to 1/2 downscale the track and we setParameters() to 2x the scaling factors to counter the fact that the input frame size is smaller. The end result is:

360p track sent as {inactive} + {active, scaleResolutionDownBy:1 (still 360p)}.

Problem: This is racy. The track changing size and the parameters updating the scaling factors are not in-sync, so you might send 720p on the VGA layers for a few frames. You're wasting a key frame on sending the wrong resolution and then you're wasting another key frame to adjust back to the desired 360p resolution. Working around this by temporarily inactivating all encodings while the track is resizing will appear like glitches to the receiver of the stream and there probably would still have to be key frames when the encodings are re-activated. The API was not designed for this.

Proposal: Add "scaleResolutionTo: 360" API which is equivalent to always having the correct scaling factor. In the above example, changing the track's size from 720p to 360p would not result in reconfiguring the encoder and there would be no key frames or races.

henbos commented 1 year ago

Something like this already exists in libwebrtc (RtpEncodingParameters::requested_resolution) so it might be fairly straight-forward to experiment with this in JavaScript

henbos commented 1 year ago

@aboba

fippo commented 1 year ago

scaling resolution down by a factor (to make things worse: a freely choosable one instead of powers of two) has always been confusing so 👍 for fixing this.

Orphis commented 1 year ago

We may not want to have a single value, but a few more to describe how the change is done. While some resolutions are standard and we usually can recognise the usual ones from one dimension, we need both to work with both portrait and landscape tracks.

Quite often, the ratio will be different, so the behaviour needs to be defined there too. Do we want to rescale? Keep the ratio but crop? Or add black bars? Or fit to the height or width parameter and crop the outside parts? If we're doing this, do we want crop coordinates too in the API? (for example to frame a participants head)

Orphis commented 1 year ago

Also, what happens if the track is smaller than the requested resolution? Should the aspect ratio be changed to match the requested resolution's ratio? So is it a min value, max value or target value?

henbos commented 1 year ago

Landscape vs portrait mode has to be addressed. There are a few different options I can think of:

Regarding if the track is smaller than requested: I don't think we should ever up-scale, so I only see two options:

I don't have strong opinions, just listing options.

dontcallmedom-bot commented 1 year ago

This issue had an associated resolution in WebRTC May 2023 meeting – 16 May 2023 (Issue #159: RTCRtpEncodingParameters: scaleResolutionDownTo):

RESOLUTION: Overall support in the direction, but details need to be fleshed out and complexity assessed

aigarrz commented 8 months ago

What should happen when video frame adapter reduces input resolution to below requested size?

Example:

If the peer is a SFU, only option C avoids bitrate overshooting E2E.

dontcallmedom-bot commented 1 month ago

This issue had an associated resolution in WebRTC August 27 2024 meeting – 27 August 2024 (RTCRtpEncodingParameters: scaleResolutionTo):

RESOLUTION: proceed with a PR for #159 with revised names

henbos commented 1 month ago

In the meeting we decided that "maxWidth" and "maxHeight" are more descriptive names for the attributes. For this reason I think it makes more sense to call this API "restrictedResolution" rather than "scaleResolutionDownTo" but I think we can bikeshed on the name in the editor's meeting.

Edit: nope, scaleResolutionDownTo was the preferred name