Encoding parameters for spatial layers

w3c / webrtc-svc

W3C Scalable Video Coding (SVC) Extension for WebRTC

https://w3c.github.io/webrtc-svc/

Other

39 stars 14 forks source link

Encoding parameters for spatial layers #14

Open Orphis opened 5 years ago

Orphis commented 5 years ago

It would be interesting to be able to control the encoding parameters for the spatial layers in a similar way we can control simulcast layers. I was thinking of an array of spatial coding parameters nested in the RTCRtpEncodingParameters dictionnary, one entry per layer.

I believe we should be able to have parameters like maxBitrate, maxFramerate, scaleResolutionDownBy and the active flag.

Bitrate and framerate are quite self explanatory.
scaleResolutionDownBy should be decreasing for each spatial layer (aka layers have to be bigger). There will be some interaction with the same encoding parameter for simulcast to define.
active is a bit tricky when there are layer dependencies and should disable that layer and all layers having dependencies on disabled layers. This might not apply in "S" modes.

aboba commented 5 years ago

@Orphis The ORTC API described dependencies between SVC layers via framerateScale, encodingId and dependencyEncodingIds . However I believe this had some limitations with respect to KEY and KEY_SHIFT modes.

Note that the AV1 Bitstream Specification supports both pre-canned Scalability Modes as well as custom Scalability Structures via the syntax in Section 5.8.6.

Orphis commented 5 years ago

As per discussions we had, let's just try to keep the active boolean and maxBitrate values. It could look like this:

let encodings = [{
    scalabilityMode = 'L3T3',
    spatialEncodings = [
        { active: true, maxBitrate: 100000 },
        { active: true, maxBitrate: 300000 },
        { active: true, maxBitrate: 600000 },
    ],
}];

aboba commented 5 years ago

The above proposal provides some additional flexibility with respect to spatial ratios as a side effect. For example:

// Provide a full resolution simulcast encoding as well as a quarter-size (thumbnail) encoding
let encodings = [{
    scalabilityMode = 'S3T1',
    spatialEncodings = [
        { active: true },
        { active: false },
        { active: true}
    ],
}];

aboba commented 5 years ago

Additional flexibility can also be provided by defining additional pre-canned modes. For example, popular 16 x 9 resolutions include:

1080p (1920 x 1080)
720p (1280 x 720)
360p (640 x 360)

Since the spatial ratios vary between layers (1.5:1 as well as 2:1), this is not covered by existing modes. However, a new mode could be defined to cover this use case.

vr000m commented 2 years ago

Curious about the status of this issue, what do we need to do to move this forward?

aboba commented 2 years ago

We would need implementer interest to move forward.

@Orphis @hta -- Is there continued interest in this?

Orphis commented 2 years ago

@vr000m We need the rest of the SVC APIs implemented first to get some implementation experience. We'll move forward with this then. We have certainly not forgotten about it.

aboba commented 2 years ago

Can we get a PR submitted for discussion at TPAC?