aboba / hevc-webrtc

RPSI RTCP feedback support #13

Open taste1981 opened 10 months ago

taste1981 commented 10 months ago

RFC 7798 lists the RPSI message as one of the supported RTCP feedback types. This differs from H.264, where no spec clearly defines the RPSI payload format.

This spec should provide a recommendation on whether RPSI should be supported by WebRTC.
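
For reference, a minimal sketch of what an RPSI packet looks like on the wire. The outer framing (PT=206, FMT=3, PB field, payload type, padding to a 32-bit boundary) follows RFC 4585. The HEVC native bit string layout (2 zero bits plus 6 bits of nuh_layer_id, followed by a 32-bit PicOrderCntVal in network byte order) is an assumption based on one reading of RFC 7798 and should be checked against the RFC. The function names and the payload type 96 are illustrative only and are not taken from libwebrtc.

```python
import struct

PSFB_PT = 206   # RTCP payload-specific feedback packet type (RFC 4585)
RPSI_FMT = 3    # feedback message type for RPSI (RFC 4585)


def hevc_native_rpsi(nuh_layer_id: int, pic_order_cnt_val: int) -> bytes:
    """Assumed HEVC layout: 00 + 6-bit nuh_layer_id, then 32-bit PicOrderCntVal."""
    return struct.pack("!BI", nuh_layer_id & 0x3F, pic_order_cnt_val & 0xFFFFFFFF)


def build_rpsi_fci(payload_type: int, native_bits: bytes) -> bytes:
    """FCI = PB (8 bits) | 0 + payload type (7 bits) | native string | padding."""
    if not 0 <= payload_type <= 127:
        raise ValueError("payload type must fit in 7 bits")
    pad_bytes = (-(2 + len(native_bits))) % 4   # pad FCI to a 32-bit boundary
    pb = pad_bytes * 8                          # PB counts padding bits
    return bytes([pb, payload_type]) + native_bits + bytes(pad_bytes)


def build_rpsi_packet(sender_ssrc: int, media_ssrc: int,
                      payload_type: int, native_bits: bytes) -> bytes:
    """Wrap the FCI in the common RTCP feedback header (V=2, P=0)."""
    fci = build_rpsi_fci(payload_type, native_bits)
    length_words = (12 + len(fci)) // 4 - 1     # RTCP length: 32-bit words minus one
    header = struct.pack("!BBHII", (2 << 6) | RPSI_FMT, PSFB_PT,
                         length_words, sender_ssrc, media_ssrc)
    return header + fci


# Example: acknowledge the picture with PicOrderCntVal 5 in layer 0.
packet = build_rpsi_packet(0x11111111, 0x22222222, 96, hevc_native_rpsi(0, 5))
```

Under these assumptions the HEVC native string is 5 bytes, so the FCI needs one byte of padding and the whole packet is 20 bytes.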

aboba commented 10 months ago

Does libwebrtc support RPSI for any codec?

Aside from HEVC (https://www.rfc-editor.org/rfc/rfc7798#page-77) RPSI is mentioned in VP8 (https://www.rfc-editor.org/rfc/rfc7741#page-18) and VP9 (https://datatracker.ietf.org/doc/html/draft-ietf-payload-vp9#page-15) but not AV1 (https://aomediacodec.github.io/av1-rtp-spec/#8-feedback-messages).

taste1981 commented 10 months ago

RPSI used to be implemented for VP8 and VP9 as feedback of the successfully decoded picture ID. It was later removed from the implementation because there seemed to be no good use for it with VP8/VP9.

With VP9/AV1 later switching to support temporal/spatial scalability, supporting RPSI seems to introduce too much complexity in libwebrtc for the reference frame finder implementation on the receiving side.

However, we're seeing products like Zoom utilize the LTR encoding capability of H.26x codecs, and RPSI is well suited for signaling the use of long-term references during encoding.

fippo commented 10 months ago

RFC 8834, section 5.1.4, already encourages support for RPSI in WebRTC:

The RPSI message allows this to be signaled. Receivers that detect that encoder-decoder synchronization has been lost generate an RPSI feedback message if the codec being used supports reference-picture selection. An RTP packet-stream sender that receives such an RPSI message act on that messages to change the reference picture

(I do think the reference finder is going to be a very big rock to getting support for LTR -- @Philipel-WebRTC will know more)

Philipel-WebRTC commented 10 months ago

AFAIK libwebrtc does not support RPSI at all. There was an effort many years ago to try out LTR, but a custom RTCP message was implemented instead because RPSI was found to be insufficient in some ways.

Not sure why reference finders would play a particular role for LTR (they just keep state and calculate references; they are not aware of LTR as a concept).

taste1981 commented 10 months ago

LTR should be treated as one of the frame's properties that the reference finder can look up:

Today it knows {picture_id, frame_type} as a combination for each frame (for VP8/VP9). For LTR in H.26x codecs, picture_id becomes PicOrderCnt, and for frame_type, a new type (LTR) should be added.

Philipel-WebRTC commented 10 months ago

Yes, the reference finder should correctly determine what other frames a particular frame depends on, but it won't treat an LTR reference differently from any other reference. It's just a reference.

ssilkin commented 10 months ago

It would be good to have a codec-agnostic solution, which should probably be tied to the dependency descriptor (https://aomediacodec.github.io/av1-rtp-spec/#dependency-descriptor-rtp-header-extension).

taste1981 commented 10 months ago

ssilkin@ DD is more about conveying the current reference structure: it is sender-side signaling to the receiver that helps the reference finder. RPSI is a way for the receiver to tell the sender that it needs a delta frame instead of a key frame, and it makes that safe by letting the sender know that a certain long-term reference has been decoded. So they serve different purposes. And since RPSI is formally defined for HEVC in an RFC, I would expect it to be more commonly supported by HEVC endpoints and to provide better interop.

Philipel-WebRTC commented 10 months ago

I think what ssilkin@ means is that the DD can be used in conjunction with some other extension to create a working LTR solution. I also think doing something along those lines is much better, for two reasons: we really dislike codec-specific solutions for a general concept, and the RPSI mechanism is basically insufficient since you can only do LTR in one particular way.

taste1981 commented 10 months ago

Do we have another RTCP extension for this? LNTF contains the last decoded sequence number and the last received sequence number delta. If we use LNTF, the part that is still open to me is how to feed back the required reference change to the encoder (the sender side will need some logic to translate a sequence number to PicOrderCntVal or frame_num). Is there a concrete sample of end-to-end LNTF usage anywhere?
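
To make the sender-side translation concrete, here is a rough sketch of a lookup from a reported RTP sequence number to the PicOrderCntVal of the frame containing that packet. Everything in it is hypothetical: the class and method names are made up for illustration, it assumes the sender records the first-packet sequence number of every frame it sends, and it assumes sequence numbers have already been unwrapped into a monotonically increasing value. It is not how libwebrtc handles LNTF.

```python
from bisect import bisect_right
from typing import List, Optional


class SeqToPocMap:
    """Hypothetical sender-side table: first packet seq of a frame -> PicOrderCntVal."""

    def __init__(self) -> None:
        self._first_seqs: List[int] = []   # unwrapped, strictly ascending
        self._pocs: List[int] = []

    def on_frame_sent(self, first_seq: int, poc: int) -> None:
        # Frames are recorded in send order, so first_seq keeps increasing.
        self._first_seqs.append(first_seq)
        self._pocs.append(poc)

    def poc_for_seq(self, seq: int) -> Optional[int]:
        # Find the last frame whose first packet is at or before `seq`.
        i = bisect_right(self._first_seqs, seq) - 1
        return self._pocs[i] if i >= 0 else None


table = SeqToPocMap()
table.on_frame_sent(first_seq=100, poc=0)
table.on_frame_sent(first_seq=105, poc=1)
table.on_frame_sent(first_seq=110, poc=2)
acked_poc = table.poc_for_seq(107)   # packet 107 belongs to the frame with POC 1
```

The encoder could then be told to reference the picture with that PicOrderCntVal, provided it is still held as a long-term reference.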

tianjunwork commented 10 months ago

Either GFD or DD can carry the frame dependency for LTRP, and the frame reference finder already handles it correctly. A codec-agnostic solution for LTR ack and LTRP request would be better.

aboba commented 10 months ago

The AVTCORE virtual interim slide is here: https://docs.google.com/presentation/d/1I944U1DmQ4C2f5r5_uEJiOm2N9s4x4B0c1-SPTPPuS0/edit#slide=id.g2473a55bcdf_0_84

Other than citing RFC 7798 and 8834, do we have any other recommendations?

taste1981 commented 10 months ago

RFC 8082 also mentions this for layered coding.

aboba commented 9 months ago

FYI, Stephan Wenger provided feedback here: https://mailarchive.ietf.org/arch/msg/avt/mrQ53xo_UI_8nR80MQQnH6CtAlY/

aboba commented 9 months ago

I rewrote the PR to take Stephan's feedback into account. PTAL.