versatica / mediasoup

Cutting Edge WebRTC Video Conferencing
https://mediasoup.org
ISC License
6.21k stars 1.13k forks source link

Set RTP extensions on sending transport side #1107

Open ibc opened 1 year ago

ibc commented 1 year ago

Related to https://github.com/versatica/mediasoup/pull/1079.

Overview

We are gonna save lot of bandwidth by just adding into each RTP packet RTP extensions supported by the receiver, and we are gonna make the MID RTP extension has dynamic value instead of being always 8 bytes (value + required zeroed padding).

Current Approach in v3

Currently we set all the RTP extensions in Producer::MangleRtpPacket() and later we rewrite their values when sending the RTP packet to each consumer. This comes with pros and cons:

Pros:

Const:

New Proposal

Pros:

Const:

penguinol commented 1 year ago

uv_udp_try_send accept several buffers at a time. Maybe we can save header and payload in different buffers, so that we don't need to shift payload.

penguinol commented 1 year ago

Well, we have to encrypt payload before sending, and currently libsrtp don't support read header and payload from different buffer. But in SrtpSession::EncryptRtp, we will copy original rtp packet to EncryptBuffer fisrt, we can make up rtp header and payload here to avoid unnecessary shift of payload.

penguinol commented 1 year ago

We can also do this when sending rtx packet.

ibc commented 1 year ago

@penguinol I will come to this a few weeks later after some priority work and after vacations. Same for other pending PRs written by you.

penguinol commented 1 year ago

In webrtc, they use a Copy-On-Write buffer to store rtp packet, it can save memory, however i think it can not reduce the number of memcopy.

ggarber commented 11 months ago

I'm very interested on this.

Regarding "these extensions must be reseted" maybe it is a good time to reconsider the approach taken in mediasoup of modifying and reverting the convent of the packets.

My reasoning:

ibc commented 11 months ago

I'm very interested on this.

Regarding "these extensions must be reseted" maybe it is a good time to reconsider the approach taken in mediasoup of modifying and reverting the convent of the packets.

My reasoning:

  • at the end mediasoup is doing a copy in most of the cases anyway (for srtp encryption)
  • modifying the packets do a lot of assignments and memmoves, I'm not sure efficient wise is much worse to make the memcpy
  • it is too error prone and makes the code much more complicated to reason about and maintain

This also happens within each XxxxxConsumer, Consumer and Transport class. Those may modify the RTP payload (VP8 seq ids, etc), may assign a different value to the Transport-Wide-Seq RTP header extension and so on.

Are you proposing that when the Router receives a RtpPacket from a Producer it clones it for each Consumer it passes it to?

ggarber commented 11 months ago

I'm very interested on this. Regarding "these extensions must be reseted" maybe it is a good time to reconsider the approach taken in mediasoup of modifying and reverting the convent of the packets. My reasoning:

  • at the end mediasoup is doing a copy in most of the cases anyway (for srtp encryption)
  • modifying the packets do a lot of assignments and memmoves, I'm not sure efficient wise is much worse to make the memcpy
  • it is too error prone and makes the code much more complicated to reason about and maintain

This also happens within each XxxxxConsumer, Consumer and Transport class. Those may modify the RTP payload (VP8 seq ids, etc), may assign a different value to the Transport-Wide-Seq RTP header extension and so on.

Are you proposing that when the Router receives a RtpPacket from a Producer it clones it for each Consumer it passes it to?

Exactly that. Basically doing the memcpy at the beginning instead of at the end (SrtpSession::EncryptRtp).

Or the Consumer can clone it the first time it needs to modify it, that's almost the same thing but more optimal for the SimulcastConsumer case where many packets are discarded.

jmillan commented 11 months ago

Are you proposing that when the Router receives a RtpPacket from a Producer it clones it for each Consumer it passes it to?

Exactly that. Basically doing the memcpy at the beginning instead of at the end (SrtpSession::EncryptRtp).

Right now we store in memory each RtpPacket coming from the remote Producer, and use that same memory for all Consumers. If we did as it's being suggested here we would store in memory the same-ish RtpPacket number-of-consumers times.

ibc commented 11 months ago

Yes, but it would also simplify our life a lot. What about if each Consumer instance holds a buffer to clone the received RtpPacket and such a buffer is (somehow) also used later when SRTP encryption is needed?

ggarber commented 11 months ago

Are you proposing that when the Router receives a RtpPacket from a Producer it clones it for each Consumer it passes it to?

Exactly that. Basically doing the memcpy at the beginning instead of at the end (SrtpSession::EncryptRtp).

Right now we store in memory each RtpPacket coming from the remote Producer, and use that same memory for all Consumers. If we did as it's being suggested here we would store in memory the same-ish RtpPacket number-of-consumers times.

Don't we do the same before encryption? It is just doing the same thing but a bit earlier in the consumer path imo.

jmillan commented 11 months ago

Yes, but it would also simplify our life a lot. What about if each Consumer instance holds a buffer to clone the received RtpPacket and such a buffer is (somehow) also used later when SRTP encryption is needed?

RtpPacket::Clone() already creates an internal a buffer to hold the content so there is no need for any other buffer. If each Consumer clones each RtpPacket then its own buffer can be used for SRTP encryption as it would not be reused.

Exactly that. Basically doing the memcpy at the beginning instead of at the end (SrtpSession::EncryptRtp).

Right now we store in memory each RtpPacket coming from the remote Producer, and use that same memory for all Consumers. If we did as it's being suggested here we would store in memory the same-ish RtpPacket number-of-consumers times.

Don't we do the same before encryption? It is just doing the same thing but a bit earlier in the consumer path imo.

The difference is that we now use the same buffer (per thread) to encrypt every packet, so we do a memcpy before encrypting. Cloning every RtpPacket per Consumer would require having as many buffers as RtpPackets (the buffer is part of the clonned packet), and they must be stored for retransmission purposes. Also RtpPacket::Clone() performs itself few memcpy's.

In essence we would loose the memory savings we have now by sharing the original RtpPacket received from the Producer and would have one clone of the RtpPacket for each Consumer, that would need to remain in memory for retransmission purposes.

I don't think this is a blocker for the main issue described here.

ibc commented 11 months ago

Indeed, the proposed solution in the description of this story avoid cloning each packet in every Consumer. Of course it requires some shift and memcopy operations in the packet within every Consumer, but that's way better than having to clone the whole packet per Consumer, right? The proposal in this story covers all our needs IMHO. And as Jose said, it doesn't make sense that we developed an optimized way to just store a packet ONCE (for retransmission purposes) and now clone the packet for each Consumer. It's indeed a non sense.

ggarber commented 11 months ago

Indeed, the proposed solution in the description of this story avoid cloning each packet in every Consumer. Of course it requires some shift and memcopy operations in the packet within every Consumer, but that's way better than having to clone the whole packet per Consumer, right?

  • For maintanability and reliability of the code I think it is clearly worse.
  • For performance I'm not so sure. My guess is that it is also worse in most of the cases but it is easy to check it.

The proposal in this story covers all our needs IMHO

Agree, I'm just suggesting that you take advantage of this feature to change that mediasoup behaviour if you think it makes sense.

ggarber commented 11 months ago

Don't we do the same before encryption? It is just doing the same thing but a bit earlier in the consumer path imo.

The difference is that we now use the same buffer (per thread) to encrypt every packet, so we do a memcpy before encrypting. Cloning every RtpPacket per Consumer would require having as many buffers as RtpPackets (the buffer is part of the clonned packet), and they must be stored for retransmission purposes. Also RtpPacket::Clone() performs itself few memcpy's.

Isn't that an implementation detail? Cloning a packet could take an existing buffer from a pool.

Also when I said "cloning the packet" I didn't mean it needs to be the existing RtpPacket::Clone(), maybe there is a more efficient way to do it. I meant just "making a copy" somehow.

For retransmission purposes you can still store the original packet, right?

I don't think this is a blocker for the main issue described here.

I agree. I just think that the solution to this kind of requirements could be much simpler by not having to reset anything, without any/much perf impact, but it is your call obviously :)

ibc commented 11 months ago

Also when I said "cloning the packet" I didn't mean it needs to be the existing RtpPacket::Clone(), maybe there is a more efficient way to do it. I meant just "making a copy" somehow.

Cloning a RtpPacket means creating a different RtpPacket instance with copied memory. That's what clone() does and there is no way to make it more efficient. It does exactly what it must be done.

For retransmission purposes you can still store the original packet, right?

Yes, but we just store the original packet received by the Producer and we store it just once.

I agree. I just think that the solution to this kind of requirements could be much simpler by not having to reset anything, without any/much perf impact, but it is your call obviously :)

"much simpler" means cloning the packet, with the performance penalty it involves. Resetting changes in the packet after processing it in each Consumer is the best option we have assuming it is done right.