scottlamb / retina

High-level RTSP multimedia streaming library, in Rust
https://crates.io/crates/retina
Apache License 2.0

consider more efficient buffering model #6

Open scottlamb opened 3 years ago

scottlamb commented 3 years ago

Accumulate NAL unit fragments into a ring buffer instead of via ref-counted Bytes fragments.

Mirrored ring buffer?

My first idea: use a mirrored ring buffer (unsafe VM tricks), via slice-deque or similar, for the read buffer.

Advantages:

Disadvantages:

Here's how it would work:

safe ring buffer

But actually we could use a plain safe ring buffer. The only problematic case is when a ring wraps back over. In that case we can either:

The overhead from the extra copy is less than a single read so it's not a big deal.
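As a rough illustration of the plain safe approach, here is a minimal sketch of one way to handle the wrap case: when the tail of the buffer is too short for the next read, copy the pending (unconsumed) bytes back to the front so they stay contiguous. `ContiguousRing` and its methods are hypothetical names for this sketch, not part of retina, and this is only one possible reading of the options above.

```rust
/// Hypothetical fixed-capacity read buffer that keeps unconsumed bytes
/// contiguous using only safe code (an assumption, not retina's design).
pub struct ContiguousRing {
    buf: Vec<u8>,
    start: usize, // index of the first unconsumed byte
    end: usize,   // one past the last filled byte
}

impl ContiguousRing {
    pub fn with_capacity(cap: usize) -> Self {
        Self { buf: vec![0; cap], start: 0, end: 0 }
    }

    /// Unconsumed bytes, always as a single slice.
    pub fn filled(&self) -> &[u8] {
        &self.buf[self.start..self.end]
    }

    /// Returns writable space of at least `min` bytes, or `None` if the
    /// pending bytes plus `min` exceed the capacity. If the tail is too
    /// short, the pending bytes are first copied to the front (the
    /// "extra copy" on wrap).
    pub fn spare(&mut self, min: usize) -> Option<&mut [u8]> {
        if self.buf.len() - self.end < min {
            let pending = self.end - self.start;
            if self.buf.len() - pending < min {
                return None;
            }
            self.buf.copy_within(self.start..self.end, 0);
            self.start = 0;
            self.end = pending;
        }
        Some(&mut self.buf[self.end..])
    }

    /// Records that `n` bytes were written into the slice from `spare`.
    pub fn advance_filled(&mut self, n: usize) {
        assert!(n <= self.buf.len() - self.end);
        self.end += n;
    }

    /// Marks the first `n` unconsumed bytes as done.
    pub fn consume(&mut self, n: usize) {
        assert!(n <= self.end - self.start);
        self.start += n;
        if self.start == self.end {
            // Empty: rewind for free so the next read starts at the front.
            self.start = 0;
            self.end = 0;
        }
    }
}
```

The copy only moves bytes that have been read but not yet consumed, which is at most a partial message, so it should be small relative to the read that follows.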

slab for UDP

The ring buffer approach makes sense for TCP (both requests/replies and interleaved data messages). It makes less sense for UDP, particularly if using recvmmsg on Linux to read multiple messages at once. The problem is that we don't know the byte length of each packet in advance, and the biggest possible packet (65,535 bytes - 20 (IP header size) - 8 (UDP header size) = 65,507 bytes) is much larger than a typical packet (non-fragmented, 1,500-byte MTU - 28 bytes of headers = 1,472 bytes), and I don't want 97+% waste.
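The arithmetic behind the 97+% figure can be checked with a few constants. This is just a worked calculation assuming IPv4 without options and a 1,500-byte Ethernet MTU; the constant and function names are made up for the example.

```rust
// Sizes in bytes, assuming IPv4 with no options.
const MAX_IP_PACKET: usize = 65_535;
const IPV4_HEADER: usize = 20;
const UDP_HEADER: usize = 8;
const ETHERNET_MTU: usize = 1_500;

/// Largest UDP payload that fits in one (possibly fragmented) IP packet.
pub const MAX_UDP_PAYLOAD: usize = MAX_IP_PACKET - IPV4_HEADER - UDP_HEADER;

/// Largest UDP payload that fits in one unfragmented packet at this MTU.
pub const TYPICAL_UDP_PAYLOAD: usize = ETHERNET_MTU - IPV4_HEADER - UDP_HEADER;

/// Fraction of a `slot`-byte buffer left unused by a `used`-byte packet.
pub fn waste_fraction(slot: usize, used: usize) -> f64 {
    (slot - used) as f64 / slot as f64
}
```

With worst-case slots, `waste_fraction(MAX_UDP_PAYLOAD, TYPICAL_UDP_PAYLOAD)` is 64,035 / 65,507, or a little under 98%.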

So for UDP, here's another idea: have a slab of packet buffers, read into one of them with recvmsg or maybe multiple simultaneously with recvmmsg. Use scatter/gather into a slab-based primary buffer that's of reasonable length and an overflow buffer for the remainder. When overflow happens, we extend the primary buffer with the overflow. "Reasonable length" can be initially our PMTU length (to the peer, on the assumption the path is roughly symmetrical), then based on the largest length we've seen.
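A rough sketch of the primary-plus-overflow bookkeeping, with the socket taken out of the picture: the `scatter` helper below stands in for what a two-iovec recvmsg would do, and a fresh `Vec` stands in for a slab slot. `AdaptiveReceiver`, `scatter`, and `primary_len` are hypothetical names, and real code would also need the slab allocation and the actual syscall, which are omitted here.

```rust
const MAX_UDP_PAYLOAD: usize = 65_507;

/// Stand-in for a scatter read: fills `first`, then spills into `second`.
/// Returns the total number of bytes stored.
fn scatter(src: &[u8], first: &mut [u8], second: &mut [u8]) -> usize {
    let a = src.len().min(first.len());
    first[..a].copy_from_slice(&src[..a]);
    let b = (src.len() - a).min(second.len());
    second[..b].copy_from_slice(&src[a..a + b]);
    a + b
}

/// Hypothetical receiver that sizes primary buffers by the largest
/// packet seen so far and keeps one shared overflow buffer.
pub struct AdaptiveReceiver {
    primary_len: usize, // current "reasonable length"
    overflow: Vec<u8>,  // shared scratch space for the remainder
}

impl AdaptiveReceiver {
    /// `initial_len` would be the PMTU-derived payload length.
    pub fn new(initial_len: usize) -> Self {
        Self { primary_len: initial_len, overflow: vec![0; MAX_UDP_PAYLOAD] }
    }

    pub fn primary_len(&self) -> usize {
        self.primary_len
    }

    /// Receives one datagram and returns its bytes as one buffer.
    pub fn receive(&mut self, datagram: &[u8]) -> Vec<u8> {
        // In the real design this would be a slot taken from the slab.
        let mut primary = vec![0u8; self.primary_len];
        let n = scatter(datagram, &mut primary, &mut self.overflow);
        if n <= primary.len() {
            primary.truncate(n);
        } else {
            // Overflow happened: extend the primary buffer with the
            // remainder and remember the new largest length.
            let spill = n - primary.len();
            primary.extend_from_slice(&self.overflow[..spill]);
            self.primary_len = n;
        }
        primary
    }
}
```

The extend on overflow is an extra copy, but it should only happen for the first packet that exceeds the current length, since later primary buffers are sized to fit.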

buffer size limits and API changes

We don't really have any limit on how much stuff we buffer right now. It's even worse when using TCP and the ring buffer and Demuxed, because if one stream stops mid-access unit, we'll grow and grow even if the other streams are all behaving. We should have a limit on the total size of the TCP ring buffer. Also for UDP we should have a limit on the total size we keep around. Perhaps within SessionOptions and UdpTransportOptions, respectively.
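To make the limit idea concrete, here is a minimal sketch of a buffer that returns an error instead of growing past a configured size. `BoundedBuffer` and `LimitExceeded` are hypothetical; how such a limit would actually be exposed through SessionOptions or UdpTransportOptions is left open.

```rust
/// Error returned when an append would exceed the configured limit.
#[derive(Debug, PartialEq, Eq)]
pub struct LimitExceeded {
    pub requested: usize, // total size the buffer would have reached
    pub limit: usize,
}

/// Hypothetical buffer with a hard cap on its total size.
pub struct BoundedBuffer {
    data: Vec<u8>,
    limit: usize,
}

impl BoundedBuffer {
    pub fn new(limit: usize) -> Self {
        Self { data: Vec::new(), limit }
    }

    /// Appends `bytes`, or fails without modifying the buffer if the
    /// result would be larger than the limit.
    pub fn push(&mut self, bytes: &[u8]) -> Result<(), LimitExceeded> {
        let requested = self.data.len() + bytes.len();
        if requested > self.limit {
            return Err(LimitExceeded { requested, limit: self.limit });
        }
        self.data.extend_from_slice(bytes);
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }
}
```

With something like this, a stream that stalls mid-access unit would surface as an error once the cap is hit, rather than as unbounded growth.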

Related open API question: do we want to just allow access to the most recent packet (with the packet-level API) or frame (with the Demuxed API)? If so, it probably makes sense to stop using futures::Stream (whose Item is not a generic associated type, so it can't return items borrowed from the stream). Or do we want to allow the caller to hold onto things for longer (subject to the limits above)? If the latter, I think we need some sort of extra step where you check if the item you got is still good. And if we continue to have these be Send/Sync, we need to have some sort of internal locking.
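For the "hold onto things for longer" option, one possible shape for the extra validity check is a generation counter per slot, with a mutex providing the internal locking. This is only a sketch: `Pool`, `Handle`, `store`, and `with` are hypothetical names, and a real design would need to decide what "slot" means for the ring buffer.

```rust
use std::sync::{Arc, Mutex};

struct Slot {
    generation: u64, // bumped every time the slot is overwritten
    data: Vec<u8>,
}

/// Hypothetical shared pool of reusable slots. It is Send + Sync
/// because all state sits behind an `Arc<Mutex<..>>`.
#[derive(Clone)]
pub struct Pool {
    slots: Arc<Mutex<Vec<Slot>>>,
}

/// Cheap handle the caller can keep; it may go stale.
#[derive(Clone, Copy, Debug)]
pub struct Handle {
    index: usize,
    generation: u64,
}

impl Pool {
    pub fn new(n: usize) -> Self {
        let slots = (0..n)
            .map(|_| Slot { generation: 0, data: Vec::new() })
            .collect();
        Self { slots: Arc::new(Mutex::new(slots)) }
    }

    /// Overwrites slot `index`, which invalidates older handles to it.
    pub fn store(&self, index: usize, bytes: &[u8]) -> Handle {
        let mut slots = self.slots.lock().unwrap();
        let slot = &mut slots[index];
        slot.generation += 1;
        slot.data.clear();
        slot.data.extend_from_slice(bytes);
        Handle { index, generation: slot.generation }
    }

    /// Runs `f` on the bytes if the handle is still good; returns
    /// `None` if the slot has been reused since the handle was made.
    pub fn with<R>(&self, h: Handle, f: impl FnOnce(&[u8]) -> R) -> Option<R> {
        let slots = self.slots.lock().unwrap();
        let slot = &slots[h.index];
        if slot.generation == h.generation {
            Some(f(&slot.data))
        } else {
            None
        }
    }
}
```

The closure-based `with` keeps the lock held while the caller looks at the bytes, which avoids a copy but means a slow caller can block the reader; that trade-off is part of the open question.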