rom1v commented 10 months ago

To get the lowest possible latency (and avoid retransmission), we may want to add FEC (like RaptorQ) to send redundant packets, so that even if some packets are lost (but enough are received), we can still reconstruct the source data, without retransmission.

This is mentioned in the draft Use Cases and Requirements for Media Transport Protocol Design:

> It may make sense to use FEC [RFC6363] and codec-level packet loss concealment [RFC6716], rather than selectively retransmitting only lost packets. These mechanisms use more bytes, but do not require multiple round trips in order to recover from packet loss.

It makes sense to never retransmit such FEC packets if they are lost (that's the purpose), so we would like to send them in datagrams (over QUIC or WebTransport).

But in MOQT, it seems that objects must always be sent over QUIC/WebTransport streams:

> Objects are sent on unidirectional streams.

Therefore, some Objects containing FEC payloads might be retransmitted by the transport layer (QUIC/WebTransport), which is undesirable.

Are there any plans to support this FEC use case over datagrams?

rom1v commented 10 months ago

Oh, I just found #316.

However, it says:

> OBJECT messages are transmitted over unidirectional streams and can also be transmitted over datagrams if the object or objects fit into single datagram. For larger objects, a stream needs to be used.

Typically, I think I would like to use 1 Object for a single video frame.

But if I use FEC and datagrams, each packet containing a single video frame will need to be split into n FEC packets (most of them 1280 bytes, for example). In that case, I will need to define 1 Object = 1 FEC packet and combine several Objects at the application level to reconstruct frames. Is that correct?
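To make the "1 Object = 1 packet, recombined by the application" idea concrete, here is a minimal sketch of splitting a frame into datagram-sized pieces and rebuilding it on the other side. The `Fragment` fields, the function names, and the 1200-byte payload limit are all assumptions for illustration, not MoQT wire format, and the pieces here are plain slices rather than FEC symbols.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fragment:
    """Hypothetical per-Object header plus payload (not a MoQT format)."""
    frame_id: int   # which video frame this piece belongs to
    index: int      # position of this piece within the frame
    total: int      # number of pieces the frame was split into
    payload: bytes


def split_frame(frame_id: int, frame: bytes, max_payload: int = 1200) -> List[Fragment]:
    """Split one frame into pieces that each fit in a single datagram."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [frame[i:i + max_payload] for i in range(0, len(frame), max_payload)]
    if not chunks:
        chunks = [b""]  # an empty frame still produces one (empty) fragment
    return [Fragment(frame_id, i, len(chunks), c) for i, c in enumerate(chunks)]


def reassemble(fragments: List[Fragment]) -> Optional[bytes]:
    """Rebuild a frame from fragments in any order; None if a piece is missing."""
    if not fragments:
        return None
    total = fragments[0].total
    by_index = {f.index: f.payload for f in fragments}
    if len(by_index) != total:
        return None
    return b"".join(by_index[i] for i in range(total))
```

Without FEC, losing any one fragment loses the whole frame, which is why the pieces would need to be FEC symbols in practice.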

kixelated commented 10 months ago

So we're going to have trouble implementing FEC as it stands.

A QUIC library can coalesce multiple QUIC datagrams per UDP packet. A FEC scheme that assumes QUIC datagrams are individual packets (not fate bound) won't work over generic implementations. This is especially a problem with MTU discovery, as even if you're sending 1.2kB datagrams, they might get bundled together into large packets. At the very least MoQ would have to require that QUIC datagrams MUST NOT be coalesced for this to work over generic relays. You kinda run into this same problem with FEC in general, as what looks like an individual packet at one layer might be coalesced into a jumbo packet at a lower layer.
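A small sketch of why coalescing matters, under simplifying assumptions: datagrams are packed in order into UDP packets, a lost UDP packet takes all of its datagrams with it, and the FEC code is ideal (any `num_source` of the transmitted symbols suffice). The function names and the numbers are illustrative only.

```python
def lost_datagrams(num_datagrams, per_packet, lost_packets):
    """Indices of datagrams lost when the given UDP packets are dropped.

    Datagrams are packed in order, `per_packet` of them per UDP packet,
    so every datagram in a dropped packet shares its fate.
    """
    return [d for d in range(num_datagrams) if d // per_packet in lost_packets]


def recoverable(num_source, num_repair, lost):
    """Ideal erasure code: decodable if at least num_source symbols arrive."""
    received = (num_source + num_repair) - len(lost)
    return received >= num_source


# 10 source + 2 repair datagrams, and exactly one UDP packet is dropped.
uncoalesced = lost_datagrams(12, per_packet=1, lost_packets={2})
coalesced = lost_datagrams(12, per_packet=4, lost_packets={2})
print(recoverable(10, 2, uncoalesced))  # one symbol lost
print(recoverable(10, 2, coalesced))    # four symbols lost at once
```

With one datagram per packet the single loss is within the repair budget; with four per packet the same single loss exceeds it.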

#316 just lets you send individual OBJECTs as datagrams. It wouldn't be useful unless the application builds a FEC layer on MoQTransport, splitting each OBJECT into 1.2 kB OBJECTs. But you would have no way of preventing spurious retransmissions like you mentioned. It's also a burden for relays, as they're forced to proxy these numerous FEC OBJECTs even over pristine links.

Frankly, I want FEC at the QUIC layer. Any FEC encoding should be applied over the entire connection, not individual frames/objects. I think we could combine this with priorities/hints in MoqTransport to designate important media, without building full-blown FEC into MoqTransport. One possibility: https://datatracker.ietf.org/doc/draft-michel-quic-fec/

englishm commented 10 months ago

Seconding @kixelated's suggestion that we probably want to have FEC at the underlying QUIC layer. Part of the value of MoQ is being able to benefit from these types of enhancements to QUIC itself as it continues to grow and mature.

kixelated commented 10 months ago

Yeah, and just one more point. Disclaimer, I haven't used FEC so I could be talking out of my ass.

FEC is meant to conceal random loss which occurs on a link-by-link basis. The frequency and distribution of random loss is very different if you're on cellular, versus wifi, versus ethernet. All of these protocols have some form of FEC built in, but it can be inadequate, which is why the application wants to augment it.

I think performing FEC hop-by-hop within QUIC makes the most sense. This would support:

- The FEC scheme could be dynamic, based on measured hop-by-hop properties.
- Viewer A could use a separate FEC scheme from viewer B.
- QUIC paths could use different FEC schemes, e.g. a path over a satellite link would be different from a path over ethernet.
- QUIC paths could work in tandem, for example the satellite path could be used to provide parity only.
- A relay wouldn't be forced to use FEC within the backbone.
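As a rough illustration of the first and last points, here is a sketch of choosing the amount of repair data per hop from a measured loss rate. It assumes independent packet loss and an ideal erasure code (any `k` of the `k + r` packets decode), which real links and real codes only approximate; the function names, the `target` residual loss, and `max_repair` are made up for the example.

```python
from math import comb


def residual_loss(k, r, p):
    """Probability that a block of k source + r repair packets cannot be decoded.

    Assumes each packet is lost independently with probability p and that
    decoding fails only when more than r packets are lost.
    """
    n = k + r
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(r + 1, n + 1))


def repair_packets_for_hop(k, p, target=1e-3, max_repair=64):
    """Smallest repair count that keeps residual loss at or below the target."""
    for r in range(max_repair + 1):
        if residual_loss(k, r, p) <= target:
            return r
    return max_repair


# A pristine backbone hop needs no repair packets; a lossy last hop needs some.
print(repair_packets_for_hop(10, 0.0))
print(repair_packets_for_hop(10, 0.05))
```

Bursty or correlated loss breaks the independence assumption, so a real scheme would need a different loss model.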

FEC end-to-end is an option for 1:1 but I just don't think it scales to multiple paths.

rom1v commented 10 months ago

Thank you for your answers, very instructive!

> Frankly, I want FEC at the QUIC layer. Any FEC encoding should be applied over the entire connection, not individual frames/objects. I think we could combine this with priorities/hints in MoqTransport to designate important media, without building full-blown FEC into MoqTransport. One possibility: https://datatracker.ietf.org/doc/draft-michel-quic-fec/

That would be awesome! I was not aware of such work, thank you for the link. :+1:

> as what looks like an individual packet at one layer might be coalesced into a jumbo packet at a lower layer.

I faced a similar problem when I tried to simulate packet loss using iptables, due to UDP segmentation offload (several UDP packets were coalesced into one big UDP packet). However, it was just a local issue, UDP packets were still sent individually "on the wire".

Indeed, if UDP packets are coalesced at a lower layer, FEC becomes useless. If this "lower level" is lower than UDP, it would also impact FEC at the QUIC layer though.

To give some more context, we are developing a prototype for remote gaming, experimenting with several custom protocols over QUIC and WebTransport.

In practice, if we use QUIC/WebTransport streams, it works pretty well, but we may experience stuttering due to packet loss: sometimes the game (mirroring) freezes for a few hundred milliseconds (presumably on packet loss), then catches up.

If instead, we split each frame packet into RaptorQ packets having a size computed from max_datagram_size(), send them over QUIC/WebTransport datagrams, and reconstruct the source packets on the other side, then this avoids the problem in practice.
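The shape of that pipeline can be sketched with a much weaker code than RaptorQ. The example below uses a single XOR parity packet as a stand-in, so it recovers at most one lost packet per frame; it is not RaptorQ and not the prototype's code. `symbol_size` stands for whatever the datagram size limit allows, and the 4-byte length prefix is an assumption made so padding can be stripped.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(frame, symbol_size):
    """Split a frame into fixed-size symbols and append one XOR parity symbol."""
    data = len(frame).to_bytes(4, "big") + frame   # length prefix for unpadding
    data += bytes(-len(data) % symbol_size)        # zero-pad to a whole symbol
    symbols = [data[i:i + symbol_size] for i in range(0, len(data), symbol_size)]
    parity = bytes(symbol_size)
    for s in symbols:
        parity = xor_bytes(parity, s)
    return symbols + [parity]


def decode(received, symbol_size):
    """Rebuild the frame from packets in send order, with None for lost ones.

    Returns None when more than one packet is missing.
    """
    missing = [i for i, s in enumerate(received) if s is None]
    if len(missing) > 1:
        return None
    packets = list(received)
    if missing:
        rebuilt = bytes(symbol_size)
        for s in packets:
            if s is not None:
                rebuilt = xor_bytes(rebuilt, s)
        packets[missing[0]] = rebuilt   # XOR of the rest equals the lost packet
    data = b"".join(packets[:-1])       # drop the parity symbol
    length = int.from_bytes(data[:4], "big")
    return data[4:4 + length]


frame = b"x" * 3000
packets = encode(frame, 1200)
packets[1] = None                       # simulate one lost datagram
assert decode(packets, 1200) == frame
```

In a real sender each datagram would also carry a frame ID and symbol index so the receiver knows which positions are missing.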

> It's also a burden for relays

> FEC end-to-end is an option for 1:1 but I just don't think it scales to multiple paths

Yes, I totally agree. It's not clear to me how it should work with relays.

In addition, if a burst of packet loss is too severe, data may be completely lost without retransmission: this is not good for a relay, which might want to record (without being impacted by packet loss).

But for the 1:1 remote gaming use case, I think FEC packets over datagrams may still be useful.

kixelated commented 10 months ago

> Indeed, if UDP packets are coalesced at a lower layer, FEC becomes useless. If this "lower level" is lower than UDP, it would also impact FEC at the QUIC layer though.

Yeah absolutely. UDP packets aren't necessarily independent nor are the loss events. It's a difficult problem to solve and it's understandable why FEC algorithms can get quite complicated.

> To give some more context, we are developing a prototype for remote gaming, experimenting with several custom protocols over QUIC and WebTransport.

I spent a while talking to JB at the last year about WebTransport and QUIC. I'm glad you're making great progress!

> In practice, if we use QUIC/WebTransport streams, it works pretty well, but we may experience stuttering due to packet loss: sometimes the game (mirroring) freezes for a few hundred milliseconds (presumably on packet loss), then catches up.

Yeah, it's quite difficult to tell random loss apart from queue loss. FEC will help with random loss, but it might actually exacerbate queue loss; depends on how the router decides to shed load.

What congestion control algorithm are you using with QUIC? There are probably some improvements you can make there to avoid these freezes. Ah, you linked the Quinn docs below, so that means New Reno (by default) or BBRv1 (experimental).

New Reno will absolutely cause random freezes like that, either due to bufferbloat or a sudden cwnd reduction. BBRv1 is much better in my experience but can also cause random freezes due to the PROBE_RTT phase. It's also not clear if it's even implemented correctly in Quinn given the experimental flag, and upgrading to BBRv2 or BBRv3 will definitely help live video.
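To illustrate the "sudden cwnd reduction" part, here is a simplified sketch of a Reno-style window cut and its effect on how many round trips one frame needs. The constants follow the values recommended in RFC 9002 (halve the window on loss, floor of two datagrams); the 1200-byte datagram size and the frame and window sizes are assumptions, and the model ignores pacing, window growth, and anything Quinn-specific.

```python
MAX_DATAGRAM_SIZE = 1200                  # assumed datagram size in bytes
MINIMUM_WINDOW = 2 * MAX_DATAGRAM_SIZE    # floor for the congestion window
LOSS_REDUCTION_FACTOR = 0.5               # window is halved on a loss event


def cwnd_after_loss(cwnd):
    """Congestion window (bytes) after a single loss event."""
    return max(int(cwnd * LOSS_REDUCTION_FACTOR), MINIMUM_WINDOW)


def round_trips_to_send(frame_bytes, cwnd):
    """Round trips needed to send a frame if cwnd bytes go out per RTT."""
    return -(-frame_bytes // cwnd)        # ceiling division


cwnd = 60_000
frame = 50_000
print(round_trips_to_send(frame, cwnd))                   # before the loss
print(round_trips_to_send(frame, cwnd_after_loss(cwnd)))  # after the loss
```

A frame that used to fit in one round trip now needs two, which shows up as a stall until the window grows back.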

Oh and check out my recent blog post if you haven't already. We'll need a QUIC extension if you want to use something like GCC to match WebRTC performance.

> If instead, we split each frame packet into RaptorQ packets having a size computed from max_datagram_size(), send them over QUIC/WebTransport datagrams, and reconstruct the source packets on the other side, then this avoids the problem in practice.

FEC can definitely help if you have high RTTs, for example if you have a limited edge deployment. But if that's not the case, then I'm not quite sure why it would help when using Reno or BBR.

Oh and with your QUIC streams, make sure you're prioritizing them otherwise they'll fight for limited bandwidth. It makes a huge difference and is comparable to manually dropping packets via datagrams.

> But for the 1:1 remote gaming use case, I think FEC packets over datagrams may still be useful.

Yeah, FEC does make sense when latency is critical and RTTs are high.
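A back-of-the-envelope sketch of why RTT is the deciding factor. It assumes a symmetric path, that repair data arrives alongside the source packets, that the loss is within what the FEC can repair, and that a retransmission costs at least one extra round trip (loss detection timers would add more). The function name and the numbers are illustrative.

```python
def delivery_delay_ms(rtt_ms, lost, fec):
    """Rough lower bound on when a packet's data is usable at the receiver."""
    one_way = rtt_ms / 2
    if not lost or fec:
        return one_way            # arrives directly, or is rebuilt from repair data
    return one_way + rtt_ms       # receiver must wait for a retransmission


for rtt in (10, 100):
    print(rtt, delivery_delay_ms(rtt, lost=True, fec=False),
          delivery_delay_ms(rtt, lost=True, fec=True))
```

At a 10 ms RTT the retransmission penalty is smaller than one 60 fps frame interval; at 100 ms it spans several frames, which is where paying the FEC byte overhead starts to look worthwhile.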

fluffy commented 10 months ago

I'm a big fan of FEC, but I think it is complicated enough that I was waiting until the basics of MoQ were more stable before jumping into it.

In some cases you might only want FEC on one link, in which case doing it at the QUIC layer might make sense. In other cases only the application really knows what type of FEC would work best for the current application and network environment. There are also cases where you want the applications to be able to do end-to-end FEC without waiting for the relay network to support that form of FEC. And clearly what type of FEC is optimal is highly application dependent.

Right now, I feel like the best experiments could be done by just putting the FEC on separate tracks and having it be application defined. This does have the issue of some video frames not fitting in a single MTU but, for now, an application could deal with splitting up the frame.

francoismichel commented 10 months ago

Hey, I am the author of https://datatracker.ietf.org/doc/draft-michel-quic-fec/ . I am jumping in to let you know that I am in Prague this week, including at the Hackathon if you want to discuss.

I have performed a few FEC experiments with different applications, including low-latency video streaming ones such as GStreamer/FFmpeg, obtaining good results. Everything is in my thesis and in our upcoming papers.

afrind commented 7 months ago

draft-02 added some mechanisms for datagrams. Is that sufficient to close this issue for now and, as fluffy suggested, experiment with FEC at the application layer?

ianswett commented 6 months ago

I feel like this is outside the scope of MOQ Transport, so marking it as such.

martinduke commented 1 month ago

MoQT has datagrams, so I think this is fixed. Closing, please reopen if there are still unmet requirements.

moq-wg / moq-transport

Support datagrams for FEC data? #320