Reconsider support for track-per-stream

vasilvv commented 4 months ago

When we originally added track-per-stream, I was skeptical of it being useful, as it's basically equivalent to sending media over TCP, which is a poor fit for latency-sensitive applications (and something that people already can do equally well using other widely-adopted protocols).

Now that we have some further experience with this, the drawbacks of supporting track-per-stream have become more apparent:

1. There's a non-trivial amount of code required to support each individual stream mapping.
2. There's a non-trivial amount of design effort and specification text required to support each individual stream mapping.
3. Due to its nature, track-per-stream has its unique design challenges that other mappings don't have (if anything goes wrong, the entire subscription has to be thrown out, making things like timeouts hard to define).

In the narrow use cases where stream-per-track is actually useful, the application can use stream-per-group, and reassemble things itself. Thus, I'm strongly in favor of removing it, at least as a publisher-selected stream mapping (we can discuss what FETCH does when we get back to that).

TimEvens commented 4 months ago

@vasilvv ,

Important: I'm not in favor of removing stream-per-track, for the following reasons.

Based on our implementation, the amount of code/effort to support stream-per-track is the SAME as for stream-per-group. We don't duplicate the code; the same code that is used for per-group is used for per-track. Likewise, datagram and stream-per-object are very similar and use the same internal state tracking and much of the same code.

I 100% agree with you about item 3, and I brought this up with some folks months ago regarding MOQT's lack of handling for stream transitions due to errors or otherwise.

Please correct me if I'm wrong, but I believe your statement above could be summarized as:

"Long-lived streams, such as bidirectional streams, streams per-track, and streams per-group that have long sequences, could suffer from various state problems or malformations, causing the stream to be invalidated. It's too complicated to solve this problem, so we instead use stream per-group or per-object to contain any errors within a given stream. If there is a malformation or some state-related problem with the current stream, all objects in the group or data in the object will be lost and NEW data will resume with the next group/object."

We have an implementation that works at scale to support "track" recovery/mitigation, regardless of the reason the stream becomes invalidated. It's important to call out that the control stream also suffers from the same challenges with invalidated streams as per-track. Is it really okay in MOQT that an aggregated QUIC relay-to-relay peering connection forwarding millions of tracks closes/terminates and stops every downstream subscriber because of a control bidir stream error? We know that the control stream can be easily recovered, so why is MOQT not supporting that? Likewise, why couldn't a per-track or long-sequence group be recovered using a new stream in MOQT?

A track is an overlay that can be depicted as a line from publisher to each subscriber. QUIC streams are per-hop and are NOT end-to-end via relay(s) in MOQT. The stream used or number of streams used to deliver data via the track overlay should not suffer from a single hop that has a problem. A more robust infrastructure would have mitigation and recovery per-hop to limit the amount of churn experienced by the unaffected.

Regardless, the challenges of invalidated streams mentioned above and in (3) are realized not only with per-track, but also with long-sequence per-group, the control stream (this really should have been called control track or channel to reduce confusion with QUIC streams), and future drafts in MOQ that add bidir stream tracks.

kixelated commented 4 months ago

+1 This mode is a constant edge case that is susceptible to congestion. We shouldn't encourage RTMP-like behavior since the alternative (stream per group) is pretty straightforward.

fluffy commented 4 months ago

At the time we decided this, some people were adamant about it, and I seem to recall many of the use cases did not involve audio or video. I'd rather defer changing things we already decided on until we get more of the rest of the draft done and a bit more implementation experience. We have not yet seen what starts happening as we run out of streams.

I guess I am just saying I would rather leave this alone for the next 6 months and work on more critical stuff.

martinduke commented 3 months ago

One point from @TimEvens:

> A track is an overlay that can be depicted as a line from publisher to each subscriber. QUIC streams are per-hop and are NOT end-to-end via relay(s) in MOQT. The stream used or number of streams used to deliver data via the track overlay should not suffer from a single hop that has a problem. A more robust infrastructure would have mitigation and recovery per-hop to limit the amount of churn experienced by the unaffected.

But in effect, they are end to end, because Forwarding Preference is a track property. If there is any stream-constrained hop in the path, everyone has to use stream-per-track to mitigate the problem. But this decision is made at the original publisher with no a priori knowledge of any such constraints.

fluffy commented 2 months ago

I asked around more for use cases and did not get any that could not easily be done with HTTP. I really don't care if we get rid of these or keep stream-per-track.

ianswett commented 2 months ago

The only compelling uses for sending on a single stream I've heard are conserving streams and creating flow control backpressure when a large amount of data is available to send all at once.

Now that we have peeps and they have defined properties, I believe a publisher could decide to put multiple peeps on a stream for a single subscription, particularly for fetch-type use cases, so I wrote #516.

afrind commented 2 months ago

Individual Comment:

> The only compelling uses for sending on a single stream I've heard are conserving streams and creating flow control backpressure when a large amount of data is available to send all at once.

I'm still in favor of having some kind of flow-control backpressure in moq, but I don't think putting everything on a single stream in order to re-use QUIC's mechanism directly is a great way to accomplish that goal.

Chair Comment:

The strongest argument to have stream-per-track seems to be "it's not that hard". Proponents of stream-per-track should articulate how stream-per-group with a single group, or "something like HTTP GET" are not going to meet their requirements. We can put a call out on the list if needed, and remove this until a true use case emerges.

TimEvens commented 2 months ago

@ianswett, not sure you would consider the below "compelling" or not...

It's important to call out that HoL blocking IS A FEATURE for use cases that require this behavior. This mimics TCP flow and serves use cases that need it, such as all transactional use cases. For example: ledgers, state sync, replication, IP overlays (e.g., VPN), etc.

TCP and QUIC streams provide a guarantee that data will be transmitted in the FIFO order in which the publisher sends it, regardless of whether the publisher actually ordered it correctly by group or object ID. The FIFO order received can be replicated (e.g., relayed) to subscribers in exactly the order received, with a high guarantee of the same FIFO order without gaps or having to reorder.

This cannot be said for per-object or per-group, considering both of them use multiple streams and more than one stream can be in flight. For example, group ID 10 was written and is complete but is stuck in HoL blocking, yet the application continues on to start group ID 11 because there is no ACK that the group 10 end object was consumed by the relay/server. In this case, it's probable that group 11 is received and processed before the end of group 10. Per-object suffers from this as well, but it is worse in that several objects could be in flight. For example, objects 10, 12, 13, and 15 are dealing with retransmissions, while objects 11 and 14 were received before the others.

The relay and applications could of course buffer and reorder, but there are performance impacts to implementing that. It also complicates things because, unless #358 adds a MUST for object and group IDs to increment unitarily (serial increment by 1 from the previous), the relay/server/receiving subscriber would have no idea what the next group or object ID would be. Therefore, gap detection of group or object IDs is not feasible in the current state of MOQT. All that can be detected is that there is a delta between them, but that may or may not be normal. In this case the relay/server/subscriber does not know that something will arrive late that would fill in the delta. If #358 is added, then it would be possible to detect the gap and to buffer/wait for the gap to be filled, but then that's reinventing the HoL blocking that a QUIC stream (per-track) provided.
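To make the buffering cost concrete, here is a minimal Python sketch of the reorder buffer a relay or subscriber would need on top of per-object or per-group delivery. It assumes the unitary-increment rule discussed for #358 (each ID is exactly the previous ID plus 1); `ReorderBuffer`, `push`, and `missing` are hypothetical names for illustration, not anything defined by MOQT.

```python
from typing import Dict, List, Tuple


class ReorderBuffer:
    """Releases objects in ID order, assuming IDs increase by exactly 1."""

    def __init__(self, first_id: int = 0) -> None:
        self.next_id = first_id              # next ID allowed to be delivered
        self.pending: Dict[int, bytes] = {}  # objects that arrived early

    def push(self, object_id: int, payload: bytes) -> List[Tuple[int, bytes]]:
        """Store one arrival and return every object that is now deliverable."""
        if object_id < self.next_id:
            return []  # duplicate of something already delivered
        self.pending[object_id] = payload
        ready = []
        while self.next_id in self.pending:
            ready.append((self.next_id, self.pending.pop(self.next_id)))
            self.next_id += 1
        return ready

    def missing(self) -> List[int]:
        """IDs known to be outstanding; only meaningful with unitary IDs."""
        if not self.pending:
            return []
        highest = max(self.pending)
        return [i for i in range(self.next_id, highest) if i not in self.pending]
```

With the arrival order from the example above (11 and 14 first), `missing()` reports 10, 12, and 13 as outstanding, and nothing is released until 10 arrives. Without the unitary constraint, the receiver could not tell whether 12 and 13 exist at all, so `missing()` would have no basis and the buffer could not know when it is safe to release 14.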

It's also important to call out that groups within per-track are useful and would likely be used by transactional use-cases, such as state sync protocols and others.

An example use case, but not at all limited to this use-case, that would require per-track is:

User Story

I'm a developer of a BGP route controller and I need to maintain routing information bases (RIBs) of thousands of routers. I use the BGP Monitoring Protocol (BMP), specifically BMP Local RIB, to get the live local router RIB. BMP conveys raw BGP-encoded messages, which get parsed and converted to JSON by globally distributed BMP collectors. BMP (and BGP) doesn't provide a method to handle partial sync updates to fill in the gaps of the RIB. In other words, unless you receive the initial RIB dump at the start of the state session (e.g., TCP connection start) and every update sent after that, the receiver has no way of knowing the full RIB.

I am looking to use MoQ in each of my BMP collectors to send the JSON data to one of many route controllers. I want to update my global edge of BMP collectors to use MoQ pub/sub so that I can fan out to multiple route controllers, but I MUST have complete FIFO order of messages, as the change from the previous value to the current one MUST be tracked/logged in order to establish network layer reachability information (NLRI) state for a given point in time (including live). Point-in-time states are a very important use case that I must support in the route collector so that operators, and now AI, can detect and mitigate problems. BMP does not convey partial diff updates, and it doesn't provide a method to rebuild the RIB without a full state sync dump, which is intrusive for the router to perform just to sync one downstream route controller.

For example, the initial RIB can be very large, with several million NLRIs conveyed (e.g., internet routes and VPN routes for all customers). This is referred to as the RIB dump (aka initial full state sync). I plan to use a group by itself to represent the RIB dump. I then plan to use the following groups to represent change, where the first object of each group provides a summary diff of changes from the previous. The RIB collector, as a subscriber, can then maintain state incrementally by group first-object diffs, or by first-object diffs with live updates. The process for a fast sync of a RIB collector is to subscribe starting at the RIB dump and then process only the group diffs and not each individual update in those groups. When it's caught up, it will then process live incremental state change objects for the current group. For this to work, FIFO order of messages is required, which is how it works today with TCP. A previous group and/or object cannot be received out of order, that is, arrive late.
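The fast-sync process described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the object encoding (a mapping of prefix to next hop, with `None` meaning a withdrawal) is hypothetical, as are the names `apply_change` and `fast_sync`, and the sketch reads "summary diff of changes from the previous" as meaning that object 0 of each later group summarizes the net changes made during the previous group.

```python
from typing import Dict, List, Optional

# One object is modeled as a set of route changes: prefix -> next hop,
# with None meaning the prefix was withdrawn. This encoding is hypothetical.
Change = Dict[str, Optional[str]]
Group = List[Change]


def apply_change(rib: Dict[str, str], change: Change) -> None:
    """Apply one object's changes to the RIB in place."""
    for prefix, next_hop in change.items():
        if next_hop is None:
            rib.pop(prefix, None)
        else:
            rib[prefix] = next_hop


def fast_sync(groups: List[Group]) -> Dict[str, str]:
    """Rebuild the RIB from the dump, the group summaries, and live updates.

    Assumes groups and objects arrive complete and in FIFO order, that
    groups[0] is the RIB dump, and that object 0 of each later group
    summarizes the net changes made during the previous group.
    """
    rib: Dict[str, str] = {}
    for obj in groups[0]:          # full RIB dump
        apply_change(rib, obj)
    for group in groups[1:]:       # one summary diff per later group
        if group:
            apply_change(rib, group[0])
    if len(groups) > 1:
        for obj in groups[-1][1:]:  # live updates in the current group
            apply_change(rib, obj)
    return rib
```

The sketch only produces a correct RIB if every summary and live update is applied in the order sent. A summary that arrives after live updates from a newer group would overwrite newer state, which is the ordering requirement the user story relies on.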

The above is just one example of a real-world use case; many others exist, but it would get long to draft them all here. Regardless, it should be pretty obvious that per-object would never work for this use case. One might believe the solution to the above use case is to use a single group, but that won't work, as the user/developer needs the ability to segment data into groups yet have all the benefits of TCP HoL FIFO sending and receiving. In other words, the user is asking for the same overall behavior as TCP but with the ability to use groups and objects as outlined in MOQT. Per-track offers this.

afrind commented 2 months ago

Individual Comment:

@TimEvens I buy that stream-per-group with 1 group robs the application of a surface that it might have wanted to use. I think we need to weigh this against having only a single mechanism to use streams in moq (per peep). An application that wants to enforce in-order delivery can also do this on top of a multi-stream delivery in moqt. I see your point that a relay cannot do this without a unitary increase constraint for group and object IDs, but I'm not sure how critical that is? Maybe it's better to let the data flow to the end consumer and be reassembled there.

TimEvens commented 2 months ago

@afrind ,

> An application that wants to enforce in-order delivery can also do this on top of a multi-stream delivery in moqt.

I don't see how that's possible unless the application buffers and reorders, which requires the application to know how to detect gaps and what is missing. This puts a lot more burden on the application that was solved with per-track (and TCP) inherently.

> Maybe it's better to let the data flow to the end consumer and be reassembled there.

This puts a lot more burden on the application that was solved with per-track (and TCP) without having to do anything extra. Seems to me that removing per-track weakens the usefulness of MoQT and puts more coding/development requirements on the application to have to detect/buffer/reorder when it could have used a simple QUIC stream for transmission that did that already.

TimEvens commented 2 months ago

@vasilvv,

> Due to its nature, track-per-stream has its unique design challenges that other mappings don't have (if anything goes wrong, the entire subscription has to be thrown out, making things like timeouts hard to define).

It is a fault of MoQT to suggest that because one stream goes wrong, the subscription needs to be thrown out. As I mentioned previously (and didn't see any responses to), at least for per-track and the control stream, which is a long-lived stream just like per-track, we should use simple recovery of that track/subscription. We already support this. If the stream goes bad, control or per-track, a small amount of code can rectify it with zero loss by creating a new stream and resending the object that was in flight and incomplete.
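A minimal Python sketch of that recovery idea on a single hop: when the current stream is invalidated, open a new one and resend the object that was in flight. `FakeStream`, `StreamError`, and `TrackSender` are hypothetical stand-ins for illustration, not a real QUIC or MOQT API, and the sketch assumes an object is either fully written or not written at all, so the receiver never has to discard a partial object.

```python
from typing import Callable, List, Optional


class StreamError(Exception):
    """Stands in for a reset or otherwise invalidated stream."""


class FakeStream:
    """Hypothetical in-memory stream; fails after `fail_after` writes if set."""

    def __init__(self, fail_after: Optional[int] = None) -> None:
        self.received: List[bytes] = []
        self.fail_after = fail_after

    def write(self, obj: bytes) -> None:
        if self.fail_after is not None and len(self.received) >= self.fail_after:
            raise StreamError("stream invalidated")
        self.received.append(obj)


class TrackSender:
    """Keeps a track alive across stream failures on one hop."""

    def __init__(self, open_stream: Callable[[], FakeStream]) -> None:
        self.open_stream = open_stream
        self.stream = open_stream()
        self.streams_used = 1

    def send(self, obj: bytes, max_retries: int = 3) -> None:
        for _ in range(max_retries + 1):
            try:
                self.stream.write(obj)
                return
            except StreamError:
                # Replace the stream; the loop then retries the in-flight object.
                self.stream = self.open_stream()
                self.streams_used += 1
        raise StreamError("giving up after repeated stream failures")
```

In this sketch the subscription survives the failure: objects sent before the error stay on the old stream, and the interrupted object plus everything after it goes out on the replacement. A real implementation would also need the receiver to associate the new stream with the same track and to drop any partially received object.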

TimEvens commented 2 months ago

In consideration of https://github.com/moq-wg/moq-transport/pull/494... I read the text for peeps as allowing a single "peep" (which is another way of saying peep ID==quic stream) to contain multiple groups and objects as they did with per-track. If this is correct, then per-track wouldn't be needed. Instead, I believe peeps needs more documentation on uses that describe some of the per-group, per-group split over streams, and per-track equivalency.

afrind commented 2 months ago

A peep cannot span more than one group.

> A peep is a sequence of one or more objects from the same group


Reconsider support for track-per-stream #480
