moq-wg / moq-transport

draft-ietf-moq-transport

Handling subgroup discontinuity at a relay #610

Open afrind opened 3 weeks ago

afrind commented 3 weeks ago

This came from the comments on #587

The context is a track that has a subgroup with only even-numbered objects, e.g. 2, 4, 6, 8.

Sending 2, 4, 8 is illegal: the objects are in order, but one in the middle is omitted.

If a relay gets a subgroup with objects 2, 4, 6, 8, I expect it can send one of the following:

Nothing
2
2, 4
2, 4, 6
2, 4, 6, 8
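
To make the rule above concrete, here is a minimal sketch that treats "legal to forward" as "is a prefix of what was received upstream". The function name `is_valid_forward` is hypothetical and the check is only an illustration of the list above, not text from the draft.

```python
def is_valid_forward(upstream, forwarded):
    """Return True if `forwarded` is a prefix of `upstream` (hypothetical helper)."""
    # A relay may stop early, but it may not skip an object in the middle.
    return list(forwarded) == list(upstream[:len(forwarded)])


subgroup = [2, 4, 6, 8]
print(is_valid_forward(subgroup, []))         # True  (send nothing)
print(is_valid_forward(subgroup, [2, 4]))     # True  (truncated, no gap)
print(is_valid_forward(subgroup, [2, 4, 8]))  # False (6 omitted in the middle)
```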

However, consider this case:

          /--- relay 1 ---\
Publisher                  relay3 ---> Subscriber 
          \--- relay 2 ---/

For a particular track/group/subgroup (eg: group=77, subgroup=11) relay3 gets the following from relay1:

Control Stream: <GOAWAY>
Subgroup Stream: 2, 4<RELIABLE_RESET>

as well as a new ANNOUNCE from relay2.

relay3 starts making a new subscription to relay2, but doesn't close anything downstream.

relay3 issues SUBSCRIBE(latest object) to relay2, and gets a stream with just 8 for group=77 subgroup=11. If it gets any non-sequential object (even 6), it has no way to know whether it missed any objects.

Would the relay be allowed to send 2, 4, 8 on the subgroup stream downstream? If not, then what is it supposed to do?
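
The ambiguity can be sketched as follows. This is only an illustration: `consistent` is a hypothetical helper, and it assumes relay3 knows nothing about the subgroup beyond the object IDs it received. Both a subgroup with a gap and one without are consistent with what relay3 saw, so the IDs alone cannot tell it whether 2, 4, 8 omits an object.

```python
def consistent(full_subgroup, from_relay1, from_relay2):
    """True if the observations could have come from `full_subgroup`.

    from_relay1 is what arrived before the reset, so it must be a prefix.
    from_relay2 arrived on the new subscription, so its objects must
    appear somewhere after that prefix.
    """
    n = len(from_relay1)
    prefix_ok = full_subgroup[:n] == from_relay1
    rest = full_subgroup[n:]
    return prefix_ok and all(obj in rest for obj in from_relay2)


seen1, seen2 = [2, 4], [8]
print(consistent([2, 4, 8], seen1, seen2))     # True: nothing was missed
print(consistent([2, 4, 6, 8], seen1, seen2))  # True: object 6 was missed
```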

TimEvens commented 2 weeks ago

Your scenario is pretty specific to the mobility/resume use case, where the publisher moves from relay 1 to relay 2. Is that correct?

Mobility/resume/reuse does need to be handled, which I don't see in the current MoQT draft. Effort is needed to define how to resume and properly handle GOAWAY to migrate/transition existing tracks from one to another.

Regarding peering between relays, such as relay 1 and relay 2 to relay 3... fetch should not be used by relays unless the subscriber requested it. IMO, fetch should be focused on subscriber specific uses and not be used to globally sync/replicate caches. If we want to define a cache sync protocol, then that's a different discussion... for example, we really need to look at distributed centralized caches that support relays that are behind load balancers and anycast (load balancers).

As I would implement the above, the following would happen:

  1. Publisher would make a new connection to relay 2
  2. Publisher would undergo the normal connect/subscribe/announce states for the tracks.
  3. If the publisher does not disconnect (unsubscribe, unannounce) from relay 1, it will appear as two publishers for the same track(s). In other words, a multiple-publisher scenario.
  4. Assuming the publisher is not sending duplicate data to both relay 1 and relay 2, the publisher would send only new data to relay 2. In this sense it is a resume, and the group would start at the next sequence.
  5. Assuming peering is established between relay 2 and relay 3, the data would be forwarded based on FIFO (publisher in to peer out) order. In this case, publisher wouldn't start over with old data, it would simply pick up where it left off, so subgroup 12.

Upon a new subscription and publication to relay 2, as in this scenario, the publisher is required to start the stream as it normally would for the start of a group/subgroup. While the publisher could pick up where it left off in subgroup 11, it really should have started on subgroup 12 or group 78. If it didn't, it would still work, but it would be a bit messy until the next group/subgroup.
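
As a rough sketch of the suggestion above, the two clean resume points for the example (group=77, subgroup=11) can be computed like this. The helper `resume_points` and its `first_subgroup` default of 0 are assumptions made for illustration; nothing here is defined by MoQT.

```python
def resume_points(group, subgroup, first_subgroup=0):
    """Return the two clean places to resume after moving to a new relay.

    `first_subgroup` is an assumed ID for the first subgroup of a new group.
    """
    next_subgroup = (group, subgroup + 1)   # same group, fresh subgroup
    next_group = (group + 1, first_subgroup)  # fresh group
    return [next_subgroup, next_group]


print(resume_points(77, 11))  # [(77, 12), (78, 0)]
```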

Lossless methods for transitioning/migrating existing/live tracks between relays are possible, but they require specific handling that is not defined in MoQT.