Tighten definition of ObjectID and GroupID

wilaw commented 9 months ago

Current prose should be updated to indicate that:

1. ObjectID and GroupID are integers. The Object header implicitly defines this, but it should be stated explicitly.
2. ObjectID MUST only increment by 1.
3. The GroupID MUST always increase.
4. The GroupID may start at any value >= 0.
5. The GroupID can increase by any value >= 1.

fluffy commented 9 months ago

So I think this needs to be clear that this is for objects inside a given QUIC stream and does not apply across streams.

I would say on point 1, they are variable-length unsigned integers.

On #2, do we have a reason not to allow the object ID to increase by more than one?
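
For readers unfamiliar with the "variable-length unsigned integer" mentioned above, here is a minimal sketch, assuming it refers to the QUIC variable-length integer encoding from RFC 9000, Section 16. The function names are illustrative and not taken from any MoQT implementation.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer with a 2-bit length prefix (1, 2, 4, or 8 bytes)."""
    if value < 0:
        raise ValueError("varints are unsigned")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | (0b01 << 14)).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | (0b10 << 30)).to_bytes(4, "big")
    if value < 1 << 62:
        return (value | (0b11 << 62)).to_bytes(8, "big")
    raise ValueError("value exceeds 2**62 - 1")


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes_consumed) for the varint at the start of data."""
    length = 1 << (data[0] >> 6)           # top two bits select 1, 2, 4, or 8 bytes
    value = int.from_bytes(data[:length], "big")
    value &= (1 << (8 * length - 2)) - 1   # clear the two prefix bits
    return value, length
```

Under this encoding any value from 0 to 2**62 - 1 is representable, which is why "may start at any value >= 0" follows from the wire format alone.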

suhasHere commented 9 months ago

On 2, I don't see a need for us to enforce a monotonic unitary sequence for objectIds either. If the application needs it for certain purposes, it will generate the objectIds to match that pattern.

I would just say GroupId and ObjectId need to be increasing and SHOULD NOT be duplicated. (Fine to not even have the latter part.)

vitaly-castLabs commented 9 months ago

Wouldn't it be beneficial to have ObjectIDs incrementing strictly by 1 in order to detect dropped objects? It's an extremely common scenario in video decoding: run into a gap in media and stop decoding until a new key frame arrives. Are there any real-life scenarios when arbitrary increments might be necessary?
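
The gap-detection scenario described above can be sketched as follows. This is only an illustration of what a receiver could do if object IDs were required to start at 0 and step by 1; `find_gaps` is a hypothetical helper, not part of any MoQT API.

```python
from typing import Iterable, List, Tuple


def find_gaps(object_ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) pairs.

    Assumes object IDs within a group start at 0 and increase by exactly 1,
    so any jump in the received sequence means objects were dropped.
    """
    gaps = []
    expected = 0
    for oid in object_ids:
        if oid > expected:
            gaps.append((expected, oid - 1))
        expected = max(expected, oid + 1)
    return gaps


# A decoder could stop and wait for the next key frame when this is non-empty.
print(find_gaps([0, 1, 2, 5, 6]))  # [(3, 4)]
```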

suhasHere commented 9 months ago

It's application-defined. Even with today's video applications, one can have B-frames sent over a different track or a different stream encoding, thus causing gaps within a group. Only the application is aware of that end to end, and MoQT shouldn't enforce application rules.

martinduke commented 3 months ago

Would ObjectID have to start at zero?

martinduke commented 3 months ago

Will discuss this in Boston.

kixelated commented 3 months ago

Now that there's an object status code of 0x1 to signal gaps, object ID MUST start at 0 and increment by 1.

fluffy commented 2 months ago

The more I think about it, the less I see why the moq layer needs any restrictions on any of these. Yes, an application like warp on top might define how they are used, but the behavior of things at the moq level does not seem to require constraints on any of this. I guess I am trying to understand what problem we are trying to solve?

martinduke commented 2 months ago

It makes it quite a bit easier to implement. If I'm doing stream-per-group, I know that object ID 0 starts a group and a stream.

afrind commented 2 months ago

Individual Comment:

These bullets seem like editorial clarifications that are implicit:

ObjectID and GroupID are integers. The Object header implicitly defines this, but it should be stated explicitly.

We've already agreed, at least implicitly, that group IDs and object IDs are numeric and comparing them numerically has meaning to a relay. References to cases where they are compared this way exist throughout the draft (eg SUBSCRIBE, object streams, etc). We should make this explicit.

The GroupID may start at any value >= 0

This is already true by definition (it's a varint) so this is just a clarification, and seems fine.

Let's go ahead and make a PR that addresses these?

ObjectID MUST only increment by 1

There's a use case listed here which is implicit gap detection, but with the introduction of peeps it no longer works. If I have a peep with objects 0, 2, 4, I cannot assume that 1 and 3 are dropped -- they might just be in another peep. A status of "your object is in another peep" seems like overkill. If an application wants the unitary increase restriction, it may need to go into a streaming format spec rather than moqt?

The GroupID MUST always increase.

What are you trying to accomplish here?

The GroupID can increase by any value >= 1.

This seems like there is no restriction at all (which would be a clarification). Does anyone want to advocate for group ID unitary increases?
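
The peep example above (objects 0, 2, 4 in one peep) can be illustrated with a small sketch. The peep names and the split are hypothetical; the point is only that a per-peep "start at 0, step by 1" check reports objects as missing even though the group as a whole is complete.

```python
from typing import Dict, List, Set


def apparently_missing(received: List[int]) -> Set[int]:
    """IDs that look dropped if IDs are assumed to start at 0 and step by 1."""
    if not received:
        return set()
    return set(range(max(received) + 1)) - set(received)


# Hypothetical group whose objects are split across two peeps.
peeps: Dict[str, List[int]] = {"peep_a": [0, 2, 4], "peep_b": [1, 3]}

per_peep = {name: apparently_missing(ids) for name, ids in peeps.items()}
whole_group = apparently_missing([oid for ids in peeps.values() for oid in ids])

print(per_peep)     # {'peep_a': {1, 3}, 'peep_b': {0, 2}}
print(whole_group)  # set()
```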

suhasHere commented 2 months ago

There's a use case listed here which is implicit gap detection, but with the introduction of peeps it no longer works. If I have a peep with objects 0, 2, 4, I cannot assume that 1 and 3 are dropped -- they might just be in another peep. A status of "your object is in another peep" seems like overkill. If an application wants the unitary increase restriction, it may need to go into a streaming format spec rather than moqt?

[Suhas] +1. I really want to avoid overburdening relays with understanding gaps and using that to do something. Given that they don't have access to the catalog and only end applications know the distribution, I want to be careful about adding expectations to the relays and prefer leaving it to the streaming format.

Individual Comment:

These bullets seem like editorial clarifications that are implicit:

ObjectID and GroupID are integers. The Object header implicitly defines this, but it should be stated explicitly.

We've already agreed, at least implicitly, that group IDs and object IDs are numeric and comparing them numerically has meaning to a relay. References to cases where they are compared this way exist throughout the draft (eg SUBSCRIBE, object streams, etc). We should make this explicit.

[Suhas] These should be comparable for the relays to make filtering choices.

The GroupID may start at any value >= 0

This is already true by definition (it's a varint) so this is just a clarification, and seems fine.

Let's go ahead and make a PR that addresses these?

ObjectID MUST only increment by 1

There's a use case listed here which is implicit gap detection, but with the introduction of peeps it no longer works. If I have a peep with objects 0, 2, 4, I cannot assume that 1 and 3 are dropped -- they might just be in another peep. A status of "your object is in another peep" seems like overkill. If an application wants the unitary increase restriction, it may need to go into a streaming format spec rather than moqt?

The GroupID MUST always increase.

What are you trying to accomplish here?

The GroupID can increase by any value >= 1.

This seems like there is no restriction at all (which would be a clarification). Does anyone want to advocate for group ID unitary increases?

[Suhas] There are several use cases where unitary increase is not always the best way for the applications. Yes, if an application wants to enforce a unitary increasing groupId, it will generate the groupIDs as such. I don't think the pub/sub layer needs to enforce it.

We need to follow a simple rule: "Don't burn application logic and semantics into the relays". Video applications of tomorrow will be different from the ones of today (due to encoder advancements or ML, for example), and all of these will be different from chat, which is different from telemetry, which is different from a 3D game state, and so on.

wilaw commented 2 months ago

Does anyone want to advocate for group ID unitary increases?

Yes - for simplified debugging. While I acknowledge Suhas's comment that future applications may be different from the ones of today and that applications can enforce numbering constraints, the networks over which MOQT is distributed are going to be complex, along with the caches that store the objects and the logic for routing them. Having a predictable numbering scheme simplifies the task for the humans who must build, manage, debug and maintain these distribution networks. I would instead ask a different question - which future apps couldn't be built around a unitary increasing Group ID?

afrind commented 2 months ago

@wilaw Now I'm confused - you filed this issue asking for:

The GroupID can increase by any value >= 1.

And in your last comment, you are advocating for "The GroupID can only increase by 1". Which one is it?

To answer your last question,

which future apps couldn't be built around a unitary increasing Group ID?

The moq logging and moq metrics drafts use a unix timestamp as a group ID (eg: https://datatracker.ietf.org/doc/draft-jennings-moq-metrics/). This makes it possible to simply subscribe at a point in time. If moqt requires sequential group IDs, then those drafts would need a different solution (eg: a timeline track).
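
A rough sketch of the timestamp-as-group-ID idea, assuming one group per fixed-length time bucket. The bucketing rule and the name `group_id_for` are assumptions made for illustration and may not match what the metrics draft actually specifies.

```python
def group_id_for(unix_seconds: float, group_duration_s: int = 1) -> int:
    """Map a wall-clock time to the group ID of the bucket containing it.

    Assumption: group IDs are Unix timestamps rounded down to a multiple
    of group_duration_s. This is a guess at the scheme, not the draft's text.
    """
    return int(unix_seconds) // group_duration_s * group_duration_s


# A subscriber that wants data "from time T" can compute the start group directly.
start_group = group_id_for(1_700_000_007, group_duration_s=10)
print(start_group)  # 1700000000
```

With such a scheme consecutive group IDs differ by the bucket length (or more, if nothing was published in a bucket), so they cannot be sequential.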

wilaw commented 2 months ago

And in your last comment, you are advocating for "The GroupID can only increase by 1". Which one is it?

@afrind - I filed this issue acting as a scribe for our group discussion during our interim in Denver in Feb. It reflects the group consensus at the time. My personal opinion, expressed via https://github.com/moq-wg/moq-transport/issues/358#issuecomment-2328504741 , is that group ID should be a unitary increase, for the reason stated.

Regarding the use case of the metrics drafts using the groupID to express a timestamp: I feel the timestamp is an application-level value that should not be using a transport-level ID for conveyance. There should be a clean decoupling between application data (which is stored inside the object payload) and the transport identifiers (groupID and objectID). The transport IDs should be designed to make the carriage, debugging and maintenance of the transport as efficient as possible. There are several other ways to handle random temporal access with metrics (such as templates, or a timeline track as mentioned). It seems undesirable to burden all implementations of moqt with the complexity of sparse groupIDs for the convenience of one application, especially when there are other reasonable means for that application to achieve its goals.

gwendalsimon commented 2 months ago

The main motivation for unitary increment of GroupID is cache management. If GroupID increments by random numbers, then a relay cannot use what it has in cache. Indeed, it has no clue whether the data it has stored is a continuous series of groups or has gaps due to discontinuous subscriptions. Example: A relay receives a subscription for a live stream and caches consecutive non-unitary group IDs 12-14-17-39. Then this client unsubscribes. The same relay receives another subscription for the same live stream and caches consecutive group IDs 43-48-72. The same relay receives a subscription Absolute Range 14-48. When looking at its cache, it can only see random numbers in there. By default, it will forward the full subscription message to the publisher. The cache is useless.

I understand that there exist some use cases in which it is strongly believed that non-unitary groupID is the best setting. OK. So, as a CDN developer, let me suggest another option: Include a Boolean flag in the track, which states whether the track contains unitary incrementing GroupID or not. If not, then the relay would not cache the data.
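
The proposed per-track flag could drive a relay decision roughly like the sketch below. The `unitary` parameter stands in for the suggested Boolean flag; neither it nor `can_serve_from_cache` exists in the draft, and a real relay would also have to deal with eviction and partially cached groups.

```python
from typing import Set


def can_serve_from_cache(cached: Set[int], start: int, end: int, unitary: bool) -> bool:
    """True if the relay can answer the range [start, end] without going upstream."""
    if not unitary:
        # Absent IDs might be real gaps in the track or simply not cached;
        # the relay cannot tell, so it forwards the subscription.
        return False
    return all(group_id in cached for group_id in range(start, end + 1))


# Non-unitary track from the example: the cache cannot be trusted.
print(can_serve_from_cache({12, 14, 17, 39, 43, 48, 72}, 14, 48, unitary=False))  # False
# Unitary track with every group present: serve locally.
print(can_serve_from_cache(set(range(13, 21)), 13, 20, unitary=True))  # True
```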

suhasHere commented 2 months ago

The same relay receives a subscription Absolute Range 14-48. When looking at its cache, it can only see random numbers in there. By default, it will forward the full subscription message to the publisher. The cache is useless.

Leaving the group numbering scheme aside: when the relay receives range 14-48, it will return 14, 17, 39, 43, 48. The end subscriber is aware of the group distribution, so for it the group numbers are not random. If it sees there is a gap that shouldn't have existed, it can always ask for the missing groups, and caches will get populated accordingly. There is also a way to have a few groups marked as never existed, and that can also be used to decide whether to ask upstream or not.

vitaly-castLabs commented 2 months ago

I bet 99.9% of implementations will assume and only support/test unitary increase; adding logic to handle gaps will make it a lot messier and more bug-prone.

wilaw commented 2 months ago

Leaving the group numbering scheme aside. When the relay receives range14-48, it will return 14, 17, 39, 43, 48.

But it won't. A server can't return only what was cached by prior requests. There may be no prior requests, or maybe only 17 was requested, or the cache is volatile and objects got evicted. Knowing nothing about the structure of the range, the server has no choice but to make one or more upstream requests. It can make an upstream request for 14-48 (which will result in duplicate objects being sent), or it has to make requests for 15-16, 18-38, 40-42, 44-47.

As another illustrative example, consider a publisher which only uses odd group IDs. If a server had previously served the content (resulting in 1, 3, 5 ... 2001 in its cache), a second subscription for 1 - 2001 would result in the server having to make 1,000 forward requests (2, 4, 6 ... 2000) to try to fill in the gaps. That is not practical.
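
The cost of gap-filling in the two examples above can be checked with a short sketch. `missing_ranges` is a hypothetical helper that lists the upstream requests a relay would need if it treated every uncached ID in the requested range as potentially existing.

```python
from typing import Iterable, List, Tuple


def missing_ranges(cached: Iterable[int], start: int, end: int) -> List[Tuple[int, int]]:
    """Inclusive (first, last) ranges inside [start, end] that are not cached."""
    gaps = []
    cursor = start
    for group_id in sorted(g for g in set(cached) if start <= g <= end):
        if group_id > cursor:
            gaps.append((cursor, group_id - 1))
        cursor = group_id + 1
    if cursor <= end:
        gaps.append((cursor, end))
    return gaps


# Cache holds 14, 17, 39, 43, 48 and a subscriber asks for 14-48.
print(missing_ranges([14, 17, 39, 43, 48], 14, 48))
# [(15, 16), (18, 38), (40, 42), (44, 47)]

# Publisher that only uses odd group IDs, subscriber asks for 1-2001.
print(len(missing_ranges(range(1, 2002, 2), 1, 2001)))  # 1000
```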

suhasHere commented 2 months ago

There may be no prior requests, or maybe only 17 was requested, or the cache is volatile and objects got evicted.

this is true regardless of group numbering scheme.

gwendalsimon commented 2 months ago

There may be no prior requests, or maybe only 17 was requested, or the cache is volatile and objects got evicted.

this is true regardless of group numbering scheme.

If a relay sees that its cache contains groups 13-14-15-16-17-18-19-20, it knows for sure that it can serve directly the subscription Absolute Range 13-20 without assistance from the publisher. If its cache contains 13-14-17-18-19, it knows for sure that it misses groupIDs 15 and 16, so it subscribes to Absolute Range 15-16 from the publisher. In the case of non-unitary groupID increment, the cache is useless: the relay forwards the full subscription, which causes more traffic in the network and more requests at the publisher.

I bet 99.9% of implementation will assume and only support/test it with unitary increase, adding logic to handle gaps will make it a lot more messy and bug-prone

Indeed. The "non-unitary groupID increment" will become a corner case for lawyers in the future provider-CDN MoQ agreements. CDNs will have to state that they can deliver MoQ streams... unless the groupID increment is not sequential.

vasilvv commented 2 months ago

The caching point is part of the reason sequential numbering makes things easier. But also, the data structures for continuous sequential numbers are much nicer in terms of CPU performance.
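
One way to read the data-structure point: with consecutive group IDs, a store can be a plain list indexed by offset, with no hashing or searching. The class below is an illustrative sketch under that assumption, not code from any relay.

```python
from typing import List, Optional


class ContiguousGroupStore:
    """Holds groups base, base + 1, ... in a list; lookup is a subtraction and an index."""

    def __init__(self, base: int) -> None:
        self.base = base
        self.groups: List[bytes] = []

    def append(self, group_id: int, payload: bytes) -> None:
        # Only the next consecutive ID is accepted; sparse IDs would need
        # a different structure such as a dict or a sorted list.
        if group_id != self.base + len(self.groups):
            raise ValueError("group IDs must be consecutive in this store")
        self.groups.append(payload)

    def get(self, group_id: int) -> Optional[bytes]:
        index = group_id - self.base
        if 0 <= index < len(self.groups):
            return self.groups[index]
        return None


store = ContiguousGroupStore(base=10)
store.append(10, b"group-10")
store.append(11, b"group-11")
print(store.get(11))  # b'group-11'
```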

vasilvv commented 2 months ago

An alternative approach here (one that I still need to think a bit more about) is that we could just decouple group IDs and object IDs: let's say all objects have a unique sequential object ID within the track, and group IDs are just numbers that tell us where one can join (an index rather than an identifier). I've been annoyed by the "2D" nature of MoQ sequence numbers before when coding, so that would solve that too.
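
A minimal sketch of the decoupled model floated above, assuming every object has a track-wide sequential object ID and that "groups" are just a sorted list of object IDs at which a subscriber may join. The names and sample values are hypothetical.

```python
import bisect
from typing import List


def join_point(group_starts: List[int], object_id: int) -> int:
    """Return the latest join point at or before object_id.

    group_starts is a sorted list of object IDs where a new group begins.
    """
    i = bisect.bisect_right(group_starts, object_id)
    if i == 0:
        raise ValueError("object precedes the first join point")
    return group_starts[i - 1]


# Groups begin at objects 0, 30, 60 and 95; a subscriber interested in
# object 72 has to start from the group that begins at object 60.
print(join_point([0, 30, 60, 95], 72))  # 60
```

In this model sequence numbers are one-dimensional, and the group list acts as an index rather than as part of each object's identifier.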

ianswett commented 1 week ago

Assigning to Victor to write a PR based on the discussion of Option 2 in Boston.

moq-wg / moq-transport

Tighten definition of ObjectID and GroupID #358