Should PUBLISH be a separate message?

afrind commented 1 year ago

PUBLISH (as defined in #123) conveys the name of an available track to the recipient, and PUBLISH_OK gives the publisher permission to begin sending OBJECT messages for that track.

My read is that this interchange:

-> PUBLISH
<- PUBLISH_OK
-> OBJECT(track data)

Can also be accomplished without PUBLISH via the following sequence:

<- SUBSCRIBE catalog
-> SUBSCRIBE_OK
-> OBJECT(catalog)
<- SUBSCRIBE
-> SUBSCRIBE_OK
-> OBJECT(track data)

There are a couple differences worth highlighting:

1) PUBLISH can contain per-track authorization credentials, but it's not clear where those go in the SUBSCRIBE-only approach.
2) PUBLISH_ERROR provides an explicit signal that the recipient DOES NOT want the named track. In the SUBSCRIBE-only method, the absence of a SUBSCRIBE conveys no information.
3) The PUBLISH message doesn't have the same amount of metadata about the track that is conveyed in the catalog, and may not be sufficient for the recipient to decide if they want it or not.
4) SUBSCRIBE-only requires that the recipient (possibly a recipient at the end of a relay chain) parse the catalog.

afrind commented 1 year ago

<- SUBSCRIBE catalog -> SUBSCRIBE_OK

I'll also note if we define a special Track ID for catalog, and assume catalog tracks are implicitly subscribed, then these messages can be omitted in the above exchange.

suhasHere commented 1 year ago

It's important to also note on which leg this handshake is happening:

is it between end-user client and edge relay
is it between 2 components that implement moq relay functionality ( no access to catalog)
is it happening between 2 moq relays that are from different distribution networks ( no access to catalogs or concept of bundle/grouping of things for connecting the authz to the connection)

Only in the first case is a catalog accessible, and only by the end-user client. Again, tying catalog access to track access will be what some applications want, but that has shortcomings too.

A catalog authorized for an end-user client would not have sufficient information for the relay to know if the track being published is indeed part of the catalog.

I think making tracks the central component will help resolve many of the above use cases and deployments. However, if certain controlled deployments can shortcut authz to the catalog to imply authz to all the tracks from an endpoint, that is still possible. But that is an application design choice, not a protocol-level consideration.

kixelated commented 1 year ago

The proposed PUBLISH in #127 is a method of track discovery. Fun fact, I originally proposed something similar in #43 before settling on #63.

I think it falls short for two reasons:

1. Insufficient track properties

For distribution, we're all in agreement that there's a catalog to describe the tracks. The player parses the catalog and the rich media information it contains, deciding if it wants to subscribe to a track based on properties such as codec/resolution/bitrate/etc.

However for contribution, when using this PUBLISH message, only the track name is available to describe each track. The receiver somehow needs to decide if it wants to subscribe to said track (via a PUBLISH OK) and the only option is to parse the track name. This is acceptable (albeit gross) when both sides agree on a naming schema, but it's not acceptable for generic implementations (ex. OBS).

2. Race conditions

A moq endpoint will usually publish multiple tracks in parallel. There would be multiple PUBLISH messages sent in parallel, potentially over the same control stream.

However, this introduces a whole class of race conditions, as packet loss can cause messages to arrive out of order (multiple control streams) or with significant delay. The receiver is unaware of the number of total tracks being published, and yet has to make a decision on if it wants to reply OK to each publish. For example, a client may PUBLISH h264 and PUBLISH av1, indicating the capability to support either codec, and the receiver could make a completely different decision based on when these messages arrive over the network.
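A minimal sketch of the race described above, assuming a hypothetical receiver that can ingest only one video rendition and must answer each PUBLISH as it arrives. The helper names and the codec preference order are illustrative assumptions, not anything defined in the draft or in #123.

```python
# Hypothetical receiver policy: ingest exactly one rendition.
PREFERENCE = ["av1", "h264"]  # assumed preference order, best first


def decide_per_message(arrival_order):
    """Answer each PUBLISH as it arrives, without knowing what comes next."""
    accepted = []
    for track in arrival_order:
        # The receiver cannot wait for a better offer it does not know about,
        # so it accepts the first supported track and rejects the rest.
        if not accepted and track in PREFERENCE:
            accepted.append(track)
    return accepted


def decide_from_catalog(tracks):
    """See every track at once (one CATALOG message) and pick the best."""
    supported = [t for t in PREFERENCE if t in tracks]
    return supported[:1]


# Same offer, different network arrival order, different outcome.
print(decide_per_message(["h264", "av1"]))  # ['h264']
print(decide_per_message(["av1", "h264"]))  # ['av1']

# With all tracks in one message the outcome is order-independent.
print(decide_from_catalog(["h264", "av1"]))  # ['av1']
print(decide_from_catalog(["av1", "h264"]))  # ['av1']
```

The per-message receiver could buffer and wait, but it has no way of knowing how many PUBLISH messages are still in flight, which is the point being made here.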

kixelated commented 1 year ago

The CATALOG message in the current draft is meant to address both of these issues.

It includes the track name and a track description (init segment).
All tracks are contained in the same message (no races).

We should absolutely send the catalog over a track as part of #66. But I think it's the only thing we should push unless we can figure out a way to deal with the issues I've described.

As to how to push the catalog on startup, I think there's a few options:

1. The consumer SUBSCRIBEs to a well-known catalog track name ("catalog").
2. The producer PUBLISHes a well-known catalog track name ("catalog").
3. The producer sends OBJECTs with a well-known track ID (0).

My vote is not 2, unless we can justify other situations where a dedicated PUBLISH message would be used. I kind of like 3; think of it like an automatic subscription when SETUP contains ROLE=publisher.

kixelated commented 1 year ago

It's important to also note on which leg this handshake is happening:

is it between end-user client and edge relay

is it between 2 components that implement moq relay functionality ( no access to catalog)

is it happening between 2 moq relays that are from different distribution networks ( no access to catalogs or concept of bundle/grouping of things for connecting the authz to the connection)

Only in the first case is a catalog accessible, and only by the end-user client. Again, tying catalog access to track access will be what some applications want, but that has shortcomings too.

A catalog authorized for an end-user client would not have sufficient information for the relay to know if the track being published is indeed part of the catalog.

I think making tracks the central component will help resolve many of the above use cases and deployments. However, if certain controlled deployments can shortcut authz to the catalog to imply authz to all the tracks from an endpoint, that is still possible. But that is an application design choice, not a protocol-level consideration.

You raise some good points, but I do think that something has to parse the catalog so the relay can return a PUBLISH OK.

If I'm running an ingest edge with no access to the catalog, I can't blindly reply PUBLISH OK to every PUBLISH REQUEST. I would forward the PUBLISH REQUEST to some origin server that would have access to the catalog, so it can decide if the track is worth ingesting. That way the broadcaster could make multiple tracks available, such as different codecs or renditions, and the ingest origin can choose which ones to receive. There's not enough bandwidth to push all available tracks.

Now hypothetically, if the origin has the catalog (pushed via some mechanism), then for the tracks where it would reply with PUBLISH OK, then it would send SUBSCRIBE instead. Semantically they're the same thing; please send me this track name using this track ID. I think that's a clean design, and fixes a few problems with current ingest protocols (mostly RTMP) since the receiver is in charge.

fluffy commented 1 year ago

TL;DR of message ahead: The Relays can't read the catalog inside the object message and they need some way of knowing which client to route a subscribe to.

OK, so reading this I have a much better idea of why people are thinking different things.

The first issue is I don't think we all have the same view of what entity creates the catalog in the ingest case. I will start a separate issue on that. See #144

The next issue is really what the difference is between an object with a catalog and the publish message. If you look at the first message in this issue, it illustrates how they carry similar information. However, there is one big difference. The relay cannot read the information inside the payload of the object, so it cannot read or parse the catalog. That is a good thing given we may want to have multiple types of catalogs. Deploying new catalog types should not be stalled by the chicken-and-egg problem of getting CDNs to support them. That leads to ossification of the catalog format.

Here is why that matters. Imagine a CDN with many relays. A ton of clients connect to that CDN for the same Webex meeting. Now something subscribes to one of the track URIs. The CDN has no clue which one of the clients can publish that specific track. Even if it did, it does not know which CDN node the right client for that track is attached to.

It seems to me that the key requirement here is that we need some way for the CDN to know which tracks any given client connected to the CDN can publish. There are multiple ways to solve this, but the PUBLISH message is a simple way that: 1) is very symmetric to the subscribe, and like the subscribe can be a place to support per-track authorization, and 2) has very low RTT before the client can start sending data.

To be very crisp on what the problem with the catalog-based flow in the first message of this issue is: the 4th message in the 2nd flow is "<- SUBSCRIBE", but the CDN has no clue which of the many clients connected to it should receive the subscribe for a given track URI.

Even in a case with no relays, I think you end up wanting something that allows a load balancer to do things like send audio to a different server than video. (Longer story on why, but most of the major web conferencing systems process the audio and the video on different servers.)

fluffy commented 1 year ago

As pointed out in another thread, PUBLISH may be the wrong name for this message, as it is more an indication of intent to publish, and the publish is when the OBJECT gets sent.

kixelated commented 1 year ago

As pointed out in another thread, PUBLISH may be the wrong name for this message, as it is more an indication of intent to publish, and the publish is when the OBJECT gets sent.

Yeah, I think PUBLISH is 0-RTT while ANNOUNCE is 1-RTT

suhasHere commented 1 year ago

Just on the name, PR #123 names it PUBLISH_REQUEST and not PUBLISH. PUBLISH_REQUEST is a transaction and is 1-RTT as described.

suhasHere commented 1 year ago

However for contribution, when using this PUBLISH message, only the track name is available to describe each track. The receiver somehow needs to decide if it wants to subscribe to said track (via a PUBLISH OK) and the only option is to parse the track name. This is acceptable (albeit gross) when both sides agree on a naming schema, but it's not acceptable for generic implementations (ex. OBS).

PUBLISH REQUEST as described in #123 doesn't say that the endpoint doesn't have access to the catalog. It is assumed that the catalog is the starting point (as is the case with subscribes) and that the list of tracks to send publish_request for comes from the catalog. We can add an explicit note if that makes things clear. Also, sending PUBLISH OK is an implicit subscription from the peer for that track, and we don't require another SUBSCRIBE coming in from that peer.

kixelated commented 1 year ago

However for contribution, when using this PUBLISH message, only the track name is available to describe each track. The receiver somehow needs to decide if it wants to subscribe to said track (via a PUBLISH OK) and the only option is to parse the track name. This is acceptable (albeit gross) when both sides agree on a naming schema, but it's not acceptable for generic implementations (ex. OBS).

PUBLISH REQUEST as described in #123 doesn't say that the endpoint doesn't have access to the catalog. It is assumed that the catalog is the starting point (as is the case with subscribes) and that the list of tracks to send publish_request for comes from the catalog. We can add an explicit note if that makes things clear. Also, sending PUBLISH OK is an implicit subscription from the peer for that track, and we don't require another SUBSCRIBE coming in from that peer.

If the endpoint already has access to the catalog, then it already knows all of the track names. It should issue a SUBSCRIBE message directly instead of waiting for an optional PUBLISH_REQUEST to announce each track name.

The PUBLISH message can only add value when the endpoint does NOT have access to the catalog. For the sake of argument, what if the PUBLISH_REQUEST message contained the entire catalog? It would be a strict upgrade, as it would include both the track name and a media description for each track.

kixelated commented 1 year ago

I think some of the disagreement stems from the question: "what does a relay do if it does not have access to the catalog?"

An edge relay would forward the PUBLISH_REQUEST message to the next hop. This would continue until it reaches an endpoint that can parse the request, ie. the origin. The origin then issues a SUBSCRIBE/PUBLISH_OK (they're effectively the same message) for the track that propagates back to the broadcaster.

The same is true if we include the catalog in the PUBLISH_REQUEST message. The only difference is that a relay has more information now, as it has both the track name and a detailed media descriptor that is optional to parse.

suhasHere commented 1 year ago

If the endpoint already has access to the catalog, then it already knows all of the track names. It should issue a SUBSCRIBE message directly instead of waiting for an optional PUBLISH_REQUEST to announce each track name.

Here the client endpoint is the publisher and not the subscriber. Say a video end-point producing media in a conference call or a relay producing to another relay where the peer relay has no catalog access.

suhasHere commented 1 year ago

The same is true if we include the catalog in the PUBLISH_REQUEST message. The only difference is that a relay has more information now, as it has both the track name and a detailed media descriptor that is optional to parse.

The catalog might be end-to-end encrypted too, and a relay may not even be able to parse it. Also, catalog parsing is expensive at high scale and at every relay hop.

afrind commented 1 year ago

@suhasHere : Can you give some example end-to-end scenarios involving PUBLISH_REQUEST?

I see that PUBLISH_REQUEST contains a subset of the contents of the catalog (currently only track name) and it is efficiently relay readable. It's therefore possible for any relay to issue a PUBLISH_OK before any consuming endpoint with catalog access could issue a SUBSCRIBE. Is that the intent?

I think @kixelated's question is how would a generic relay decide which tracks to allow when all it knows is their name?

suhasHere commented 1 year ago

how would a generic relay decide which tracks to allow when all it knows is their name?

By design so far, Relays don't know anything about application-level details regarding tracks (such as descriptors, media information and so on). To decide whether a track should be allowed or not, they need to look at the authorization information for the track in the publish_request.

afrind commented 1 year ago

@suhasHere : Can you give some example end-to-end scenarios involving PUBLISH_REQUEST (including how you envision the publisher acquiring various authorization tokens)?

suhasHere commented 1 year ago

@suhasHere : Can you give some example end-to-end scenarios involving PUBLISH_REQUEST (including how you envision the publisher acquiring various authorization tokens)?

The authorization token can be obtained via the catalog or some out-of-band mechanism, neither of which is in scope for this spec. Maybe I am missing something?

afrind commented 1 year ago

@suhasHere : I understand the proposed mechanics. I'm asking for an end-to-end (application-level) example to help illustrate the usage of this message.

kixelated commented 1 year ago

The same is true if we include the catalog in the PUBLISH_REQUEST message. The only difference is that a relay has more information now, as it has both the track name and a detailed media descriptor that is optional to parse.

Catalog might be end to end encrypted too and a relay may not be even able to parse the catalog. Also catalog parsing is expensive at high scales and at every relay hop.

The CATALOG message, as it currently stands, is a tuple of the track name and a payload depending on the track format. A relay that does not understand the track format, or does not care to parse it, could still use the track name and just forward the contents.

I would still like to move the catalog to a track, but as it currently stands, the CATALOG message is the same as PUBLISH_REQUEST but contains more information.

kixelated commented 1 year ago

how would a generic relay decide which tracks to allow when all it knows is their name?

By design so far, Relays don't know anything about application level details regarding tracks (such as descriptors, media information and so on). For deciding if a track should be allowed or not, They need to look at the authorization information for the track in the publish_request

I think you're confusing authorization and track selection.

A generic client like OBS will connect to live.twitch.tv via a connect URL. I propose it would include the auth token in that URL, although feasibly the client could send the same token for each PUBLISH_REQUEST.

The problem is that the client (ex. OBS) and server (ex. live.twitch.tv) don't necessarily agree on what tracks should be sent. This is partly a matter of exchanging compatibilities, but both the client and server also have limited resources. There needs to be some sort of negotiation, allowing either side to say NO prior to transferring any content. This has nothing to do with authorization, and everything to do with negotiating limited resources.

The CDN MUST avoid propagating unsupported or unrequested OBJECTs through the network. Doing so uses resources, such as limited backbone capacity. There needs to be some entity that decides that a resource should be transferred at each hop based on downstream demand.

This is the premise behind HTTP CDNs. The origin does NOT push resources to all edges. Instead, each node waits until there's a request (SUBSCRIBE) and propagates that request to the origin, deduplicating along the way. This ensures that there's at least one consumer for a particular resource to avoid wasting resources.

This should be our default world-view for MoQ; no media is transferred until requested.

However, a HTTP CDN is allowed to pre-fetch content if 1) it has extra resources and 2) it understands the relationship between resources. This is a potentially wasteful process, but may improve the responsiveness if a resource is eventually requested.

If we extrapolate this to the MoQ world, this means an ingest server may SUBSCRIBE to tracks before they are requested downstream. This doesn't require the catalog, only the track name, but the catalog does contain useful information about the relationship between tracks (ex. renditions).

The PUBLISH_REQUEST message is trying to perform similar functionality. It's meant to push tracks so they are available a few hops early. However, the broadcaster cannot be in control of deciding if a CDN wants a track. The CDN absolutely needs the ability to select what tracks are transferred and when.

fluffy commented 1 year ago

I no longer understand what everyone is trying to say here. Can we put this on the agenda for Friday. I think we could make progress on this.

afrind commented 1 year ago

@fluffy : I added an agenda item to discuss publish use cases in general, which covers some of the discussion here, in #144, and in PR #123.

I think it would help guide the discussion if we had written examples showing the following use cases working at the message level:

1) Client broadcasting a live-stream through a CDN/Caching/Relay Network to N subscribers
2) Clients publishing as part of a media conference through a CDN/Caching/Relay Network

Essentially taking what is in the scenarios draft (2.2 - 2.4) and moq-transport (6.5) and filling in the next level of detail including the ordering of messages from all participants. I asked @kpugin and @kixelated to look at 1), maybe @suhasHere you can look at 2)?

wilaw commented 1 year ago

(For Friday) I think our workflow can be much simpler if we implement a basic tenet that a publisher doesn't send anything over the wire unless it first gets a subscription request for it. Additionally, the CATALOG track already describes the availability of content from the publisher. We don't need extra methods to signal publish intent.

Publisher connects to a server, prepares a catalog track, but doesn't send it. It just waits. This is efficient. We don't want bytes moving over our network if no one is subscribing to them.
Client connects to CDN edge and subscribes to the catalog track. This subscription request is relayed through each node until it reaches the publisher.
The publisher begins sending the catalog track over the wire. It is relayed up through each node until it reaches the client. The nodes cache the catalog track so that future subscribes don't need to go back to the original publisher.
The client reads the catalog and subscribes to additional content. Once the publisher receives these subscription requests, it pushes the content over the wire. This content is relayed up to the client.
When a client no longer wants to receive a track, it sends an UNSUBSCRIBE request to the sender (or closes its connection, which is an implicit unsubscribe for all tracks initiated within that connection). These UNSUBSCRIBE messages cascade back to the publisher. Once it no longer has an audience for its content, it stops pushing content over the wire.
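The steps above can be sketched as a toy relay that forwards a SUBSCRIBE upstream only for the first downstream subscriber and cascades an UNSUBSCRIBE only when the last one leaves. The `Relay` class, its method names, and the in-memory "upstream" link are assumptions made for illustration; caching of the catalog track and real connections are left out.

```python
from collections import defaultdict


class Relay:
    """Toy relay: forwards SUBSCRIBE/UNSUBSCRIBE upstream only when needed."""

    def __init__(self, upstream=None):
        self.upstream = upstream             # next hop toward the publisher
        self.subscribers = defaultdict(set)  # track name -> downstream ids
        self.sent_upstream = []              # log of messages sent upstream

    def subscribe(self, track, subscriber_id):
        first = not self.subscribers[track]
        self.subscribers[track].add(subscriber_id)
        if first:
            # Only the first subscriber triggers an upstream SUBSCRIBE;
            # later ones are served from this node (deduplication).
            self._forward("SUBSCRIBE", track)

    def unsubscribe(self, track, subscriber_id):
        subs = self.subscribers[track]
        if subscriber_id not in subs:
            return
        subs.discard(subscriber_id)
        if not subs:
            # No audience left here, so cascade the UNSUBSCRIBE upstream.
            self._forward("UNSUBSCRIBE", track)

    def _forward(self, kind, track):
        self.sent_upstream.append((kind, track))
        if self.upstream is not None:
            if kind == "SUBSCRIBE":
                self.upstream.subscribe(track, id(self))
            else:
                self.upstream.unsubscribe(track, id(self))


origin = Relay()               # hop closest to the publisher
edge = Relay(upstream=origin)  # hop closest to the viewers
edge.subscribe("catalog", "alice")
edge.subscribe("catalog", "bob")    # deduplicated at the edge
edge.unsubscribe("catalog", "alice")
edge.unsubscribe("catalog", "bob")  # last viewer leaves, cascades upstream
print(edge.sent_upstream)    # [('SUBSCRIBE', 'catalog'), ('UNSUBSCRIBE', 'catalog')]
print(origin.sent_upstream)  # [('SUBSCRIBE', 'catalog'), ('UNSUBSCRIBE', 'catalog')]
```

Two viewers produce one SUBSCRIBE and one UNSUBSCRIBE per hop, so the publisher sends nothing while the audience is empty.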

Any authorization for what a client is allowed to publish should be handled by application-defined access control. We don't need an explicit message for that. This is exactly what we propose for clients when they subscribe. We envisage a token which defines which tracks they may consume. Symmetrically, a similar token can be used on the publish side to define what a publisher may publish. We don't propose a SUBSCRIBE-REQUEST handshake and I don't believe we need a PUBLISH-REQUEST handshake either.

kixelated commented 1 year ago

I no longer understand what everyone is trying to say here. Can we put this on the agenda for Friday. I think we could make progress on this.

My fault for the wall of text.

(For Friday) I think our workflow can be much simpler if we implement a basic tenet that a publisher doesn't send anything over the wire unless it first gets a subscription request for it. Additionally, the CATALOG track already describes the availability of content from the publisher. We don't need extra methods to signal publish intent.

Publisher connects to a server, prepares a catalog track, but doesn't send it. It just waits. This is efficient. We don't want bytes moving over our network if no one is subscribing to them.

Client connects to CDN edge and subscribes to the catalog track. This subscription request is relayed through each node until it reaches the publisher.

The publisher begins sending the catalog track over the wire. It is relayed up through each node until it reaches the client. The nodes cache the catalog track so that future subscribes don't need to go back to the original publisher.

The client reads the catalog and subscribes to additional content. Once the publisher receives these subscription requests, it pushes the content over the wire. This content is relayed up to the client.

When a client no longer wants to receive a track, it sends an UNSUBSCRIBE request to the sender (or closes its connection, which is an implicit unsubscribe for all tracks initiated within that connection). These UNSUBSCRIBE messages cascade back to the publisher. Once it no longer has an audience for its content, it stops pushing content over the wire.

Any authorization for what a client is allowed to publish should be handled by application-defined access control. We don't need an explicit message for that. This is exactly what we propose for clients when they subscribe. We envisage a token which defines which tracks they may consume. Symmetrically, a similar token can be used on the publish side to define what a publisher may publish. We don't propose a SUBSCRIBE-REQUEST handshake and I don't believe we need a PUBLISH-REQUEST handshake either.

+100

I will note that a transcoder or archiver might be the consumer. To the client, it would look like it's pushing data, because the server immediately issued a SUBSCRIBE for the catalog and some tracks. However this is optional, and like Will said, we don't want to push any bytes over the network until necessary.

Here are some examples where we do NOT want the broadcasting client pushing media:

The broadcast has 0 viewers.
The conference call is empty.
The backup feed is inactive (ex. satellite truck).
The track properties are unsupported (ex. codec, bitrate, etc).
The track has an alternative (ex. OPUS instead of AAC).
The broadcast does not require the CDN yet (ex. multi-CDN).

In all of these cases, the broadcaster is unaware of these conditions; they're solely determined by demand. It should be up to the CDN to determine what to subscribe to, rather than the broadcaster deciding what to publish.

suhasHere commented 1 year ago

a publisher doesn't send anything over the wire unless it first gets a subscription request for it

This is an application decision/choice and not a transport requirement.

kixelated commented 1 year ago

a publisher doesn't send anything over the wire unless it first gets a subscription request for it

This is application decision/choice and not the transport requirement.

I want this to be a transport requirement. Using HTTP terminology, the server can't push a response until it gets a request.

We should leave the door open for something like PUBLISH as an RTT optimization between coordinated endpoints. But much like HTTP Push, I don't see how it would work with generic endpoints and CDN fanout.

suhasHere commented 1 year ago

This is a pub/sub transport, and one can easily emulate an HTTP-like approach on top of it. We need to keep the flexibility for applications to innovate with either approach.

wilaw commented 1 year ago

@suhasHere - what behavior do you want to productize that you feel cannot be enabled with a pub-after-sub architecture?

suhasHere commented 1 year ago

We envisage a token which defines which tracks they may consume. Symmetrically, a similar token can be used on the publish side to define what a publisher may publish.

That is what is proposed in the PR today: there is a token in the publish_request that lets the relay know what a publisher can publish.

suhasHere commented 1 year ago

@suhasHere - what behavior do you want to productize that you feel cannot be enabled with a pub-after-sub architecture?

It is very typical in a conferencing scenario for a conference participant to join in and send media even if no receiver has joined yet. The idea is that as more people join in, they get the media instantly from multiple publishers (avoids latency). If there are N publishers, one need not wait for N end-to-end subscribe round trips to each such publisher in order to get media delivered.

This is just one use case, and we haven't seen all the possible applications one can innovate.

Also publish_request is telling Relays about authorization status for a given publisher for the track name.

Even for the case of subscribe-then-publish, the media producer still needs to prove that it is authorized to publish before sending OBJECT messages.

wilaw commented 1 year ago

it is very typical in a conferencing scenario for a conference participant to join in and send media even if no receiver has joined yet.

Who are they sending it to if there are no receivers? Your origin? If so, you could accomplish that quite easily by having the origin issue a subscribe request to each participant.

If there are N publishers, one need not wait for N end to end subscribe roundtrips to each such publisher in order to get media delivered

Those N subscribes would happen in parallel. And they would only need to happen for the first participant attached to each node. Statistically, most participants would find the streams already pulled to either the edge node or a parent node.

Even for the case of subscribe-then-publish, the media producer still needs to prove that it is authorized to publish before sending OBJECT messages.

But once a connection has been accepted, any intermediary or endpoint can send data down a QUIC stream, whether they are authorized to or not. Isn't it true that the authorization enforcement must therefore happen at the receiver? With a sub-then-pub approach, the receiver can throw away any incoming data it has not previously issued a subscription for. A participant connecting to a conferencing solution would be given an access token. That token would be parsed by the edge node (or the origin if it is connected directly) and it would specify which tracks the node can subscribe to from that participant.
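A rough sketch of receiver-side enforcement under sub-then-pub. It assumes the participant's access token has already been validated and reduced to a set of track names; the token format, the `Receiver` class, and the track names are hypothetical.

```python
class Receiver:
    """Toy receiver that enforces sub-then-pub on incoming OBJECTs."""

    def __init__(self, allowed_tracks):
        # Assumption: the participant's access token was already validated
        # and reduced to the set of track names this node may subscribe to.
        self.allowed_tracks = set(allowed_tracks)
        self.subscribed = set()
        self.delivered = []

    def subscribe(self, track):
        if track not in self.allowed_tracks:
            raise PermissionError(f"token does not cover track {track!r}")
        self.subscribed.add(track)

    def on_object(self, track, payload):
        # Anyone can write to an accepted connection, so the check happens
        # here: data for tracks we never subscribed to is thrown away.
        if track not in self.subscribed:
            return False
        self.delivered.append((track, payload))
        return True


rx = Receiver(allowed_tracks={"alice/audio", "alice/video"})
rx.subscribe("alice/audio")
print(rx.on_object("alice/audio", b"frame"))   # True
print(rx.on_object("alice/video", b"frame"))   # False (allowed, not subscribed)
print(rx.on_object("mallory/video", b"junk"))  # False (never subscribed)
```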

fluffy commented 1 year ago

I want to focus on one point from Will's post that says:

"Client connects to CDN edge and subscribes to the catalog track. This subscription request is relayed through each node until it reaches the publisher."

The question is how to do this. The CDN receives a subscribe to a given catalog track named Foo. The CDN has lots of clients that can publish, and they are connected to many different relays in the CDN. The CDN needs to know how to get this subscription to the right relay and have it send it to the right client. In our implementation, the way we find that client is by having the publisher say "I am willing to publish on track name Foo". The CDN keeps that information in its distributed routing table, and when a request to subscribe to that track comes to the CDN, the CDN knows where to send that subscription.
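A minimal sketch of that announce-based lookup, with the distributed routing table collapsed into a single in-memory dict. The `Cdn` class, its method names, and the relay and client identifiers are assumptions for illustration only, not a description of any real implementation.

```python
class Cdn:
    """Toy CDN with one shared table: track name -> (relay, client)."""

    def __init__(self):
        self.routes = {}

    def announce(self, track, relay, client):
        # Publisher states "I am willing to publish on track name <track>".
        self.routes[track] = (relay, client)

    def withdraw(self, track):
        # Publisher disconnected or no longer offers the track.
        self.routes.pop(track, None)

    def route_subscribe(self, track):
        # Returns where the SUBSCRIBE should go, or None if nobody announced.
        return self.routes.get(track)


cdn = Cdn()
cdn.announce("Foo", relay="relay-7", client="client-42")
print(cdn.route_subscribe("Foo"))  # ('relay-7', 'client-42')
print(cdn.route_subscribe("Bar"))  # None
```

Every announce and withdraw has to reach whichever node handles the lookup, which is where the consistency cost of a truly distributed table comes from.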

wilaw commented 1 year ago

The question is how to do this.

I envisage it working in the same way that an edge server, which is asked to HTTP GET /foo.mp4, knows how to retrieve it from the millions of origin servers that the CDN is connected to. In the HTTP case, the request has accompanying HOST + PATH information that the CDN uses to map to a particular origin. That is part of the configuration of the CDN that was established when it was engaged and authorized to deliver foo.mp4. In the MoQ case, the subscription would be to something similar to example.com/some/path/foo (per our ongoing discussions for track names). The MoQ CDN would have a routing table which tells it how to go to either a forward relay in the same CDN, or to the correct origin.
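As a sketch of that static mapping, assuming track names look like example.com/some/path/foo and that the CDN is configured with prefix-to-next-hop entries. The table contents, the hostnames, and the `next_hop` helper are made up for illustration.

```python
# Hypothetical static configuration: track-name prefix -> next hop.
STATIC_ROUTES = {
    "example.com/": "origin-a.example.net",
    "example.com/some/path/": "origin-b.example.net",
}


def next_hop(track_name, routes=STATIC_ROUTES):
    """Return the configured hop for the longest matching prefix, if any."""
    matches = [prefix for prefix in routes if track_name.startswith(prefix)]
    if not matches:
        return None
    return routes[max(matches, key=len)]


print(next_hop("example.com/some/path/foo"))  # origin-b.example.net
print(next_hop("example.com/other"))          # origin-a.example.net
print(next_hop("unknown.test/foo"))           # None
```

The table changes only when a customer is configured, not when a publisher connects or a track appears.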

In our implementation the way we find that client is by having the publisher say "I am willing to publish on track name Foo". The CDN keeps that information in its distributed routing table and when a request to subscribe to that track comes to the CDN, the CDN knows where to send that subscription.

I appreciate that this is how QUICR may work today. I am concerned about the scalability of that model in a multi-tenant CDN in which every edge node must hold a dynamic table of the potential tracks that can be published by every publisher connected to every customer of that CDN. On a global scale, consistency in updates is a challenge. Much of the traffic flowing over that CDN will be to update these tables for content that may never actually be consumed. That seems inefficient. Additionally, you don't have the concept of the "catalog", which is the offer from the publisher on what it can produce. It is literally a list of the tracks it is willing to publish and is a substitute for the offer mechanism you describe above. This is a useful construct, functioning as a contract between the source and the end clients. It is conveniently distributed as a track, opaque to the CDN. In the pub-after-sub model, the CDN only has to hold static routing tables for content domains at the edge. The dynamic nature of what each client may publish is handled by catalog tracks and their updates.

suhasHere commented 1 year ago

Even in the publish request case, I don't see that the table is dynamic as defined above. The endpoint has a catalog and it is sending publish requests based on that list. The tables are as dynamic/static as they are in the subscribe case.

The same publish request now works uniformly across all hops and there is no need for catalog understanding by relay nodes either.

kixelated commented 1 year ago

I want to focus on one point from Will's post that says:

"Client connects to CDN edge and subscribes to the catalog track. This subscription request is relayed through each node until it reaches the publisher."

The question is how to do this. The CDN receives a subscribe to a given catalog track named Foo. The CDN has lots of clients that can publish, and they are connected to many different relays in the CDN. The CDN needs to know how to get this subscription to the right relay and have it send it to the right client. In our implementation, the way we find that client is by having the publisher say "I am willing to publish on track name Foo". The CDN keeps that information in its distributed routing table, and when a request to subscribe to that track comes to the CDN, the CDN knows where to send that subscription.

This sounds like a gossip protocol. A producer connects to any CDN edge, announces that they are an origin for specific tracks, and then expects the CDN to serve subscriptions for those tracks from any edge.

In the HTTP world, this would be analogous to sending a PUT to the nearest Akamai edge with the contents of google.com/index.html, and some way to prove you are actually Google. It could work, although it's certainly not how CDNs are architected.

afrind commented 1 year ago

I'm closing this issue since there are no proposals for PUBLISH REQUEST at this time. #123 now calls it ANNOUNCE, and I think it has value; #150 tracks details related to the ANNOUNCE proposal.

moq-wg / moq-transport

Should PUBLISH be a separate message? #143
