livepeer / go-livepeer-basicnet

Basic p2p video streaming for Livepeer
MIT License

Replace StreamID with an opaque bigint #31

Open j0sh opened 6 years ago

j0sh commented 6 years ago

For a Livepeer node, this corresponds to the JobID, from which the StreamID can be looked up easily on-chain. Other off-chain applications can use the bigint as a lookup key.

Currently, the only time StreamID is used concretely within basicnet is to determine the broadcaster's NodeID. We should be able to work around this. Treat the ID as an opaque identifier that's used only by the API consumer, not basicnet.
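As a rough illustration of what "opaque" could mean here, the sketch below wraps a big integer in a type that the networking layer only stores and compares. The names `OpaqueID`, `OpaqueIDFromInt64`, `Key`, and `Subscriptions` are hypothetical and do not exist in basicnet; the hex key encoding is likewise an assumption.

```go
package main

import (
	"fmt"
	"math/big"
)

// OpaqueID is a hypothetical replacement for StreamID. The networking layer
// stores and compares it but never inspects what the number means.
type OpaqueID struct {
	v *big.Int
}

// NewOpaqueID copies n so later changes by the caller do not affect the ID.
func NewOpaqueID(n *big.Int) OpaqueID {
	return OpaqueID{v: new(big.Int).Set(n)}
}

// OpaqueIDFromInt64 is a convenience constructor for small values.
func OpaqueIDFromInt64(n int64) OpaqueID {
	return OpaqueID{v: big.NewInt(n)}
}

// Key returns a hex string form suitable for use as a map key.
func (id OpaqueID) Key() string {
	return id.v.Text(16)
}

// Subscriptions stands in for per-ID state a networking layer might keep.
// It maps an ID key to the peers interested in that ID.
type Subscriptions map[string][]string

// Add records that peer wants the content identified by id.
func (s Subscriptions) Add(id OpaqueID, peer string) {
	s[id.Key()] = append(s[id.Key()], peer)
}

func main() {
	// The API consumer decides what the number means, e.g. an on-chain JobID.
	id := OpaqueIDFromInt64(42)

	subs := Subscriptions{}
	subs.Add(id, "peerA")
	fmt.Println(id.Key(), subs[id.Key()])
}
```

In this sketch, only the API consumer knows that the value might be a JobID; the networking code treats it as a lookup key and nothing more.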

Benefits:

yondonfu commented 6 years ago

Reposted from #21

Although I get that using jobIDs in the networking protocol would make certain implementation-related tasks easier, it feels like it tightly couples the networking protocol to the smart contract protocol, when the networking protocol should be able to stand alone BUT integrate with the smart contract protocol if needed. My immediate thought is that by switching from streamIDs to jobIDs in the networking protocol, you lose the ability to request content with JUST a content identifier since the streamID encodes nodeID info that can be used to route to the node serving the content absent other information, while the jobID cannot support this (unless you were thinking of something other than a regular big integer). Additionally, I think semantically there should be a difference between "streams" and "jobs" - you should be able to have a stream, but not necessarily have a job.

j0sh commented 6 years ago

you lose the ability to request content with JUST a content identifier since the streamID encodes nodeID info that can be used to route to the node serving the content

Yes, node addressing is something we'll have to fix, but I feel that's an easier problem that would lead to a better architecture overall. We're already making strong assumptions about the structure of StreamID, which IMO increases coupling; it might not be appropriate for all projects using basicnet.

there should be a difference between "streams" and "jobs" - you should be able to have a stream, but not necessarily have a job.

That's the benefit of keeping the identifier opaque -- basicnet doesn't need to know whether the identifier corresponds to a stream or a job. Could be either, or something else entirely. It's up to the API consumer to interpret that identifier.

yondonfu commented 6 years ago

Yes, node addressing is something we'll have to fix

As long as we're using libp2p peerIDs to route to a node, why not just keep that information in the content identifier so anyone that acquires the content identifier knows exactly how to get it? The ability to always route to the node serving content given a content identifier seems like the right architecture.

We're already making strong assumptions about the structure of StreamID, which IMO increases coupling

Hm, I actually don't think the structure of streamID is based on any strong assumptions - it's just NodeID|VideoID|Rendition. Given that basicnet is supposed to be a video streaming p2p networking protocol, the information encoded seems reasonable.
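To make the NodeID|VideoID|Rendition layout concrete, here is a minimal sketch of building and splitting such an identifier. It assumes the three fields are simply concatenated, with fixed-width NodeID and VideoID prefixes; the widths (8 and 4 characters), the `StreamParts` type, and both function names are made up for illustration and are not taken from the basicnet code.

```go
package main

import (
	"errors"
	"fmt"
)

// Field widths are assumptions for illustration only; the real encoding
// may use different lengths or a different layout entirely.
const (
	nodeIDLen  = 8
	videoIDLen = 4
)

// StreamParts holds the three fields named in the discussion.
type StreamParts struct {
	NodeID    string
	VideoID   string
	Rendition string
}

// MakeStreamID concatenates the fields after checking the fixed widths.
func MakeStreamID(p StreamParts) (string, error) {
	if len(p.NodeID) != nodeIDLen || len(p.VideoID) != videoIDLen {
		return "", errors.New("unexpected field width")
	}
	return p.NodeID + p.VideoID + p.Rendition, nil
}

// ParseStreamID splits an ID back into its fields by position.
// Anything after the two fixed-width prefixes is treated as the rendition.
func ParseStreamID(id string) (StreamParts, error) {
	if len(id) < nodeIDLen+videoIDLen {
		return StreamParts{}, errors.New("stream ID too short")
	}
	return StreamParts{
		NodeID:    id[:nodeIDLen],
		VideoID:   id[nodeIDLen : nodeIDLen+videoIDLen],
		Rendition: id[nodeIDLen+videoIDLen:],
	}, nil
}

func main() {
	id, err := MakeStreamID(StreamParts{
		NodeID:    "node0001",
		VideoID:   "vid1",
		Rendition: "P720p30fps16x9",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	parts, _ := ParseStreamID(id)
	fmt.Println(id, parts.NodeID, parts.VideoID, parts.Rendition)
}
```

Any code that calls something like `ParseStreamID` depends on the layout, which is the kind of structural knowledge this thread is debating.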

basicnet doesn't need to know whether the identifier corresponds to a stream, or a job

If basicnet is supposed to be a video streaming p2p networking protocol, I think it is reasonable that the native content that is being passed around is assumed to be a video stream, since libp2p, which basicnet is layered on top of, already provides content-agnostic messaging that makes no assumptions about the native content being passed around.

j0sh commented 6 years ago

The ability to always route to the node serving content given a content identifier seems like the right architecture.

While we should be able to access content given an identifier, that doesn't necessarily mean that identifier should encode a singular source for that content. Consider DHTs, relays, etc.
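One way to picture an identifier that does not encode a single source: a lookup from content ID to every peer that has announced it can serve that content, roughly in the spirit of DHT provider records. The sketch below is an in-memory stand-in only; `ProviderIndex`, `Announce`, and `Providers` are hypothetical names, not basicnet or libp2p APIs.

```go
package main

import (
	"fmt"
	"sort"
)

// ProviderIndex maps an opaque content ID to the set of peers that have
// announced they can serve it. It is a hypothetical in-memory stand-in
// for a DHT-style provider record.
type ProviderIndex map[string]map[string]struct{}

// Announce records that peer can serve contentID. Repeat calls are no-ops.
func (idx ProviderIndex) Announce(contentID, peer string) {
	if idx[contentID] == nil {
		idx[contentID] = make(map[string]struct{})
	}
	idx[contentID][peer] = struct{}{}
}

// Providers returns every known peer for contentID in sorted order.
// An unknown ID yields an empty slice.
func (idx ProviderIndex) Providers(contentID string) []string {
	peers := make([]string, 0, len(idx[contentID]))
	for p := range idx[contentID] {
		peers = append(peers, p)
	}
	sort.Strings(peers)
	return peers
}

func main() {
	idx := ProviderIndex{}
	// The original source and a relay both announce the same content ID.
	idx.Announce("42", "broadcaster")
	idx.Announce("42", "relay-1")
	fmt.Println(idx.Providers("42"))
}
```

Here the ID says nothing about where the content lives, so a relay can serve it just as well as the original source.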

streamID is based on any strong assumptions - it's just NodeID|VideoID|Rendition

We might not even want the NodeID there. First, for the reason stated above. Second, in the current architecture, the StreamID exposes the broadcaster to the p2p network while the transcoder remains hidden, which is quite the opposite of what we want IMO. I'll go as far as to say that this is the core of most of our current difficulties with networking. (Will unpack all that in a separate GitHub issue.) In any case, the idea of embedding NodeID is something that we should revisit, at least for the broadcaster.

The rendition string, as typically expressed on-chain in the job, doesn't actually tell us anything useful. It's almost always P720p30fps16x9, which even if accurate, is not actionable (or necessary) information. The rendition string seems to be used primarily for 1) providing a somewhat human-readable suffix to transcoded stream names, and 2) uniquely identifying streams. Since this discussion is about 2), we can discuss options for 1) elsewhere, but it's not a critical feature.

Another thing here is that the broadcaster only has one StreamID per job. Most (all?) uses of the StreamID in basicnet right now are in terms of the broadcaster's StreamID.

We can still assign transcoded profiles their own ID for selective relaying, etc; the key point being that the ID is opaque to basicnet, so it doesn't have to be aware of the semantics of the ID. That's left up to the API consumer.

So for the sake of argument, supposing we don't need the NodeID or the rendition, that leaves us with Video ID. Which could well be "JobID" or whatever other value the API tells basicnet. Again, basicnet doesn't have to care about the meaning of that value.

I think it is reasonable that the native content that is being passed around is assumed to be a video stream

Basicnet right now is mostly oblivious to the details of content, which is a good thing. IMO its role is in the concrete implementation of a networking protocol for shuffling around that content (video in this case). We can certainly go much deeper into content-awareness later (eg, transcoder capability negotiation) but it's not needed right now.

There's a tension between making the protocol implementation as generally useful as possible (eg, for off-chain uses) and minimizing the amount of implementation (eg, invoking specific messages or sequences) that needs to be hoisted up to the protocol consumer.

yondonfu commented 6 years ago

Good points! I was originally thinking about the utility of the content identifier being usable to find the content provider, but in the current scheme the content identifier can only be used to find the single original content provider, as opposed to any content provider that can serve the content.

