livepeer / go-livepeer-basicnet

Basic p2p video streaming for Livepeer

Broadcaster Initiated Connections #34

Open j0sh opened 6 years ago

j0sh commented 6 years ago

Rationale

Handling failure cases in the current implementation is difficult (e.g., transcoder address changes, broadcaster unavailability), and this poses challenges for designing large-scale systems.

Broadcasters are overly exposed to the network (their Node ID is public), while transcoders, as the providers of infrastructure, could afford to be more exposed. This proposal reverses that dynamic.

This proposal solves these issues simultaneously. The specific challenges in the current network protocol are elaborated below, in the context of the proposal's benefits.

Proposal

For reference on the current transcoder behavior, see the proposal at https://github.com/livepeer/go-livepeer-basicnet/issues/21#issuecomment-369310870. In summary:

Benefits, as related to the role of the transcoder

Given the limited pool size, transcoder operators are likely to run multiple physical nodes to accommodate higher demand. A given transcoder Eth address could correspond to any number of nodes.
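To sketch how that could look (hypothetical types in Go, not the actual basicnet API): one on-chain Eth address fronts a private pool of worker nodes, and the connection info handed back per request is an operational detail the operator controls.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// NodeInfo is a hypothetical record for one physical transcoder node.
type NodeInfo struct {
	ID   string // p2p node ID
	Addr string // dialable address
}

// TranscoderPool maps one on-chain Eth address to many physical nodes.
// The Eth address is the only identity the protocol needs to know about;
// the node list is a private operational detail.
type TranscoderPool struct {
	EthAddr string
	nodes   []NodeInfo
	next    uint64
}

// Assign returns connection info for one node, round-robin. An operator
// could instead pick by load, region, etc. -- the broadcaster never
// needs to know how the choice was made.
func (p *TranscoderPool) Assign() NodeInfo {
	i := atomic.AddUint64(&p.next, 1)
	return p.nodes[i%uint64(len(p.nodes))]
}

func main() {
	pool := &TranscoderPool{
		EthAddr: "0xTranscoder...", // placeholder Eth address
		nodes: []NodeInfo{
			{ID: "QmNodeA", Addr: "10.0.0.1:8935"},
			{ID: "QmNodeB", Addr: "10.0.0.2:8935"},
		},
	}
	for i := 0; i < 3; i++ {
		fmt.Println(pool.Assign().Addr)
	}
}
```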

Benefits, as related to the role of the broadcaster

The broadcaster knows exactly when it will need a transcoder. Let the broadcaster drive the connection, and take the onus of initiating the job off the transcoder.
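A minimal sketch of that flow, assuming some way to resolve the transcoder's Eth address to dialable connection info (all names here are illustrative, not the actual basicnet API):

```go
// Hypothetical broadcaster-side flow; names are illustrative only.
package broadcast

type Segment struct {
	SeqNo int
	Data  []byte
}

// Registry resolves a transcoder's Eth address to dialable connection
// info (e.g. published on-chain or via a side channel).
type Registry interface {
	Lookup(ethAddr string) (addr string, err error)
}

// Transcoder is a connection the broadcaster opened itself.
type Transcoder interface {
	SendSegment(seg Segment) error
	Close() error
}

func runJob(reg Registry, dial func(addr string) (Transcoder, error),
	transcoderEth string, segs <-chan Segment) error {
	addr, err := reg.Lookup(transcoderEth)
	if err != nil {
		return err
	}
	// The broadcaster initiates; its own address is never advertised.
	t, err := dial(addr)
	if err != nil {
		return err
	}
	defer t.Close()
	for seg := range segs {
		if err := t.SendSegment(seg); err != nil {
			return err // broadcaster can re-Lookup and redial on failure
		}
	}
	return nil
}
```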

Additional Future Potential

dob commented 6 years ago

Nice proposal Josh. Thanks for the writeup.

I think one of the initial thoughts behind attempting to hide transcoder addresses was that, because transcoders need to persist while constantly running jobs for multiple parties, hiding them would make them less susceptible to spam and DDoS. But I don't know if that actually holds: if the address is exposed to the peers who are broadcasting anyway, then it is revealed and can be published and DDoS'ed regardless.

I like the benefit that you mention of a transcoder being able to load balance across many nodes by providing different connection information.

With regards to redundancy and failure states, @f1l1b0x proposed that the network should expect failures and have resiliency built in. His suggestion was that you actually have n (5?) transcoders encoding each segment, and use HLS "backup segments" in the playlist so that players know how to request a segment from a backup source if the original source isn't serving it in time. (A separate issue from what you're proposing, but I wanted to bring it up because it's a different way of thinking about things.)
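For reference, HLS already supports this via redundant variant streams: listing the same rendition more than once in the master playlist tells spec-compliant players to fail over to the next entry when the current source stops serving segments. A minimal sketch (hostnames made up):

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=1280000,RESOLUTION=1280x720
https://transcoder-a.example.com/stream/720p.m3u8
# Redundant entry: same rendition from a second transcoder; players
# fall back to it if transcoder-a stops serving segments in time.
#EXT-X-STREAM-INF:BANDWIDTH=1280000,RESOLUTION=1280x720
https://transcoder-b.example.com/stream/720p.m3u8
```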

I still like the eventual goal for the network topology and decentralized routing: any node on the network can request content from "the network", and the routing scheme routes the request toward its source, like Chord or Kademlia. This proposal seems to make sense for now, though, with our direct connections between transcoder and broadcaster, which is likely the more efficient (if not more resilient) setup. If we did have multiple transcoders but the source didn't have enough bandwidth to serve all of them over direct connections, we'd have to introduce a relay or p2p-based delivery scheme in here somewhere.

j0sh commented 6 years ago

the network should expect failures and have resiliency built in. His suggestion was that you actually have n (5?) transcoders encoding each segment

Curious how the incentives for that would work. Sounds like a good topic for another discussion.

If we did have multiple transcoders, but the source didn't have enough bandwidth to serve to all of them on direct connections

Yeah, this brings up a lot of questions, which aren't necessarily related to this specific issue.

As-is, this proposal is really meant to accommodate the current blockchain protocol with a minimal number of changes (none, I think), while potentially giving us some escape hatches for the future.

I still like the eventual goal of the network topology and decentralized routing to be that any node on the network can request content from "the network", and the routing scheme will route the request towards its source like chord or kademlia.

DHT-style lookups can be useful for finding a relay node. I do wonder if we can also factor in other metrics, such as ping time/distance, relay load, etc. I'm not sure we actually want to relay content along the same route, though; this is the difference between providing the "discovery service" and actually performing the "relay service". Once a relay is found, the subscriber should pull directly from it. Otherwise we add another hop to the media route, and burden upstream peers with a potentially unbounded number of subscriptions from downstream peers.
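A rough sketch of that separation (hypothetical types; a real lookup would go through something like a Kademlia DHT): the lookup returns relay candidates annotated with metrics, the subscriber ranks and picks one, and then dials it directly so the media never traverses the lookup path.

```go
package discovery

import "sort"

// RelayCandidate is a hypothetical result of a DHT lookup for a
// stream: contact info plus self-reported or measured metrics.
type RelayCandidate struct {
	Addr      string
	RTTMillis int // measured ping time
	Load      int // e.g. current subscriber count
}

// pickRelay ranks candidates by a simple weighted score. The weighting
// is arbitrary here; the point is that selection happens once, at
// discovery time, and is decoupled from the media path.
func pickRelay(cands []RelayCandidate) (RelayCandidate, bool) {
	if len(cands) == 0 {
		return RelayCandidate{}, false
	}
	sort.Slice(cands, func(i, j int) bool {
		return cands[i].RTTMillis+10*cands[i].Load <
			cands[j].RTTMillis+10*cands[j].Load
	})
	return cands[0], true
}

// The subscriber would then dial the winner's Addr directly and
// subscribe, rather than pulling the stream back through the DHT route.
```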

dob commented 6 years ago

Yes, definitely a lot of problems to be solved around coordination and orchestration if we spread out the transcoding.

In general, it seems like the eventual strategy for p2p content delivery should be that DHT-style routing is reserved for finding the tracker, which performs the coordination. Then, when the tracker is passed to a subscriber, the subscriber joins another p2p overlay network for torrent-style content delivery. Again... a little further off though; let's stay focused on the short-term broadcaster-initiated connections.
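For future reference, here's a rough sketch of that two-phase shape (all interfaces hypothetical):

```go
package overlay

// Hypothetical two-phase flow: the DHT is used once, to locate the
// tracker for a stream; all subsequent coordination and delivery
// bypass the DHT entirely.

type DHT interface {
	// FindTracker resolves a stream ID to its tracker's address.
	FindTracker(streamID string) (trackerAddr string, err error)
}

type Tracker interface {
	// Peers returns members of the swarm currently serving streamID.
	Peers(streamID string) ([]string, error)
}

func joinSwarm(dht DHT, connect func(addr string) (Tracker, error),
	streamID string) ([]string, error) {
	addr, err := dht.FindTracker(streamID) // phase 1: DHT lookup
	if err != nil {
		return nil, err
	}
	tr, err := connect(addr)
	if err != nil {
		return nil, err
	}
	// phase 2: form a direct overlay with the returned peers and
	// exchange segments torrent-style.
	return tr.Peers(streamID)
}
```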