livepeer / go-livepeer

Official Go implementation of the Livepeer protocol
http://livepeer.org
MIT License

Add webRTC ingest #2798

Open · chrishobcroft opened this issue 1 year ago

chrishobcroft commented 1 year ago

Summary

While working with MistServer, we have been experiencing what it feels like to give a Publisher a way to publish content to a livestream network directly from a browser app. And it feels interesting.

This issue is to propose to add webRTC ingest to Livepeer Broadcaster, as shown below:

[Diagram: proposed webRTC ingest into the Broadcaster, with EIP-712 signing keys shown in every component]

Such an approach would unlock ways to integrate the EIP-712 signing keys (shown in every component of the picture), and expand the options for funding the end-to-end workflow.

Reference

We could perhaps look at pion/webrtc, which appears to be actively maintained at the time of writing.
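
As a rough sketch of what a pion-based ingest path on the B might look like, the following accepts an SDP offer from a browser publisher over HTTP and reads the incoming RTP packets. The endpoint, handler name, and hand-off to the transcode pipeline are all hypothetical; this only illustrates the pion/webrtc API shape, not how go-livepeer would actually structure it.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"

	"github.com/pion/webrtc/v3"
)

// handleOffer is a hypothetical handler: the browser POSTs an SDP offer,
// the Broadcaster answers it and then receives the publisher's media over RTP.
func handleOffer(w http.ResponseWriter, r *http.Request) {
	var offer webrtc.SessionDescription
	if err := json.NewDecoder(r.Body).Decode(&offer); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	pc, err := webrtc.NewPeerConnection(webrtc.Configuration{})
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	// Incoming audio/video tracks from the browser publisher.
	pc.OnTrack(func(track *webrtc.TrackRemote, _ *webrtc.RTPReceiver) {
		log.Printf("got track: %s (%s)", track.ID(), track.Kind())
		for {
			pkt, _, err := track.ReadRTP()
			if err != nil {
				return
			}
			_ = pkt // hand RTP packets to the existing segmenter / transcode pipeline
		}
	})

	if err := pc.SetRemoteDescription(offer); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	answer, err := pc.CreateAnswer(nil)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	// Wait for ICE gathering so the answer contains candidates.
	gatherDone := webrtc.GatheringCompletePromise(pc)
	if err := pc.SetLocalDescription(answer); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	<-gatherDone

	_ = json.NewEncoder(w).Encode(pc.LocalDescription())
}

func main() {
	http.HandleFunc("/webrtc/offer", handleOffer)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```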

Alternative considered

To run an additional service, such as Mist, in front of the Broadcaster, to:

  1. ingest webRTC, translate it to RTMP, and forward it to the B
  2. pass signatures back and forth

This option is still under consideration; however, the main approach (above) is preferred as long as Mist does not contain any infrastructure for transacting.

leszko commented 1 year ago

I think it'd be nice to have webRTC support in the go-livepeer broadcaster.

@Thulinma @iameli @mjh1 @0xcadams interested to hear your thoughts.

thomshutt commented 1 year ago

In terms of strategy, we've been moving more towards "Mist (or similar) is always sitting in front of the Broadcaster", and transitioning the Broadcaster to be focused just on managing the network and coordinating work, rather than also bundling video capabilities.

chrishobcroft commented 1 year ago

Hey @thomshutt when you say "strategy" and "we've" are you talking about Livepeer Inc. strategy, or a broader Livepeer Community strategy?

I only ask because these things appear to be divergent in some ways - one group seeking to gain institutional customers and the other group with a sole focus on individual users.

The group with a sole focus on individuals is building things which interact with public blockchains (where an individual user can act with full autonomy), so since go-livepeer is "chain aware" and Mist isn't, it might be worth taking these things into account.

hthillman commented 1 year ago

It's a reference to an architectural pattern rather than any interest group-specific strategy. While Inc has been leading implementation for this architectural pattern, I wouldn't read anything into it other than a preference for separating concerns.

Rock-solid media server functionality is critical to delivering video, and a clean architecture helps enable that.

chrishobcroft commented 1 year ago

I agree re: clean architectures. I'm all for keeping things simple, and subtracting unnecessary components from the architecture.

Adding webRTC to the B, enabling it to interface directly with a webRTC publisher app in browser, potentially with "connect wallet" attached, would open substantial scope to start experimenting with things like:

And Mist is nice and all: it provides great inspiration, and if you guys enjoy working with it, it's a powerful mechanism. As long as it remains chain-oblivious, though, it will struggle to find a place in a web3 stack.

Thulinma commented 1 year ago

I've been working on a proposal for a next-generation media transport between the various nodes. A draft can be found here: https://hedgedoc.ddvtech.com/s/yvtLsqLcd

The basic ideas of the plan are:

Most of this could be executed step by step without breaking any existing setup. In the end, though, we end up not just with a more direct and modern media transport, but also one that is web-native-capable (both WebRTC and WebSockets are natively available in all modern browsers, after all). That opens up the path to browser-based transcoding nodes and the like, as well. 🧐
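
Not taken from the linked draft, but purely as an illustration of what a web-native transport endpoint on a Go node could look like, here is a minimal WebSocket ingest sketch using the gorilla/websocket package. The route, handler name, and the idea of streaming binary media chunks are assumptions made for the example.

```go
package main

import (
	"log"
	"net/http"

	"github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{
	// Illustration only: a real endpoint would restrict allowed origins.
	CheckOrigin: func(*http.Request) bool { return true },
}

// handleIngest is a hypothetical endpoint: a browser publisher opens a
// WebSocket and streams binary media chunks (e.g. fMP4 fragments) to the node.
func handleIngest(w http.ResponseWriter, r *http.Request) {
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		log.Printf("upgrade failed: %v", err)
		return
	}
	defer conn.Close()

	for {
		msgType, chunk, err := conn.ReadMessage()
		if err != nil {
			return
		}
		if msgType != websocket.BinaryMessage {
			continue
		}
		_ = chunk // hand the chunk to the segmenter / transcode pipeline
	}
}

func main() {
	http.HandleFunc("/ws/ingest", handleIngest)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```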

chrishobcroft commented 1 year ago

Thanks @Thulinma for sharing. Looks really interesting, probably this is the future.

Question on this bit:

bypassing B/O nodes entirely

Does this include bypassing the Livepeer Transcoding Network entirely?

Thulinma commented 1 year ago

Question on this bit:

bypassing B/O nodes entirely

Does this include bypassing the Livepeer Transcoding Network entirely?

No - the media data would go directly from source to transcoder and back, while only metadata about the media (signed hashes, basically) flows to the broadcaster and orchestrator. That way all guarantees can be upheld without needing to waste B/O bandwidth on transporting media that these nodes really don't need to be touching. For backwards compatibility, they would still have the capability to transport media anyway, but it would be optional. In the future, there could be strictly no-transport B/O nodes that don't implement media transport at all, but those would of course no longer be backwards compatible as a result.
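
A minimal sketch of the "signed hashes" idea, assuming the metadata is simply a SHA-256 digest of each segment signed with an ECDSA key: the B/O only ever handles the digest and signature, never the media bytes. The helper name and the use of P-256 here are illustrative; the real scheme would presumably use EIP-712-style signing as mentioned earlier in this thread.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
	"math/big"
)

// signSegment hashes a (transcoded) segment and signs the digest, so that a
// B/O node can check the work against the signed hash without ever carrying
// the media bytes itself. Hypothetical helper for illustration only.
func signSegment(key *ecdsa.PrivateKey, segment []byte) (digest [32]byte, r, s *big.Int, err error) {
	digest = sha256.Sum256(segment)
	r, s, err = ecdsa.Sign(rand.Reader, key, digest[:])
	return
}

func main() {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	segment := []byte("pretend these are transcoded segment bytes")

	digest, r, s, err := signSegment(key, segment)
	if err != nil {
		panic(err)
	}
	// The verifier only ever sees (digest, r, s) and the public key, not the segment.
	fmt.Printf("digest=%x verified=%v\n", digest, ecdsa.Verify(&key.PublicKey, digest[:], r, s))
}
```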

chrishobcroft commented 1 year ago

Got it, thank you for clarifying. The multi-hop between Publisher and Transcoder (P to B to O to T to O to B to Consumer) does appear somewhat indirect when you put it like that.

Question on this:

media data would go directly from source to transcoder and back

Do you then imagine that, ultimately, the source (Publisher) will distribute the content themselves, or is this then a question of them engaging a distribution service, and marshalling the data (with associated bandwidth costs) to/from the T and on to the distribution service?

I also wonder how you view the potential addition of services for an Orchestrator to orchestrate, such as "recogniser" or "archiver" or "distributor", or other desirable functionality that offloads heavy workloads from the Publisher. Or do you imagine this will all be done by the T anyway, since it has the content and a GPU?

Thulinma commented 1 year ago

In that case, media would flow P->T->D (distributor), or P->T->P->D. Either one of those is better than P->B->O->T->O->B->D 😁. In most cases I've personally worked with, the P and D are the same entity, hence why I made that simplification, but you're totally right that the end destination need not be the same as the origin of the data at all. In general you usually don't want to go straight to distribution, because it gives very limited ability to "check" the transcode before anything goes live, and makes it a lot harder to retry/resume transcoding at a different node without dropping distribution. (Not impossible, though...)

I envision that nodes will specialize more in their specific tasks, so a transcoder would only transcode and do very little else. That does indeed leave the door open for orchestrators to orchestrate potential further services in the same way, but that's likely still a ways off in terms of timeline. It does make it a lot easier to add these kinds of things later, as the task itself need not necessarily be understood by the orchestrator anymore. 🧐

chrishobcroft commented 1 year ago

@Thulinma I don't agree that it is a ways off. We are already actively integrating another service for the O to orchestrate: primarily an Archiver, and speculatively a Distributor, both via the same service. It is still early days, but results appear positive. So we are now faced with a choice between P->B->O->T->D or P->B->O->T->O->D. Wdyt? (And I think we are still on-topic, right?)

cyberj0g commented 1 year ago

I'm getting déjà vu reading this. We already tried (successfully) to eliminate B and O from video streaming, but there are major blockers from a decentralization perspective. In short, the nodes need to operate in a trustless manner, while any kind of hash exchange implies trusting the hashes. More info here.