libp2p / specs

Technical specifications for the libp2p networking stack
https://libp2p.io
1.56k stars 273 forks source link

feat(webrtc): add WebRTC (prev. browser-to-browser) spec #497

Closed mxinden closed 1 year ago

mxinden commented 1 year ago

The libp2p WebRTC (prev. browser-to-browser) specification.

Open items:

Menduist commented 1 year ago

Isn't this "any-to-browser"? "Servers" will also be able to use the same mechanism to dial browsers, right?

thomaseizinger commented 1 year ago

Isn't this "any-to-browser"? "Servers" will also be able to use the same mechanism to dial browsers, right?

Even two servers could use if for some reason they wanted to use WebRTC to communicate to each other, right?. Isn't it more like dcutr for WebRTC?

mxinden commented 1 year ago

Isn't this "any-to-browser"? "Servers" will also be able to use the same mechanism to dial browsers, right?

Even two servers could use if for some reason they wanted to use WebRTC to communicate to each other, right?. Isn't it more like dcutr for WebRTC?

Agreed with both of you @Menduist and @thomaseizinger. That said, I don't think "any-to-any" or "direct-webrtc-connection-upgrade-through" is very intuitive. Thus I suggest we stick with browser-to-browser, as it is the most intuitive name I can think of.

mxinden commented 1 year ago

@marten-seemann thanks for the review.

https://github.com/libp2p/specs/pull/497/commits/29d166b92b6fa4b446a1c11fcc0e0d127086232f addresses some of your comments. Still lots more to come.

Menduist commented 1 year ago

Maybe just "webrtc-holepunching"? I'd rather have something correct than something catchy (and B2B isn't that catchy anyway, since server to browser is as important as B2B, imo)

thomaseizinger commented 1 year ago

I'm not sure what we should use as the protocol id. webrtc-direct is already in use, but I have been using webrtc-peer in my experiments.

Is it? The protocols.csv table only mentions /p2p-webrtc-direct: https://github.com/multiformats/multiaddr/blob/master/protocols.csv

Is it too confusing to use /webrtc-direct now, i.e. just dropping the /p2p prefix?

mxinden commented 1 year ago

I'm not sure what we should use as the protocol id. webrtc-direct is already in use, but I have been using webrtc-peer in my experiments.

Is it? The protocols.csv table only mentions /p2p-webrtc-direct: https://github.com/multiformats/multiaddr/blob/master/protocols.csv

Unless I am missing something, the discussion about protocol names referred to the name of the stream protocol, not the name of the Multiaddr protocol component.

That said, you are raising a good point, namely whether we need a Multiaddr protocol for this specification as well. The rational is documented in https://github.com/libp2p/specs/pull/497/commits/78fae2669f726da0b123e7c838e3e00b0f6eabd2:

  • Do we need a mechanism for browsers to advertise support for WebRTC browser-to-browser?

    Say that browser B supports WebRTC browser-to-browser. B listens via a relay and advertises its relayed address. A discovers B's relayed address. At this point A does not know whether B is a browser and thus supports WebRTC browser-to-browser, or whether B is e.g. a laptop potentially supporting TCP and QUIC hole punching via DCUtR but not WebRTC browser-to-browser. In the latter case, A can not establish a direct connection to B.

    Potential solution would be for B to advertise some protocol after the /p2p-circuit within its Multiaddr, e.g. /ip6/<RELAY_IP>/udp/4001/p2p/<RELAY_PEER_ID>/p2p-circuit/webrtc-direct/p2p/<B_PEER_ID>. As an alternative, A can discover B's support via the identify protocol on the relayed connection or by optimistically opening a stream using the signaling protocol. Both of the latter options would on failure happen at the expense of a wasted relayed connection.

mxinden commented 1 year ago

STUN

If I understand browser WebRTC correctly, it is required that browsers use a STUN server. The spec is completely silent about this. At the very least, we should acknowledge that implementations will need to pick a STUN server, and leave it up to the implementation to pick one (or to expose a configuration option).

Most people would then probably Google's public (and free) STUN service, as this seems to be the most popular service around. This might be considered problematic, as we might not want to have a large fraction of STUN traffic sent to Google. We could also mention that setting up a dedicated STUN server yourself (as a project maintainer) is within the realm of possibilities, and we might even consider doing that for the IPFS network.

The trade-offs need to be mentioned in the specification.

badcb1f details the usage of STUN, the availability of existing public servers and the optional deployment of servers for specific libp2p deployments. I don't think we need to expand on the trade-offs of the various public services (e.g. Google's STUN servers) in this specification as that very much depends on the specific libp2p deployment (network).

mxinden commented 1 year ago

TURN

Not every (WebRTC-initiated) hole punching attempt will be successful. Many video-conferencing tools therefore set up TURN servers to help those 5-15% of unlucky users connect throught that TURN server. This is a problem, since we (as PL) can't and won't provide a (centralized) TURN server.

If you need a TURN server or not depends very much on the use case. It's useful for video conferencing, but not so much for WebTorrent, for example. I found the discussion of the tradeoffs invovled in this podcast episode very interesting.

Anyway, this should be mentioned and discussed in this specification.

I don't think this specification should detail options on direction connection establishment failure. In my eyes it should stay focused on WebRTC direct connection establishment only. Whether to use a relayed connection, fail entirely, or communicate through some other means is deployment (network) dependent. This is consistent with the DCUtR specification that does not expand on the usage of the relayed connection after failure either.

mxinden commented 1 year ago

Authentication

The SDPs are exchanged over an (authenticated) libp2p connection, and they contain the certificate hash of their respective endpoint (not sure if that's what Chinmay was pointing out). So technically, there’s no need to run a Noise handshake, given that the browser checks the certificate hashes. Nodes can just “transfer” pubkeys / peer IDs from that connection to the newly established WebRTC connection and expose them on the connection API.

:+1: this is described in 35841c3be6b44e3a93d9982838f6eda4e571b9ef.

mxinden commented 1 year ago

I addressed all comments. Thanks everyone for the review. Please take another look.

At this point I consider this specification ready for general review. Missing pieces:

As always, I expect there to be some unknown unknowns, which we will only discover once we have a working implementation.

mxinden commented 1 year ago

A note on the structure of this document and the previous WebRTC browser-to-server document. I plan to move the latter into a separate file, linking to browser-to-browser and browser-to-server from the main README.md. I suggest we do so right before merging here to keep the git diff small during review.

mxinden commented 1 year ago

Status Update

Open discussions: codepoint

I would consider neither of the two a blocker for merging here. Please comment in the above linked discussions in case you think otherwise.

That said, I am not suggesting to merge yet. Let's give folks more time for reviews / input. Also, let's wait for implementation(s) to catch up to the current state of the specification, thus providing feedback.

vmx commented 1 year ago

I currently work on getting brower-to-browser connections for browsers within the same network without any intermediate server work. This is very different from this proposal and a very niche/special case. The reason I bring it up is, that here are lots of discussions about naming things. As I'll also need new Multiaddress protocol names, I mention it here, in case there is overlap. Currently I don't think there really is any overlap, but better be safe than sorry.

My idea is that only Multiaddresses would be needed to dial each other. The browser would need to learn about each other addresses somehow out of band (e.g. a QR-code or so). The Multiaddresses are basically their ICE-candidates together with their certhash. Once both sides have the Multiaddress(es) of the other side, they will both try to dial/connect.

And here's the catch: in WebRTC you have one peer that initiates the connection and one that is receiving it. Therefore the Multiaddresses somehow need to signal whether it is one that expect to initiate or to receive a connection. That's the information I need to put into the Multiaddress somehow, where there's currently no protocol identifier for. In my prototype I currently just append /webrtc-initiator and /webrtc-receiver to the Multiaddress, but I guess there might be better ways (I'm not that well versed in Multiaddresses).

ckousik commented 1 year ago

@mxinden Does webrtc-direct here replace https://github.com/multiformats/multicodec/blob/master/table.csv#L104 and the previous webrtc-direct implementation? I've created a pull request in multiaddr for a separate webrtc-direct protocol https://github.com/multiformats/js-multiaddr/pull/309

mxinden commented 1 year ago

@mxinden Does webrtc-direct here replace https://github.com/multiformats/multicodec/blob/master/table.csv#L104 and the previous webrtc-direct implementation?

Note that the previous js-libp2p-webrtc-direct implementation depends on preceding signaling over an established HTTP connection to the same node. Thus requiring the remote to be public. With that in mind I would rather compare it to the browser-to-server protocol rather than the brower-to-browser protocol.

I've created a pull request in multiaddr for a separate webrtc-direct protocol multiformats/js-multiaddr#309

Thanks. Note that this is only needed for advertising support for the browser-to-browser protocol. See https://github.com/libp2p/specs/pull/497#discussion_r1083223169. Unless I am missing something, this is not needed for the first implementation of the browser-to-browser protocol in js-libp2p. For now, until we reach consensus in the just mentioned discussion, you can depend on option two:

As an alternative, A can discover B's support via the identify protocol on the relayed connection or by optimistically opening a stream using the signaling protocol.

That said, I don't have a strong opinion, and thus fine building the first version with the multiaddr /webrtc-direct protocol component that you are adding via https://github.com/multiformats/js-multiaddr/pull/309.

0X-JonMichaelGalindo commented 1 year ago

STUN

If I understand browser WebRTC correctly, it is required that browsers use a STUN server. The spec is completely silent about this. At the very least, we should acknowledge that implementations will need to pick a STUN server, and leave it up to the implementation to pick one (or to expose a configuration option).

Most people would then probably Google's public (and free) STUN service, as this seems to be the most popular service around. This might be considered problematic, as we might not want to have a large fraction of STUN traffic sent to Google. We could also mention that setting up a dedicated STUN server yourself (as a project maintainer) is within the realm of possibilities, and we might even consider doing that for the IPFS network.

The trade-offs need to be mentioned in the specification.

badcb1f details the usage of STUN, the availability of existing public servers and the optional deployment of servers for specific libp2p deployments. I don't think we need to expand on the trade-offs of the various public services (e.g. Google's STUN servers) in this specification as that very much depends on the specific libp2p deployment (network).

(tldr; The libp2p peers themselves can provide STUN and TURN to browsers. Since WebTransport doesn't replace WebRTC, this should be addressed eventually.)

  1. Any (non-browser) node with a public IP can provide STUN | TURN.
  2. Any peer can determine (via STUN) whether it has a public IP.

Any libp2p browser node can connect (bootstrap + discovery) to peers, some of whom have public IPs, and can therefore provide that browser with a STUN | TURN url, no domain name required. (TURN is a bandwidth-heavy process; it needs to be distributed, not centralized.)

As far as I can tell, this means nodes can and should provide native holepunching to their peers, doesn't it?

And that might be as simple as a coturn dependency. (simple compared to re-implementing the STUN / TURN protocols from scratch.)

But I should add that I really want libp2p's browser-to-browser support up and running ASAP, so maybe this is best left for future improvements? It could ship right now "ignoring" the problem (and we all use Google's STUN servers), then in the future, that centralized reliance can be replaced by the p2p network.

mxinden commented 1 year ago

With the last two commits, all remaining discussions are addressed. Do you want to take another look @MarcoPolo @achingbrain @marten-seemann @ckousik @ddimaria @thomaseizinger @Menduist?

Menduist commented 1 year ago

Can I bikesched again about renaming this "WebRTC hole-punching"? Browser to browser is just one of the 3 useful use cases here (Private node to browser, browser to private node, and browser to browser. Even private node to private node with webrtc might be useful in some cases)

thomaseizinger commented 1 year ago

Can I bikesched again about renaming this "WebRTC hole-punching"? Browser to browser is just one of the 3 useful use cases here (Private node to browser, browser to private node, and browser to browser. Even private node to private node with webrtc might be useful in some cases)

I am in favor of the hole-punching terminology.

mxinden commented 1 year ago

Can I bikesched again about renaming this "WebRTC hole-punching"? Browser to browser is just one of the 3 useful use cases here (Private node to browser, browser to private node, and browser to browser. Even private node to private node with webrtc might be useful in some cases)

I am in favor of the hole-punching terminology.

@Menduist are you fine with WebRTC W3C as suggested by @thomaseizinger? The here described connection establishment flow follows the W3C specifications, among others implemented by all browsers, but thus no longer implying browsers-only.

The existing WebRTC browser-to-server specification does not follow the W3C specifications (no signaling, misuse of STUN, SDP mangling), thus the new name for the browser-to-browser specification (i.e. WebRTC W3C) stresses the contrast.

mxinden commented 1 year ago

With the last two commits, all remaining discussions are addressed. Do you want to take another look @MarcoPolo @achingbrain @marten-seemann @ckousik @ddimaria @thomaseizinger @Menduist?

There has been no discussions beyond naming for a week. Can I interpret that as general agreement with the specification? If so, would you mind giving this pull request an approval?

marten-seemann commented 1 year ago

Sorry for the late reply. As I've described in https://github.com/multiformats/multiaddr/pull/150#issuecomment-1457262773, I don't think we should call this webrtc-w3c. Happy to continue the discussion there or here, as you prefer.

p-shahi commented 1 year ago

Sorry for the late reply. As I've described in https://github.com/multiformats/multiaddr/pull/150#issuecomment-1457262773, I don't think we should call this webrtc-w3c. Happy to continue the discussion there or here, as you prefer.

I don't have a strong opinion for the multiaddr but as I was writing the corresponding docs it was clear that the spec name is obfuscating the feature. It will be confusing to a layperson/new developer. Maybe call the spec name WebRTC for Private Nodes?

mxinden commented 1 year ago

Sorry for the late reply. As I've described in multiformats/multiaddr#150 (comment), I don't think we should call this webrtc-w3c. Happy to continue the discussion there or here, as you prefer.

Thanks for cross-posting here. Let's keep the discussion on https://github.com/multiformats/multiaddr/pull/150.

mxinden commented 1 year ago

For the record, https://github.com/libp2p/specs/pull/497/commits/25e5774eff4a1268232faefba32aacdc444e83a8 does the renames discussed in https://github.com/multiformats/multiaddr/pull/150/#issuecomment-1460912728.

mxinden commented 1 year ago

With the last two commits, all remaining discussions are addressed. Do you want to take another look @achingbrain @marten-seemann @ckousik @ddimaria @thomaseizinger @Menduist?

There has been no discussions beyond naming for a week. Can I interpret that as general agreement with the specification? If so, would you mind giving this pull request an approval?

Friendly ping. Please add your approvals.

mxinden commented 1 year ago

Thanks for the reviews @thomaseizinger and @marten-seemann.

@marten-seemann mind taking another look?

mxinden commented 1 year ago

Friendly ping @marten-seemann.

BigLep commented 1 year ago

@mxinden : what's remaining given https://github.com/libp2p/js-libp2p-webrtc/pull/90 was merged and there are approvals here?

mxinden commented 1 year ago

With /webtransport, /webrtc-direct and here /webrtc we now have all protocols specified that are needed for universal libp2p connectivity, more specifically integrating the browser. :tada: thanks to everyone for making this possible :tada: