Closed mxinden closed 1 year ago
I see several discussion about browser-to-browser. For the sake of shipping things, I would suggest to keep the scope of this RFC to browser-to-server, which is a way less complicated problem.
Note that both within Protocol Labs as well as Little Bear Labs (@ckousik and @lightsofapollo) our priority is on browser-to-server. Based on the above and previous conversations this is the same for Parity (@melekes and @tomaka).
Thus I think we are aligned here.
I would suggest to keep the scope of this RFC to browser-to-server
I am fine discussing both here at this point in time. That said, once the browser-to-server specification part is solid, I am happy to proceed and merge, marking the browser-to-browser part as an early draft only. In other words, I don't think the browser-to-browser work should block the browser-to-server work.
@mxinden @marten-seemann Summarizing our conversation, please note the following:
The STUN spec has security considerations here: https://datatracker.ietf.org/doc/html/rfc5389#section-16.1.2
A rogue client may try to launch a DoS attack against a server by sending it a large number of STUN requests. Fortunately, STUN requests can be processed statelessly by a server, making such attacks hard to launch.
The problem here is that STUN messages are not processed statelessly, and messages from a previously unknown address are treated as a new peerconnection, which will take up resources on the server.
@mxinden @marten-seemann Given the above mentioned issue, @lightsofapollo and I currently have a couple of recommendations:
A rogue client may try to launch a DoS attack against a server by sending it a large number of STUN requests. Fortunately, STUN requests can be processed statelessly by a server, making such attacks hard to launch.
The problem here is that STUN messages are not processed statelessly, and messages from a previously unknown address are treated as a new peerconnection, which will take up resources on the server.
Can you expand on the data that needs to be cached by the server between the initial STUN message and the actual connection attempt? For the browser-to-server use-case what information does the server need from the initial STUN message send by the client @ckousik?
The problem here is that STUN messages are not processed statelessly,
Is that even possible unless you ignore them completely?
NOTE: https://github.com/webrtc-rs/ice/blob/0365dbc3ac739495570de8a69f7e8020a70de4a2/src/udp_mux/mod.rs#L175-L211 webrtc-rs/ice library also process STUN messages statefully.
A rogue client may try to launch a DoS attack against a server by sending it a large number of STUN requests. Fortunately, STUN requests can be processed statelessly by a server, making such attacks hard to launch.
The solution might be
1) rate limit the number of messages per IP (as noted here) 2) limit the number of new addresses "in flight" (currently 1)
This would require a trusted relay to be present
I'd like to note that at Parity, our entire use case is to avoid requiring a trusted relay. If a trusted relay is mandatory, then we have no interest in WebRTC anymore.
@mxinden The spec itself refers to simple STUN servers used mostly for finding the public IP, in which case processing is stateless. In the case of this spec, the stateful operations are described below.
@melekes True, this attack cannot be avoided unless the packets are dropped. The STUN spec refers to STUN-only servers and not ICE implementations. The difference between webrtc-rs and the current spec is as follows:
webrtc-rs case:
When a new STUN message is received on the muxed UDP, the muxer will check to see if there is an existing PeerConnection
object based on the source address, failing which it will check based on the ufrag contained within the STUN message. If no existing connection is found based on either of these criteria, the packet is dropped. Therefore, if a malicious client generates and sends a STUN message to the port without a prior SDP exchange, the packets will simply be dropped.
current spec:
With reference to this implementation, consider the case where a malicious client sends multiple STUN messages with different ufrags from different ports. For each message, a new PeerConnection
instance is created alongside all the timers, tasks, etc. assosciated with it. The malicious client need not even generate a certificate locally, only a string that looks like a fingerprint and this would be enough to negotiate ICE. This consumes significantly more server resources per STUN message than the webrtc-rs case.
@tomaka As per @melekes suggestion, rate limiting should help mitigate this problem considerably at the cost of performance in certain edge cases (consider multiple legitimate users behind the same NAT trying to connect to the same libp2p node). It should be noted that this does not work for the browser-to-browser case but that is currently lower priority.
The commits above restructure the spec and add the latest discussions as open questions. Input more than welcome.
Discussed out of band, @ckousik will start implementing the current state of the specification for go-libp2p.
The most recent 5 commits track the discussions above as well as an out-of-band meeting with @ckousik, @marten-seemann and me.
@melekes and of course everyone else, please let me know in case you have any questions or concerns.
@ckousik following up on the conversation on who should initiate the Noise handshake, the browser or the server, and whether the current Rust implementation is in-line with the specification. Please see https://github.com/libp2p/rust-libp2p/pull/2622#discussion_r918934992
Let me know if that makes sense.
This came up during the IPFS þing: What are the connection options in local networks? It would be nice if we could connect two nodes on the same network by simply scanning a QR code (which might encode the SDP). Somewaht similar to what https://github.com/AquiGorka/webrtc-qr is doing.
I am assuming that the local network does not impose any firewalls or NATs between the two nodes @marten-seemann? If so, a QR code encoding the Multiaddr of node B would suffice for A to connect to B.
I am assuming that the local network does not impose any firewalls or NATs between the two nodes
It's usually all or nothing - some (usually public access) WiFi hotspots enforce client isolation so it's impossible to send traffic between two nodes directly on the same network, others will allow everything.
@marten-seemann I would love to see more exploration of what we could do with mDNS. offline-browser-communication is an example of connecting two Peers without any signaling. Some bootstrapping/pre-configuration needs to be done though.
Currently this isn't possible Browser <-> Browser
but maybe that doesn't matter to us? If libp2p finds it useful it could be the impetus to make browser vendors take this idea seriously.
@marten-seemann @mxinden
Regarding the implementation of Reset
in webrtc datachannels, please note the following:
CloseAbruptlyWithError
method on it, this is not exposed in the DataChannel interface and seems to be used only internally. Also, this method does not signal any error to the peer, and proceed with normal datachannel closure. Reference links: 1 2 3Close
is called on a datachannel, any data that is queued is sent and any further writes will be discarded [ 4 ]. If the datachannel state is not open ('connecting', 'closing', or 'closed'), incoming data is queued but never sent to the user through the observer. Therefore, after closing, we are unable to read any inflight data when it is received. It will only be queued until the queue is full, after which the datachannel will close [ 5 ]. For the record, a solution for half-close and reset of streams (https://github.com/libp2p/specs/pull/412#discussion_r932487837) is proposed via https://github.com/mxinden/specs/pull/1. Input welcome.
https://github.com/libp2p/specs/pull/412/commits/7009f943d5b9ee67cb7a19bbf8ca60549c1ad683 specifies minimum required support for sha256
and base64url
. Comments are welcome, though I suggest for these comments to be posted on https://github.com/multiformats/multiaddr/pull/130 instead of this pull request to keep the discussion in one place.
One note: there's an implicit assumption that WebRTC data channels are ordered (which is the default in webrtc-rs and Go Pion I believe), but they could also be unordered since SCTP supports both options. Should we maybe make this explicit (i.e. add smth like All opened data channels must be ordered.
)?
One note: there's an implicit assumption that WebRTC data channels are ordered (which is the default in webrtc-rs and Go Pion I believe), but they could also be unordered since SCTP supports both options. Should we maybe make this explicit (i.e. add smth like All opened data channels must be ordered.)?
Ordering is not a property of the underlying channel, ordering is a property that an API can expose. You can also have an API that gives you random access. See https://datatracker.ietf.org/doc/html/rfc9000#section-2.2 for how QUIC describes this.
Should we maybe make this explicit (i.e. add smth like
All opened data channels must be ordered.
)?
Good idea @melekes. Let me know what you think of https://github.com/libp2p/specs/pull/412/commits/48f5fe767c73cebd0d751de87d8fe7a697db1884.
Ordering is not a property of the underlying channel, ordering is a property that an API can expose. You can also have an API that gives you random access. See https://datatracker.ietf.org/doc/html/rfc9000#section-2.2 for how QUIC describes this.
:+1: once we design unordered streams for the libp2p project in general, we can revisit how to expose such functionality on WebRTC in specific.
That is not accurate. libp2p implementations use ordered streams, but the streams themselves can be unordered (QUIC streams, for example), and there's nothing in any of our protocols preventing a random access into received stream data.
I'm not sure I fully understand the
ordered
property of theRTCDataChannel
, but what I gather from https://www.w3.org/TR/webrtc/#attributes-17, this is a local property and has no effect on the wire behavior of the protocol. If that interpretation is correct, we shouldn't have a MUST here: A libp2p implementation in the browser MAY expose an ordered byte stream abstraction in its API, or it MAY expose an unordered API to the application.
https://github.com/libp2p/specs/commit/48f5fe767c73cebd0d751de87d8fe7a697db1884#r83389191
This is a fair point. I too read it as a local property only. https://github.com/libp2p/specs/pull/412/commits/9ce0d45e54f7e67a6ed4489901c8437a83687590 adjusts the specification accordingly, allowing implementations to expose an unordered API to users.
Other things that should probably be discussed here:
Certificate
messages are sent in the clear, meaning that an on-path observer can correlate connections from the same browser to different endpoints. This is a privacy risk for people taking steps to mask their IP information, e.g. by using a VPN. In any case, this spec should discuss this, and maybe even recommend using different certificates for connections to different IP addresses.Following up on an out-of-band discussion:
There were race conditions in the connection establishment process which have now been documented in comments. This was to do with setting peerconnection callbacks after the noise handshake. If one peer was faster than the other, it could initiate a stream before the remote had even set the datachannel callback.
@ckousik would you mind proposing a short paragraph to be included in the specification to prevent other implementers to introduce the same race condition?
@marten-seemann as discussed out-of-band https://github.com/libp2p/specs/pull/412/commits/88036826703881e4c6cf2ae4088a207e3f9fef48 drafts the browser-to-browser section. As mentioned previously I suggest we do not block the browser-to-server specification on the browser-to-browser specification. Instead I suggest we tackle the latter in a second iteration. I added the draft in https://github.com/libp2p/specs/pull/412/commits/88036826703881e4c6cf2ae4088a207e3f9fef48 to still make sure we are on the right track for the the browser-to-browser specification.
@mxinden : given this spec is already has a lot of comments for browser-to-server, would it maybe make sense to do browser-to-browser in another PR (on top of this one)? I'd like us to be able to close this one when we meet the requirements of having browser and server implementations that meet the browser-to-server portion of the spec.
@BigLep yes, complete browser-to-browser specification will happen in a second iteration of this specification, i.e. another pull request.
While describing both the browser-to-server and browser-to-browser use-case, priority is on the former. Once the browser-to-server use-case is sufficiently specified, I will remove the browser-to-browser use-case, merge this pull request, and start iteration two, namely to sufficiently specify the browser-to-browser use-case.
Any reference to browser-to-browser in this specification is for:
/webrtc
multiaddr component in both use-cases)I know I'm not the most diplomatic person when it comes to libp2p, but I'd like to once again complain a little bit.
We at Parity have a deadline for shipping WebRTC in production at the end of the year. In two and a half months. This is not an arbitrary self-enforced deadline that we can push back, we are victims of Google's Manifest v3 enforcement for browser extensions.
Having this in mind, I had initially created (in April!) a proof of concept and proposed a more simple design, that was apparently rejected (without really any good justification IMO, but that's besides the point here). For the benefit of the collaboration I was willing to give the benefit of the doubt to this specification instead.
This pull request has now been opened for 5 months, with no real end in sight.
The smoldot implementation for example has been ready for a long time now, only waiting for decisions to be made here so that it can align with the rust-libp2p implementation.
I personally don't think it matters at the slightest for example in which way the certificates are ordered in the Noise payload, however I am slightly frustrated that it takes months to resolve questions like these that, sorry to say, almost don't matter.
I want to remind everyone that whatever happens here, a spec written by committee will contain mistakes. You cannot write a spec first and expect it to be flawless, humans simply are not capable of that. This will need a version 2 of the WebRTC libp2p protocol no matter what.
With all that being said, can I suggest that we close all discussions, merge this as a known-to-be-flawed v1 soon (today?), then implement it, then gather back for a v2?
merge this as a known-to-be-flawed v1 soon
If you don't agree with that idea, please let us know as soon as possible so that we can start finding a solution to our WebRTC problem instead of waiting more.
I want to remind everyone that whatever happens here, a spec written by committee will contain mistakes. You cannot write a spec first and expect it to be flawless, humans simply are not capable of that. This will need a version 2 of the WebRTC libp2p protocol no matter what.
For the unknowns I have the same mindset. We have multiple mechanisms in place (multiaddr, Protobuf) that will allow us to evolve this protocol on live networks in the future.
For the knowns I do tend towards in-depth design instead of shipping things fast.
Talking about the knowns: The only outstanding item from my end is the datachannel ID discussion above. More specifically whether to use negotiated: true
and thus control channel ID reuse and safe 0.5 RTT on each new channel or whether to use negotiated: false
. See https://github.com/libp2p/specs/pull/412#discussion_r984746570. I would like to get @ckousik and @John-LittleBearLabs input here before making a decision.
Unless I am missing something, all remaining discussions (i.e. knowns) above are resolved / decided. For the record, these are also tracked in the pull request description.
With all that being said, can I suggest that we close all discussions, merge this as a known-to-be-flawed v1 soon (today?), then implement it, then gather back for a v2?
It is not going to happen today. That said, I do hope for this specification to go into final review by the end of the week.
I want to remind everyone that whatever happens here, a spec written by committee will contain mistakes. You cannot write a spec first and expect it to be flawless, humans simply are not capable of that. This will need a version 2 of the WebRTC libp2p protocol no matter what.
For the unknowns I have the same mindset. We have multiple mechanisms in place (multiaddr, Protobuf) that will allow us to evolve this protocol on live networks in the future.
For the knowns I do tend towards in-depth design instead of shipping things fast.
Talking about the knowns: The only outstanding item from my end is the datachannel ID discussion above. More specifically whether to use
negotiated: true
and thus control channel ID reuse and safe 0.5 RTT on each new channel or whether to usenegotiated: false
. See #412 (comment). I would like to get @ckousik and @John-LittleBearLabs input here before making a decision.Unless I am missing something, all remaining discussions (i.e. knowns) above are resolved / decided. For the record, these are also tracked in the pull request description.
With all that being said, can I suggest that we close all discussions, merge this as a known-to-be-flawed v1 soon (today?), then implement it, then gather back for a v2?
It is not going to happen today. That said, I do hope for this specification to go into final review by the end of the week.
@mxinden I've commented on the datachannel reuse. If there are no more discussions on this we can move this PR to final review.
I am marking this as ready for review. I feel comfortable merging this specification as is. Thus I would like to ask for a (potentially final) round of reviews from folks involved.
Note that this specification covers the browser-to-server use-case only. The browser-to-browser use-case will be specified in a second iteration based on the revert commit of https://github.com/libp2p/specs/pull/412/commits/a46919c557aa4501afb285ffbec8261a7a7e595b.
@marten-seemann, @MarcoPolo and @julian88110 does this look good to you from the Golang side?
@achingbrain what do you think from the JS side?
@ckousik and @John-LittleBearLabs any objections from Little Bear Labs?
@Menduist See any blockers for the future nim-libp2p implementation of this specification?
@melekes given that you approved, I am assuming this looks good from Parity's side.
@thomaseizinger anything you would like to see changed given your expertise with the Rust side?
Perhaps we should also say that a=max-message-size:16384
SDP attribute should be set to the maximum frame size (16kB). By default, it's 64kB (see https://datatracker.ietf.org/doc/rfc8841/), but it might be a good idea to align these two variables. Wdyt? The alternative is to not include attribute in SDP and allow messages up to 64kB.
Perhaps we should also say that
a=max-message-size:16384
SDP attribute should be set to the maximum frame size (16kB). By default, it's 64kB (see https://datatracker.ietf.org/doc/rfc8841/), but it might be a good idea to align these two variables. Wdyt? The alternative is to not include attribute in SDP and allow messages up to 64kB.
That sounds good to me @melekes. @ckousik @John-LittleBearLabs any objections?
@marten-seemann thank you for your review. I addressed all your comments.
I followed both of your major suggestions.
Would you mind taking another look?
Perhaps we should also say that
a=max-message-size:16384
SDP attribute should be set to the maximum frame size (16kB). By default, it's 64kB (see https://datatracker.ietf.org/doc/rfc8841/), but it might be a good idea to align these two variables. Wdyt? The alternative is to not include attribute in SDP and allow messages up to 64kB.That sounds good to me @melekes. @ckousik @John-LittleBearLabs any objections?
No objections from my end.
@marten-seemann thank you for your review. I addressed all your comments.
I followed both of your major suggestions.
1. Inferring hash algorithm from multiaddr for easier role out of new hash algorithm. [webrtc/: Add libp2p WebRTC browser-to-server spec #412 (comment)](https://github.com/libp2p/specs/pull/412#discussion_r999745879) 2. Prefixing ufrag to ease general future upgrade mechanism via ufrag. [webrtc/: Add libp2p WebRTC browser-to-server spec #412 (comment)](https://github.com/libp2p/specs/pull/412#discussion_r999768836)
Would you mind taking another look?
Friendly ping @marten-seemann. Does this look good to you?
I understand that you're being over cautious with clicking the merge button, but can we now consider that this version of the spec is "final", in the sense that the logic isn't going to change before merging, and the only things that could change are possible clarifications of the language? I'd like to start releasing versions that "officially" support this spec.
Quite a long journey. 475 (+1) comments and 135 commits, 14 reviewers, ...
I am merging here as a Working Draft. We can promote it once we have a released implementation.
A couple of shoutouts:
Up next: browser-to-browser use-case.
Shoutout to me for having started the whole thing before it was aggressively taken over and not receiving a shoutout
I am working a project that heavily relies on the libp2p-js impl. I am currently working towards enabling a huge amount of dapps that benefit greatly from WebRTC. What I am missing (?) currently is a way for having js WebRTC listeners. Now it seems to only be supported/to-be-supported for the the Go and Rust impl. Correct me if I have understood the spec. incorrectly, but is this something that is planned for? Or not?
Follow #497
This pull requests summarizes the status quo of the ongoing WebRTC in libp2p work. In other words, it represents the summary of all past discussions like https://github.com/libp2p/specs/issues/220.
While describing both the browser-to-server and browser-to-browser use-case, priority is on the former. Once the browser-to-server use-case is sufficiently specified, I will remove the browser-to-browser use-case, merge this pull request, and start iteration two, namely to sufficiently specify the browser-to-browser use-case.
Remaining work items for browser-to-server:
RTCDataChannel
ID negotiation https://github.com/libp2p/specs/pull/412#discussion_r984746570RTCDataChannel
ID reuse https://github.com/libp2p/specs/pull/412#discussion_r984747802RTCConnection.createDataChannel
. https://github.com/libp2p/specs/pull/412/files#r995877508