Full WebRTC signalling is not necessary

steely-glint commented 1 month ago

In the explainer you state that WebRTC requires signalling through a cloud server. This isn't strictly true. The wire protocol only requires the following data about the peer to be available:

IP address and port number
ICE ufrag and password
DTLS certificate fingerprint

This is enough to create a valid offer/answer and start a data channel connection.
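
As a rough illustration of the point above, here is a minimal TypeScript sketch that assembles a data-channel-only SDP from just those three pieces of peer data. The `PeerInfo` shape, its field names, the `buildDataChannelSdp` helper, the fixed SCTP port 5000 and the host-candidate priority are all assumptions made for this example, not something taken from the explainer. Whether a particular browser accepts exactly this SDP has not been verified here.

```typescript
// Hypothetical container for the peer data listed above (names are illustrative).
interface PeerInfo {
  ip: string;          // IPv4 address of the peer
  port: number;        // UDP port of the peer
  ufrag: string;       // ICE username fragment
  pwd: string;         // ICE password
  fingerprint: string; // SHA-256 DTLS certificate fingerprint, colon-separated hex
  setup: "active" | "passive" | "actpass"; // DTLS role the peer will take
}

// Build a data-channel-only session description for the given peer.
function buildDataChannelSdp(peer: PeerInfo): string {
  const lines = [
    "v=0",
    `o=- 0 0 IN IP4 ${peer.ip}`,
    "s=-",
    "t=0 0",
    "a=group:BUNDLE 0",
    `m=application ${peer.port} UDP/DTLS/SCTP webrtc-datachannel`,
    `c=IN IP4 ${peer.ip}`,
    "a=mid:0",
    `a=ice-ufrag:${peer.ufrag}`,
    `a=ice-pwd:${peer.pwd}`,
    `a=fingerprint:sha-256 ${peer.fingerprint}`,
    `a=setup:${peer.setup}`,
    "a=sctp-port:5000", // assumed SCTP port, commonly used for data channels
    // Single host candidate: foundation 1, component 1, typical host priority.
    `a=candidate:1 1 udp 2130706431 ${peer.ip} ${peer.port} typ host`,
  ];
  // SDP lines are terminated with CRLF.
  return lines.map((line) => line + "\r\n").join("");
}
```

In a browser, the resulting string could then be handed to `RTCPeerConnection.setRemoteDescription({ type: "answer", sdp })` after a local offer has been set, assuming the remote side builds the mirror-image description from the local peer's data.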

These could be exchanged via mDNS in a similar way to the one proposed here.
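
One way to picture such an exchange is as DNS-SD style TXT entries of the form `key=value`, each limited to 255 bytes. The sketch below is only an assumption about how that might look: the key names (`ufrag`, `pwd`, `fp`) and the `encodeTxt` / `decodeTxt` helpers are hypothetical, and values are assumed to be ASCII so that string length equals byte length.

```typescript
// Hypothetical key/value view of a TXT record carrying connection data.
type TxtRecord = { [key: string]: string };

// Encode a record as "key=value" entries, rejecting entries over 255 bytes.
// Assumes ASCII keys and values, so character count equals byte count.
function encodeTxt(record: TxtRecord): string[] {
  return Object.keys(record).map((key) => {
    const entry = `${key}=${record[key]}`;
    if (entry.length > 255) {
      throw new Error(`TXT entry for "${key}" exceeds 255 bytes`);
    }
    return entry;
  });
}

// Decode "key=value" entries back into a record.
// Entries without a key or without "=" are skipped.
function decodeTxt(entries: string[]): TxtRecord {
  const record: TxtRecord = {};
  for (const entry of entries) {
    const separator = entry.indexOf("=");
    if (separator <= 0) continue;
    record[entry.slice(0, separator)] = entry.slice(separator + 1);
  }
  return record;
}
```

The IP address and port would normally come from the accompanying A/AAAA and SRV records rather than from the TXT entries, so only the ICE credentials and fingerprint are shown as keys here.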

WebRTC allows DTLS certificates to be persisted, which could obviate the need for the user to enter a PIN on subsequent use.

backkem commented 1 month ago

Thanks for bringing this up. It has certainly been our goal to keep the stack minimal and as close as possible to existing APIs/protocols. Avoiding unnecessary PIN entry is certainly something we want to explore, for example by using different authentication methods/helpers or by introducing peer persistence.

I tried to capture some of my thoughts on the subject below. This is not meant as a conclusion, just to move the conversation ahead.

Notes on ICE in this setting:

To establish an ICE connection, both peers need to know each other's ICE credentials. If only one side advertises its credentials, it's not obvious how it obtains the remote credentials. I guess one way to do that would be for both sides to advertise via mDNS simultaneously. I haven't reasoned through this fully yet, but I expect it would have side effects, since the credentials are basically public this way.
The candidate pairing logic which lies at the core of ICE and allows it to do NAT traversal has little effect in this setting.

Another option would be to "skip" ICE but make the peer authentication more independent of the transport protocol. For example, allowing peer authentication to happen over the existing transport (UDP/DTLS/SCTP) instead of using a more integrated protocol such as OpenScreen Protocol. This is feasible, but there will likely always be some transport-specific bits that need to be spec'd out.

Notes on OpenScreen Protocol in this setting:

Note that we mean a slimmed down version of OSP here, specifically the discovery & authentication phases.
The advantage I saw/see in using OpenScreen Protocol is that it was designed for the exact purpose of local peer authentication.
It is meant as the open protocol stack to implement the Presentation API and Remote Playback API.
The OSP may pursue compatibility with the Matter protocol, see w3c/openscreenprotocol#308. It seems compelling to be able to use an existing Matter fabric to bootstrap local connections and fit in more closely with IoT networks overall.

backkem commented 1 month ago

I was thinking more on how to keep the spec even closer to the existing WebRTC stack, the following came to mind:

Assuming OSP is used for discovery & authentication, this could be used to initiate a WebRTC connection more directly. Using the ORTC API for illustration purposes, you could have a new constructor similar to what we currently define for LP2PQuicTransport but directly on RTCDtlsTransport:

partial interface RTCDtlsTransport : RTCStatsProvider {
  constructor((LP2PRequest or LP2PReceiver) source);
};

On the protocol level this would create a DTLS connection with certificates provided by OSP authentication (potentially using child certificates as mentioned in #34). On top of this, one can run the rest of the WebRTC protocol stack (SRTP / SCTP) as-is. This may also be a nice way to gain SRTP Media on LAN.

The same can be done for RTCQuicTransport. This is basically equivalent to LP2PQuicTransport but avoids introducing separate API surface:

partial interface RTCQuicTransport : WebTransport {
  constructor((LP2PRequest or LP2PReceiver) source,
              optional LP2PQuicTransportListenerInit quicTransportListenerDict = {});
};

This should also be sufficient to enable RoQ and MoQ.

For the RTCPeerConnection API the story is a bit more complex. We'd likely have to define a way to exchange SDP (signaling) since creating any Track or DataChannel seems to require it. Even the in-protocol methods such as addTrack and createDataChannel seem to require signaling on first use. It may be possible to achieve this by sneaking the SDP over OSP. However, I have to admit that does seem somewhat janky.

One downside to all this I can see is that it moves somewhat away from supporting different connection media as discussed in #47. Maybe it's possible to keep LP2PQuicTransport but define it as a shim for RTCQuicTransport for the LAN use-case.

WICG / local-peer-to-peer

Full WebRTC signalling is not necessary #48