w3c / webrtc-encoded-transform

WebRTC Encoded Transform
https://w3c.github.io/webrtc-encoded-transform/

Integration of SFrameTransform and MLS #114

Open youennf opened 3 years ago

youennf commented 3 years ago

Richard mentioned the potential issue of manual keying and how to do MLS integration. We should investigate this.

youennf commented 3 years ago

My basic idea was something like a native MLS module that would generate non-extractable crypto keys. These keys could potentially have a dedicated 'sframe' usage (plus either encrypt or decrypt) so that they can only be used with an SFrameTransform, exclusively for sending or exclusively for receiving.

bifurcation commented 3 years ago

For context, the MLS integration with SFrame replaces manual per-stream keying with a more automated scheme that derives per-sender keys from a single key exported from MLS. It also provides new keys to SFrame when the MLS epoch changes (e.g., because of a join or leave).

That means that a web app that wants to do MLS+SFrame needs the following parts:

  1. Export a key from the MLS context
  2. Derive SFrame keys according to the MLS+SFrame spec
  3. Configure SFrame senders/receivers

The current SFrameTransform does (3); presumably a future WebMLS thing would do (1). The question is where (2) resides, and in particular, whether it needs to be in SFrameTransform in order to keep the keys isolated from JS (our ultimate goal here), or whether it can be done in JS.
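To make the split between the three parts concrete, here is a rough sketch of how they might be wired together. Everything in it is an assumption for illustration: `MlsExporter`, `SFrameKeyDeriver`, `SFrameKeySink`, `rekeySender`, and the `"example-label"` string are hypothetical names, not part of any existing API or spec. `SFrameKeySink` is only loosely modelled on the `setEncryptionKey` method of SFrameTransform, and using the sender index directly as the key ID is a simplification.

```typescript
// Hypothetical stand-ins for the three parts; none of these are real APIs.

// Part (1): something a future "WebMLS" context might offer.
interface MlsExporter {
  exportKey(label: string): Promise<CryptoKey>;
}

// Part (2): the derivation step whose home (JS or browser) is the open question.
type SFrameKeyDeriver = (
  epochKey: CryptoKey,
  senderIndex: number,
) => Promise<CryptoKey>;

// Part (3): loosely shaped like SFrameTransform.setEncryptionKey.
interface SFrameKeySink {
  setEncryptionKey(key: CryptoKey, keyID?: number): Promise<void>;
}

// Runs (1) -> (2) -> (3) once for a single sender.
async function rekeySender(
  mls: MlsExporter,
  derive: SFrameKeyDeriver,
  sink: SFrameKeySink,
  senderIndex: number,
): Promise<void> {
  const epochKey = await mls.exportKey("example-label"); // (1) hypothetical label
  const senderKey = await derive(epochKey, senderIndex); // (2)
  await sink.setEncryptionKey(senderKey, senderIndex);   // (3) simplified key ID
}
```

In this shape, an app would presumably call `rekeySender` again whenever the MLS epoch changes. If step (2) moved into the browser, `derive` would disappear from the JS-visible surface and the MLS context would be connected to the sink directly.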

To be doable in JS while keeping the keys isolated from JS, all of the CryptoKeys involved would need to have extractable = false, including the keys exported from MLS, the keys imported to SFrame, and any intermediate values. I haven't worked through all the details, but IIRC, WebCrypto supports enough derivation functions from non-extractable to non-extractable that this could probably be made to work.
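As a minimal sketch of what a non-extractable-to-non-extractable derivation could look like with today's WebCrypto, assuming a runtime that exposes it as `globalThis.crypto` (browsers, recent Node.js): the helper names `importBaseKey` and `deriveSenderKey`, the `"example-sframe-key"` info label, and the choice of HKDF-SHA-256 into AES-GCM-128 are all assumptions for illustration and are not taken from the MLS+SFrame spec. Note also that the sketch imports raw bytes only so that it can run; in the scenario discussed here the base key would arrive from MLS already as a non-extractable CryptoKey.

```typescript
// Assumption: WebCrypto is available as globalThis.crypto.
const subtle = globalThis.crypto.subtle;

// Stand-in for "a key exported from MLS". WebCrypto requires HKDF keys to be
// imported with extractable = false.
async function importBaseKey(exportedSecret: BufferSource): Promise<CryptoKey> {
  return subtle.importKey("raw", exportedSecret, "HKDF", false, ["deriveKey"]);
}

// Derives a per-sender AES-GCM key that JS can use but cannot read.
async function deriveSenderKey(
  baseKey: CryptoKey,
  senderIndex: number,
): Promise<CryptoKey> {
  // Hypothetical info label; the real derivation is defined by the spec.
  const info = new TextEncoder().encode(`example-sframe-key ${senderIndex}`);
  return subtle.deriveKey(
    { name: "HKDF", hash: "SHA-256", salt: new Uint8Array(0), info },
    baseKey,
    { name: "AES-GCM", length: 128 },
    false, // extractable = false on the derived key
    ["encrypt", "decrypt"],
  );
}
```

Calling `exportKey` on the result is rejected, so neither the base key nor the derived key can be read back by script.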

The unfortunate thing is that WebCrypto also allows you to derive an extractable value from a non-extractable one, and vice-versa. So if the browser were going to try to assure that the SFrame keys were never exposed to JS, it would need to track the whole history of a key to make sure that none of its antecedents were available to JS. This might be a useful thing to add to WebCrypto, and not super complicated, but it doesn't exist today.
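The gap can be illustrated with a short sketch, again assuming WebCrypto is available as `globalThis.crypto`. The function name `deriveLeakyKey` and the HKDF/AES-GCM parameters are illustrative assumptions. The only difference from a safe derivation is the `true` passed as the extractable flag, and nothing about the non-extractable parent key prevents it.

```typescript
// Derives an extractable key from a non-extractable HKDF key, then exports it.
async function deriveLeakyKey(secret: BufferSource): Promise<ArrayBuffer> {
  const subtle = globalThis.crypto.subtle;

  // The parent key is non-extractable (HKDF keys always are in WebCrypto).
  const base = await subtle.importKey("raw", secret, "HKDF", false, ["deriveKey"]);

  const derived = await subtle.deriveKey(
    {
      name: "HKDF",
      hash: "SHA-256",
      salt: new Uint8Array(0),
      info: new Uint8Array(0),
    },
    base,
    { name: "AES-GCM", length: 128 },
    true, // extractable = true is accepted even though `base` is not extractable
    ["encrypt", "decrypt"],
  );

  // The derived key material is now readable from script.
  return subtle.exportKey("raw", derived);
}
```

A browser that wanted to guarantee isolation would have to reject or flag this kind of call for any key descended from the MLS export, which is the history tracking described above.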

So there's still a case to be made for doing the key derivation within the browser, inside one API context or another. In addition to the above security concerns, it's more ergonomic. If we had some sort of API where you connect an MLS context and an SFrame context, we could put the key derivation logic in one of those contexts.

youennf commented 3 years ago

> So there's still a case to be made for doing the key derivation within the browser, inside one API context or another. In addition to the above security concerns, it's more ergonomic. If we had some sort of API where you connect an MLS context and an SFrame context, we could put the key derivation logic in one of those contexts.

Agreed. Key derivation in JS is okay, but it would be preferable to support a more direct setup. This could be done by extending the current SFrameTransform API, or through a yet-to-be-defined MLS API that handles native SFrame keys.

aboba commented 3 years ago

Since the goal of SFrame is transport independence, it seems like MLS should be supported as a media encryption scheme, not just as a key management scheme within SFrameTransform. Otherwise, we will be in a position where SFrame key management is transport-specific, and SFrames aren't supported within media APIs such as WebCodecs.

bifurcation commented 3 years ago

I feel like I might be missing some nuance here, but I think I disagree with @aboba here. MLS doesn't have anything to say about how media is encrypted. It just provides keys that are applied by some media encryption scheme such as SFrame.

That said, I'll grant that there are some interesting API questions about how exactly we should do the integration. Since MLS is useful outside of WebRTC, it seems not unlikely that the instantiation of MLS will be in some WebMLS thing that stands alone from WebRTC. So you'll get a WebMLS context/group that can provide keys, and the question is how you attach that thing to the WebRTC context: Do you attach it to an SFrameTransform so that it provides keys to that transform? Do you attach it to something higher-level like a PeerConnection so that it can provide keys to whatever E2E security thing plugs into that? ISTM that the former would be cleaner, but it does seem like there are a few options.