w3c / webrtc-extensions

A repository for experimental additions to the WebRTC API
https://w3c.github.io/webrtc-extensions/

Peer Connection and back/forward cache #200

Open aboba opened 4 years ago

aboba commented 4 years ago

Moved from https://github.com/w3c/webrtc-pc/issues/2346

Current browsers do not add pages that have a live (or connected) peer connection to the b/f cache. It would be interesting to enable this. One possibility is for these connections to simulate connection failure so that, should the page be shown from b/f cache, the application logic would try to restart a connection.

A 'b/f cache', or 'page cache', is used when a user is on Page1, navigates to Page2, and then clicks the Back button to return to Page1. Some browsers will reuse the exact same page to quickly render Page1 in the state it was in. Page1 will typically receive a pageshow event when it is rendered from the page cache. The cache can also be used when the user clicks the Forward button.

jesup commented 4 years ago

Note that typically any page with an active http(s) connection will be blocked from the bfcache, as will any page with an unload/beforeunload handler (and other things too...). Chrome is experimenting with a bfcache implementation as well now. A major downside would be that we would freeze the implementation (so all the packets); freeze capture (and then restore it - would it need to be reprompted? Note that the page could have been frozen hours or days ago...), any http connections or websockets would have to be severed (which means that on pageshow the page would need to restore them somehow), etc. Also, the other side might not get any notification of call shutdown - it would look like a machine turned off/network break/etc (unless you watch pagehide and do something there). Lots of complexity and ways pages could fail to restore properly; security concerns (capture restart - likely we'd need to reprompt), etc. Compare that to now, where the page is not cached and on back/forward it reloads, which generally will attempt to restart a call (if the URL includes something like a room id/etc). Yes, reloading can take longer, but specifying how to deal with the above (and debugging it as a page developer) probably is not worth it. Happy to consider opposing positions, however - this is all off the cuff, based on having implemented the bfcache blocking and having been in that code a few times for perf work (and having looked at relaxing some of the other strictures around it).
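For illustration, the "watch pagehide and do something there" idea might look roughly like the sketch below. It assumes the page has some signaling channel of its own: `sendBye` is a hypothetical app callback, not part of any API, and `target` would be `window` in a browser (it is a parameter only so the logic can be exercised elsewhere).

```javascript
// Sketch: tell the remote side the call is ending when the page is hidden,
// so it does not look like a silent network break.
// `sendBye` is a hypothetical, app-specific signaling callback.
function closeOnPageHide(target, pc, sendBye) {
  target.addEventListener("pagehide", (event) => {
    // event.persisted is true when the page may be placed in the b/f cache.
    sendBye({ mayBeCached: Boolean(event.persisted) });
    // Close cleanly instead of leaving the peer to time out.
    pc.close();
  });
}

// In a page (assumed names): closeOnPageHide(window, pc, msg => signaling.send(msg));
```

Whether closing on pagehide is desirable depends on the application; a page that wants to resume the call after a restore would have to renegotiate from scratch.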

youennf commented 4 years ago

FWIW, this is now done in Safari Tech Preview. A good b/f cache is a net user experience gain. A lot of pages create peer connections and never close them, or create them only for the purpose of stealing IP addresses.

A major downside would be that we would freeze the implementation (so all the packets)

That is how it is done in Safari. We basically simulate a network failure. This is something websites should be prepared for, at least in theory.

We have done similarly for web socket connections as well as active HTTP exchanges.

freeze capture (and then restore it - would it need to be reprompted? Note that the page could have been frozen hours or days ago...),

There is a separate issue for getUserMedia, but we basically simulate a capture failure in Safari, so the track gets ended and the application needs to call getUserMedia again if needed. Again, this is something that may happen in practice (a USB webcam gets disconnected, for instance), and websites should handle it in theory. This is probably not done appropriately everywhere.
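A page that wants to cope with an ended track could do something like the following. This is only a sketch: `watchCaptureLoss`, `onCaptureLost`, and `rejoinButton` are hypothetical names, and whether to re-acquire automatically or wait for the user is left to the application.

```javascript
// Sketch: report when any track of a capture stream ends unexpectedly.
// `onCaptureLost` is a hypothetical app callback (e.g. show a "rejoin" button).
function watchCaptureLoss(stream, onCaptureLost) {
  for (const track of stream.getTracks()) {
    // "ended" fires when the source stops (device unplugged, capture failure),
    // not when the page itself calls track.stop().
    track.addEventListener("ended", () => onCaptureLost(track), { once: true });
  }
}

// In a page (assumed names):
//   const stream = await navigator.mediaDevices.getUserMedia({ video: true });
//   watchCaptureLoss(stream, () => { rejoinButton.hidden = false; });
```

Calling getUserMedia again only from the button's click handler keeps the camera off until the user asks for it.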

Lots of complexity and ways pages could fail to restore properly;

That is why it would be good to have consistency between implementations. That would decrease the complexity and might allow some libraries to deal with this uniformly.

My recommendation for pages that do not want to handle restoration would be to manually reload the page or the frame if loaded from b/f cache.
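A minimal sketch of that opt-out, assuming the standard `persisted` flag on the pageshow event; `reloadIfRestored` is just an illustrative name, and `win` would be `window` (or a frame's window) in practice.

```javascript
// Sketch: opt out of restoration by reloading whenever the page is shown
// from the b/f cache. `win` is normally `window`.
function reloadIfRestored(win) {
  win.addEventListener("pageshow", (event) => {
    // `persisted` is true only when the page came back from the b/f cache.
    if (event.persisted) win.location.reload();
  });
}

// In a page: reloadIfRestored(window);
```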

youennf commented 8 months ago

This was discussed in the WebRTC interim today. Maybe it should be moved to webrtc-extensions?

aboba commented 8 months ago

@youennf Moving it to WebRTC-Extensions probably makes sense.

docfaraday commented 8 months ago

When it comes to what a bfcached peerconnection being revived looks like, I think the key thing to be thinking about is whether there is any other endpoint that is likely to have state lying around that corresponds to that peerconnection. Is there anything else that remembers the offer/answer state, for example? Does anything remember its own DTLS certificate? Does anything remember what the stream ids mean? Does anything remember the RTP mid -> ssrc mappings it discovered?

I think the answer to most of these is "probably not". Now, maybe we could keep the bfcached peerconnection just alive enough to do ICE consent checks (perhaps for a limited time?). If those are still working, there's a pretty good chance we can pick up where we left off. If not, that likely means that the other end is gone and forgotten. I don't know if I like the idea of continuing to send any packets when we navigate away, though.

As for whether we disable bfcache outright, as long as we do something sensible with the cached peerconnections, bfcaching is probably fine.

Personally, I would be very irritated if hitting back or reopening a tab turned my camera/microphone on and dumped me back into a meeting. I think many other users would feel the same way.

jan-ivar commented 8 months ago

Does Safari have data on how well this works in practice on video conferencing webpages?

It seems to me the main gain here is the roughly 5.35% (5.65 - 0.3) of webpages that only use WebRTC for fingerprinting. E.g. only blocking BFCache after sRD could have a large impact without disrupting video conferencing.

Personally, I would be very irritated if hitting back or reopening a tab turned my camera/microphone on and dumped me back into a meeting. I think many other users would feel the same way.

Websites can already do that (thanks to persisted permissions or even permission grace periods), so I think that is orthogonal. We messed up gUM, which doesn't require transient activation, but it's probably too late to fix that. UAs might be able to mitigate this with muting (idea in https://github.com/w3c/mediacapture-extensions/issues/39#issuecomment-955131893).

fippo commented 8 months ago

Use https://webrtchacks.github.io/chromestatus/?buckets=3451,3452 for easier comparison. But yes, excluding unconnected peerconnections should help. IIRC those already receive special treatment in Chrome, I think @guidou knows how/where

jan-ivar commented 7 months ago

That is how it is done in Safari. We basically simulate a network failure.

When I test in Safari with https://jan-ivar.github.io/dummy/pc_bfcache.html which has 3 buttons Start, sLD and sRD, which together complete a local-loop connection, I observe the following:

  1. pressing Start, sLD, sRD, , ... seems to recover (showing 2 videos) after ~2-5 seconds
  2. pressing Start, sLD, , , sRD ... seems to recover (showing 2 videos) after ~14 seconds

This is cool! It recovers thanks to this code:

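pc1.oniceconnectionstatechange = async () => {
  switch (pc1.iceConnectionState) {
    // Both states are treated the same way: restart ICE and renegotiate.
    case "disconnected":
    case "failed":
      pc1.restartIce(); // returns nothing; only flags that the next offer should restart ICE
      // Re-run the test page's sLD and sRD button handlers to redo offer/answer.
      await sLD.onclick();
      await sRD.onclick();
      break;
  }
};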

Of course, this being a local loop, it doesn't address whether signaling over e.g. WebSocket would recover, nor the points raised in https://github.com/w3c/webrtc-extensions/issues/200#issuecomment-1961861356.

This is something websites should be prepared for, at least in theory.

In 1 yes, but in 2, which is during initial signaling (admittedly a narrow gap), "disconnected" never fires because "connected" never fired, and "failed" only fires if sRD comes in, and it is unclear whether it would in practice.

Webpages can of course listen to pageshow to detect this situation, or detect stalled signaling, but this seems a higher bar than quoted.

Is this fine?
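For reference, the pageshow-based detection mentioned above could be sketched like this. It is an assumption-laden illustration, not what the test page does: `recoverOnRestore` and `renegotiate` are hypothetical names, and treating every state other than "connected"/"completed" as needing recovery is a simplification.

```javascript
// ICE states in which the connection is considered up.
const HEALTHY = new Set(["connected", "completed"]);

// Sketch: after a b/f cache restore, renegotiate if ICE never got going.
// `renegotiate` is a hypothetical app callback (e.g. restartIce + new offer/answer).
function recoverOnRestore(target, pc, renegotiate) {
  target.addEventListener("pageshow", (event) => {
    if (!event.persisted) return;                    // normal load, nothing to recover
    if (pc.iceConnectionState === "closed") return;  // nothing left to restart
    if (HEALTHY.has(pc.iceConnectionState)) return;  // state-change handler covers later failures
    // Covers scenario 2: stuck in "new"/"checking", so neither
    // "disconnected" nor "failed" may ever fire.
    renegotiate(pc);
  });
}

// In a page (assumed names): recoverOnRestore(window, pc1, restartCall);
```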