w3c / ortc

ORTC Community Group specification repository (see W3C WebRTC for official standards track)
http://www.w3.org/community/ortc/

More responsive mobile ICE failover and Wifi/3G/4G switch-back #176

Open robin-raymond opened 9 years ago

robin-raymond commented 9 years ago

The issue: Wifi and 3G/4G can go up and down like a yo-yo. A mobile device can move in and out of Wifi and 3G/4G range. Switching between the networks seamlessly is highly desirable but difficult in practice, because of the cost of keeping interfaces "hot" and/or firewalls interfering with communications (e.g. pinholes close, or reflexive / relay ports and IPs change).

Scenario A:

  1. IceTransport discovers Wifi and 3G/4G are available but settles on the preferred Wifi candidate pairing.
  2. Wifi stops working. 3G/4G might be a viable option, but those candidates were already pruned within the IceTransport under current ICE rules (even though the other candidates may still be viable).

Scenario B:

  1. IceTransport discovers 3G/4G is available but not Wifi.
  2. Wifi becomes available. The new host (presumably reflexive and relay as well) candidates are transmitted and used. (nice)

This issue is also related to issue #174 .

Problem: For scenario A, the IceTransport must go into a full disconnected state and trigger an ICE restart to resolve the issue. A better solution would be to let the ICE engine know that candidates have NOT expired remotely, so that when trouble "starts" the backup non-expired host, reflexive and possibly relay candidates can be tested. For 3G/4G on mobile IPv4, the reality is that firewalls will close the pinholes, so the reflexive candidates will not remain viable for very long. Mobile IPv6, however, only requires mutual IceTransport connectivity checks to reopen pinholes, which makes it a viable backup when Wifi starts to fail.
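To make the idea concrete, here is a minimal sketch of how an ICE engine might pick backup pairs to probe once the active pair starts missing checks. Everything in it is an assumption for illustration: the `CandidatePair` shape, the `remoteExpired` flag, the `TROUBLE_THRESHOLD` value and the `backupsToProbe` helper are hypothetical and are not part of the ORTC API.

```typescript
// Hypothetical types for illustration only; not ORTC API names.
type PairState = "active" | "backup" | "failed";

interface CandidatePair {
  id: string;
  network: "wifi" | "cellular";
  ipVersion: 4 | 6;
  priority: number;       // higher is preferred
  state: PairState;
  remoteExpired: boolean; // true once the remote side reports the candidate as gone
}

// Assumed threshold: missed connectivity checks before the active pair is "in trouble".
const TROUBLE_THRESHOLD = 2;

// Return the backup pairs worth probing, best priority first.
// Returns an empty list while the active pair still looks healthy.
function backupsToProbe(pairs: CandidatePair[], missedChecks: number): CandidatePair[] {
  if (missedChecks < TROUBLE_THRESHOLD) {
    return [];
  }
  return pairs
    .filter((p) => p.state === "backup" && !p.remoteExpired)
    .sort((a, b) => b.priority - a.priority);
}

const knownPairs: CandidatePair[] = [
  { id: "wifi-host", network: "wifi", ipVersion: 4, priority: 300, state: "active", remoteExpired: false },
  { id: "cell-v6-host", network: "cellular", ipVersion: 6, priority: 200, state: "backup", remoteExpired: false },
  { id: "cell-v4-srflx", network: "cellular", ipVersion: 4, priority: 100, state: "backup", remoteExpired: true },
];

// Wifi has missed three checks, so the non-expired cellular IPv6 pair is probed.
const probeOrder = backupsToProbe(knownPairs, 3).map((p) => p.id);
```

The point of the sketch is only that probing starts before the transport reaches a full "disconnected" state, and that expired candidates (such as a mobile IPv4 reflexive candidate whose pinhole has closed) are skipped.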

For scenario B, this works nicely, but the reflexive and relay candidates are likely to be required for Wifi to work again. If the IceGatherer is not aware that it needs to power the Wifi back up and re-discover reflexive / relay candidates, then switching back to Wifi will likely fail too, especially if the IceGatherer is in full "prune unused candidates" operating mode and the IceTransport has already settled into the "completed" state.

Solutions:

  (a) Before a full "disconnect" of an IceTransport, start mutually testing backup candidates that are still viable. This would give 3G/4G IPv6 failover a reasonable chance of success without a full restart (and a faster response than waiting for a full disconnect, where the application layer needs to be involved).
  (b) When interfaces come back up (e.g. Wifi becomes available), temporarily re-gather the reflexive and TURN candidates if those candidates would have higher priority than the other candidates active within any of the attached IceTransports.
  (c) Do nothing and expect "restart" to solve this issue.

I personally would recommend that both (a) and (b) be done. I think that would allow for nice failover to 3G/4G IPv6 where it is available (NOTE: the mobile industry is now flagging mobile IPv4 as "optional" and IPv6 as mandatory):
http://ipv6.com/articles/mobile/Mobile-IPv6.htm
http://en.wikipedia.org/wiki/IPv6_deployment

Likewise, (b) would allow new interfaces that show up with potentially higher priority to take over when they become available. Without re-gathering the reflexive and possibly the TURN candidates, the switch back to Wifi is unlikely to succeed.
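The priority test in (b) can be sketched with the candidate priority formula and the recommended type preferences from RFC 5245 (Sections 4.1.2.1 and 4.1.2.2). The `shouldRegather` policy function and its parameters are hypothetical, and the local preference values in the example are made up for illustration.

```typescript
type CandidateType = "host" | "prflx" | "srflx" | "relay";

// Type preferences recommended in RFC 5245, Section 4.1.2.2.
const TYPE_PREFERENCE: Record<CandidateType, number> = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// RFC 5245, Section 4.1.2.1:
// priority = 2^24 * (type preference) + 2^8 * (local preference) + (256 - component ID)
function candidatePriority(type: CandidateType, localPreference: number, componentId: number): number {
  return Math.pow(2, 24) * TYPE_PREFERENCE[type] + Math.pow(2, 8) * localPreference + (256 - componentId);
}

// Hypothetical policy: re-gather reflexive / relay candidates on a returning
// interface only if at least one of them would outrank the best candidate
// currently active on any attached transport.
function shouldRegather(returningLocalPreference: number, activePriorities: number[], componentId: number = 1): boolean {
  const bestActive = Math.max(...activePriorities);
  const regatherTypes: CandidateType[] = ["srflx", "relay"];
  return regatherTypes.some(
    (t) => candidatePriority(t, returningLocalPreference, componentId) > bestActive
  );
}

// Wifi (assumed local preference 65535) returns while a cellular reflexive
// candidate (assumed local preference 1000) is in use.
const wifiReturns = shouldRegather(65535, [candidatePriority("srflx", 1000, 1)]);

// Cellular returns while a Wifi host candidate is in use.
const cellularReturns = shouldRegather(1000, [candidatePriority("host", 65535, 1)]);
```

Under these assumed preferences the first call asks for a re-gather and the second does not, which matches the intent of only paying the gathering cost when the new interface could actually win.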

I dislike (c) from the user experience perspective. It would require a full "disconnect" to happen before any reaction takes place. That means not just a short "glitch", after which audio / video resumes once backup candidates are tested, but a full failure where the application layer has to get involved to resolve scenario A. Solution (c) does not allow for the scenario B optimizations either: even though a preferred host candidate might show up, there is no guarantee that the related reflexive or relay candidates will come back after the IceTransport reaches the "completed" state.

[cross posting to ortc mailing list, please respond there]

aboba commented 9 years ago

Robin said:

"Solutions: (a) Before a full "disconnect" of an IceTransport, start testing still viable back-up candidates mutually. This would allow 3G/4G IPv6 failover to have a reasonable chance of success without a full restart (and a faster response time before full disconnect where the application layer needs to be involved). (b) When interfaces come back-up (e.g. Wifi becomes available), temporarily re-gather the reflexive and TURN candidates if those candidates would have higher priority than other candidates active within any of the attached IceTransports. (c) Do nothing and expect "restart" to solve this issue."

[BA] From an API perspective, would Option (a) be a use case for calling iceTransport.start() again with a previously used iceGatherer?
Option (b) seems like it would involve continuous nomination (e.g. iceTransport.state is never "completed").

Some of the required protocol changes are discussed here: https://docs.google.com/document/d/1P1XPCRJKBkSjwCzIIEUJmp7V694_FzJQe-fvN8bk-Xw/edit#heading=h.qf9e6tbxwkrn

steely-glint commented 9 years ago

On Android 4.4 I find that I never see candidates for both Wifi and 3G at the same time: if the Wifi is up, then the 3G is ignored by Chrome (and the rest of the OS?). I suspect the OS does fire a 'network changed' message when switching between them. So on Android I suspect this question is moot. What does iOS do?

robin-raymond commented 9 years ago

On iOS you can get multiple candidates from the OS level, but I'm unsure whether Chromium takes advantage of this feature.

aboba commented 9 years ago

Here is a pointer to a recent draft from Justin Uberti: http://www.ietf.org/id/draft-uberti-mmusic-nombis

aboba commented 9 years ago

Note from https://tools.ietf.org/html/draft-ietf-rtcweb-stun-consent-freshness Section 4.1:

After consent is lost for any reason, the same ICE credentials MUST NOT be used on the affected 5-tuple again. That means that a new session, or an ICE restart, is needed to obtain consent to send.
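A minimal sketch of the bookkeeping this rule implies, assuming the ICE username fragment is enough to identify a set of credentials. The `FiveTuple` shape and the `ConsentLedger` class are hypothetical names used only for illustration.

```typescript
// Hypothetical 5-tuple shape for illustration.
interface FiveTuple {
  protocol: "udp" | "tcp";
  localIp: string;
  localPort: number;
  remoteIp: string;
  remotePort: number;
}

// Remembers which (5-tuple, ICE credentials) combinations have lost consent.
// Assumption: the username fragment identifies the credentials.
class ConsentLedger {
  private revoked = new Set<string>();

  private key(tuple: FiveTuple, ufrag: string): string {
    return [tuple.protocol, tuple.localIp, tuple.localPort, tuple.remoteIp, tuple.remotePort, ufrag].join("|");
  }

  // Record that consent was lost on this 5-tuple under these credentials.
  consentLost(tuple: FiveTuple, ufrag: string): void {
    this.revoked.add(this.key(tuple, ufrag));
  }

  // The same credentials must not be reused on the affected 5-tuple.
  mayUse(tuple: FiveTuple, ufrag: string): boolean {
    return !this.revoked.has(this.key(tuple, ufrag));
  }
}

const ledger = new ConsentLedger();
const wifiTuple: FiveTuple = {
  protocol: "udp",
  localIp: "192.0.2.10",
  localPort: 50000,
  remoteIp: "198.51.100.7",
  remotePort: 60000,
};

ledger.consentLost(wifiTuple, "oldUfrag");
const reuseOldCredentials = ledger.mayUse(wifiTuple, "oldUfrag"); // false
const useAfterRestart = ledger.mayUse(wifiTuple, "newUfrag");     // true
```

This is why failing over to a backup pair has to happen before consent expires on it: once consent is lost, only new credentials (a new session or an ICE restart) make that 5-tuple usable again.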

aboba commented 8 years ago

Some relevant drafts:
https://tools.ietf.org/html/draft-thatcher-ice-remove-candidate
https://tools.ietf.org/html/draft-thatcher-ice-renomination
https://tools.ietf.org/html/draft-thatcher-ice-network-cost

aboba commented 8 years ago

The application needs a control to opt in to backup candidate pairs, and we need renomination.
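One possible shape for such a control, purely as a sketch: the `keepBackupPairs` option and the `PairSelector` class are hypothetical and do not exist in ORTC. Renomination is modelled here as a simple increasing counter where the latest nomination wins; that is an assumption made for illustration, not a transcription of the renomination draft.

```typescript
// Hypothetical opt-in control; not an ORTC option.
interface TransportOptions {
  keepBackupPairs: boolean;
}

class PairSelector {
  private readonly options: TransportOptions;
  private nominationCounter = 0;
  private selected: string | null = null;
  private backups = new Set<string>();

  constructor(options: TransportOptions) {
    this.options = options;
  }

  // Nominate (or renominate) a pair and return the nomination value used.
  // The previously selected pair is kept as a backup only if the application opted in.
  nominate(pairId: string): number {
    if (this.selected !== null && this.selected !== pairId && this.options.keepBackupPairs) {
      this.backups.add(this.selected);
    }
    this.backups.delete(pairId);
    this.selected = pairId;
    this.nominationCounter += 1;
    return this.nominationCounter;
  }

  get selectedPair(): string | null {
    return this.selected;
  }

  get backupPairs(): string[] {
    return Array.from(this.backups);
  }
}

const optedIn = new PairSelector({ keepBackupPairs: true });
optedIn.nominate("wifi");
const secondNomination = optedIn.nominate("cell-v6"); // Wifi fails, switch to cellular IPv6

const optedOut = new PairSelector({ keepBackupPairs: false });
optedOut.nominate("wifi");
optedOut.nominate("cell-v6");
```

With the opt-in set, the Wifi pair stays available as a backup for switch-back; without it, the old pair is dropped as it would be today.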