nextcloud / spreed

🗨️ Nextcloud Talk – chat, video & audio calls for Nextcloud
https://nextcloud.com/talk
GNU Affero General Public License v3.0
1.61k stars 429 forks source link

Random selection of TURN servers may result in failed connections #5267

Closed plinss closed 3 years ago

plinss commented 3 years ago

Steps to reproduce

  1. Specify multiple TURN servers, one of which uses turns: on port 443
  2. Have a client behind a firewall that restricts UDP and the normal TURN ports, but leaves 443 open for TLS.
  3. Try to initiate a session.

Expected behaviour

Audio/video calls should work.

Actual behaviour

The server selects one of the TURN servers at random and passes it to the client. If the client wasn't luck enough to get the turns: server on 443, their connection will fail.

Talk app

Talk app version: 11 Custom Signaling server configured: no

Custom TURN server configured: yes

Custom STUN server configured: yes/no (see additional admin settings: /index.php/settings/admin/talk#stun_server)

Browser

Microphone available: yes

Camera available: yes

Operating system: all

Browser name: all

Browser version: a/n

See discussion at #257 and #4715

nickvergessen commented 3 years ago

If you update to Nextcloud 21 and Talk 11 you can select turns vs turn

plinss commented 3 years ago

@nickvergessen I'm aware, this actually relies on it (I'll edit the Talk version momentarily). The issue is when there are both turns: and turn: servers specified. The server will select one at random and the turn: server may not work due to firewall issues.

brknkfr commented 3 years ago

I was confronting the same problem. How does nextcloud actually select the turn servers? Is this "random" or does nextcloud talk make some tests to find a "working" turn server if there are multiple turn servers specified?

nickvergessen commented 3 years ago

https://github.com/nextcloud/spreed/blob/20f2c3186b7cc58cd0327a49f1cb547ba579d3e4/lib/Config.php#L271-L276

brknkfr commented 3 years ago

Is there a reason that only one turn server is selected? Or is there a way to provide maybe multiple turn servers to the client (browser, nextcloud talk app), so that the client has to choose a working one?

Afaik, Jitsi checks somehow on the client side if creating udp or tcp connections works. When only tcp connections are working, Jitsi provides multiple tcp turnservers (if specified so) and no udp turnservers and the other way round. Then it's up to the client what turnserver to use. I think this is handled differently in different browsers (For example Firefox isn't very happy about more than three turnservers).

nickvergessen commented 3 years ago

https://github.com/nextcloud/spreed/pull/427/files#diff-a9facf8547f41c9326235cee2e92dcfeed102a5eb138475b054fe1f13028e80aR83

because back in 2017 I didn't know that multiple turns on the client would work after seeing the firefox error and after that no one made a PR to fix it.

plinss commented 3 years ago

For the record, I did submit a PR that fixes it, #4715.

That PR was primarily about adding support for turns, but changing the random selection to a set of servers passed to the client was part of the fix to make turns support actually useful (and would have fixed this issue).

Instead, a previously unsubmitted PR, taking a different approach to adding turns support was merged, #5086, leaving this issue unresolved.

As part of the discussion around my PR, there was a desire to keep the current random selection behavior for load balancing purposes, and a desire to change the turn server configuration into a list of sets of turn servers (or a set of lists depending on how the grouping is arranged). I don't have the bandwidth to make that change as it also involves a fair bit of UI work.

I personally don't agree with that approach either. It adds complexity to the UI, would make a (likely) incompatible change to the way the configuration is stored (breaking automations that manage Nextcloud configs, like ones I use), and serves little purpose IMO. Anyone needing to load balance their turn servers can simply put the turn servers behind a load balancer (which would also offer more control of the load balancing behavior). The current UI also leaves the impression that the admin is already defining a set of turn servers, not a list of turn servers to select from for load balancing purposes.

If there's agreement with that thinking, taking the part of my previous PR that removes the random selection and presents the full set of turn servers to the client is relatively simple, is something I could do in a few minutes, and would be happy to take care of.

brknkfr commented 3 years ago

I'm quite a newbie concerning the development of a nextcloud app, but I tried to resolve this issue with https://github.com/nextcloud/spreed/pull/5491.

With this patch, you would get at least a working call session in very restricted networks (only outgoing tls traffic on port 443/tcp) when you provide at least three turnservers like turn:turn.example.tld?transport=udp, turn:turn.example.tld?transport=tcp and turns:turn.example.tld?transport=tcp (this one will work).

Unfortunately I don't know really how to generate "successful" test for the php stuff. Any help would be appreciated.