Kurento / bugtracker

[ARCHIVED] Contents migrated to monorepo: https://github.com/Kurento/kurento
KMS not releasing TURN relay ports #494

Open RaistGH opened 4 years ago

RaistGH commented 4 years ago

This bug is already fixed in version 6.14, but judging by the release notes, I think it went unnoticed. I'd like to report it here so that it can be kept in mind and avoided in future releases, and to help people facing the same issue.

Issue description

After running for some time, and (apparently) at random, the TURN connection stops working and no more connections can be established. The browser shows the error message "ICE failed, add a TURN server and see about:webrtc for more details" or "ICE failed, your TURN server appears to be broken, see about:webrtc for more details". Restarting the TURN server doesn't solve the issue; only restarting KMS does. Then, after some time, the issue appears again.

How to reproduce?

This happens in KMS 6.13.1 and 6.13.2, but not in 6.13.0 or 6.14.0.

Instructions using the Docker container:

  1. Use an environment where a TURN server is needed.
  2. Reduce the number of relay ports to a small number, such as 4 (in BaseRtpEndpoint.conf.ini).
  3. Use your browser to establish a connection through KMS as you normally do, then stop it.
  4. Do this 5 times. The first 4 attempts will work, and the 5th will fail. (NOTE: I think this depends on your TURN connection. Mine was consuming 2 ports per connection, so it failed on the 3rd.)

In order to confirm the issue:

  1. Enter the Docker container. On the server where the KMS container is deployed, run (where <CONTAINER_NAME> is the name of the container): docker exec -it <CONTAINER_NAME> /bin/bash
  2. Install network tools in the container: apt-get update && apt-get install net-tools
  3. List the network connections. The relay ports stay open permanently (in this case, the relay binding ports are 4000x):
    root@2e345056a015:/# netstat -tupan
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
    tcp        0      0 172.17.0.2:40000        0.0.0.0:*               LISTEN      1/kurento-media-ser
    tcp        0      0 172.17.0.2:40001        0.0.0.0:*               LISTEN      1/kurento-media-ser
    tcp        0      0 172.17.0.2:40002        0.0.0.0:*               LISTEN      1/kurento-media-ser
    tcp        0      0 172.17.0.2:40003        0.0.0.0:*               LISTEN      1/kurento-media-ser
    tcp        0      0 127.0.0.1:55680         127.0.0.1:8443          TIME_WAIT   -
    tcp6       0      0 :::8888                 :::*                    LISTEN      1/kurento-media-ser
    tcp6       0      0 :::8443                 :::*                    LISTEN      1/kurento-media-ser
    tcp6       0      0 127.0.0.1:8443          127.0.0.1:55658         TIME_WAIT   -
    udp     5376      0 172.17.0.2:40000        0.0.0.0:*                           1/kurento-media-ser
    udp        0      0 172.17.0.2:40001        0.0.0.0:*                           1/kurento-media-ser
    udp     5376      0 172.17.0.2:40002        0.0.0.0:*                           1/kurento-media-ser
    udp        0      0 172.17.0.2:40003        0.0.0.0:*                           1/kurento-media-ser
  4. With the TURN server running in verbose mode, it logs the following errors in /var/log/turn__.log (on Linux, errno 98 is EADDRINUSE, "Address already in use"):
    Trying to bind fd 55 to <172.17.0.1:40000>: errno=98
    Trying to bind fd 55 to <172.17.0.1:40001>: errno=98
    Trying to bind fd 55 to <172.17.0.1:40002>: errno=98
    Trying to bind fd 55 to <172.17.0.1:40003>: errno=98
  5. If you install coturn in the container to test the client side, the command turnutils_uclient -v -p 3478 -u <TURN_USER> -w <TURN_PASS> 172.17.0.1 fails with coturn error 508 (Cannot create socket).
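
To make the netstat check in step 3 repeatable, the output can be filtered with a small script. The following is only a sketch: the function name, the default port range (40000 to 40003, matching the 4-port setup used here) and the "kurento" program filter are assumptions for illustration, not part of KMS or coturn.

```python
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 172.17.0.2:40000        0.0.0.0:*               LISTEN      1/kurento-media-ser
tcp6       0      0 :::8888                 :::*                    LISTEN      1/kurento-media-ser
udp     5376      0 172.17.0.2:40000        0.0.0.0:*                           1/kurento-media-ser
udp        0      0 172.17.0.2:40001        0.0.0.0:*                           1/kurento-media-ser
"""


def relay_ports_in_use(netstat_output, program="kurento",
                       port_min=40000, port_max=40003):
    """Map each relay port held by `program` to the protocols bound on it.

    `netstat_output` is the text printed by `netstat -tupan`.
    The port range and program name are assumptions; adjust them to
    the values configured in your own deployment.
    """
    in_use = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a tcp/udp socket line.
        if len(fields) < 6 or not fields[0].startswith(("tcp", "udp")):
            continue
        proto, local, owner = fields[0], fields[3], fields[-1]
        if program not in owner:
            continue
        # The port is whatever follows the last colon (works for IPv6 too).
        port_text = local.rsplit(":", 1)[-1]
        if not port_text.isdigit():
            continue
        port = int(port_text)
        if port_min <= port <= port_max:
            in_use.setdefault(port, set()).add(proto)
    return in_use


if __name__ == "__main__":
    for port, protos in sorted(relay_ports_in_use(SAMPLE).items()):
        print(port, ",".join(sorted(protos)))
```

If the number of ports reported keeps matching the size of the configured range while no sessions are active, the relay ports are not being released.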

Expected & current behavior

KMS should release the ports used by each relay connection once the connection is stopped. However, it doesn't, which blocks the allocation of ports for subsequent connections.
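
The effect of a port that is never released can be illustrated outside of KMS. This is a minimal, generic sketch using plain UDP sockets on localhost (the helper names are hypothetical and this is not KMS or libnice code): a second bind to a held port fails with EADDRINUSE, which is errno 98 on most Linux architectures, and succeeds again once the holder closes its socket.

```python
import errno
import socket


def bind_udp(port, host="127.0.0.1"):
    """Return a UDP socket bound to (host, port), or raise OSError."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError:
        sock.close()
        raise
    return sock


def demo():
    """Return the errno seen when binding to a port that is still held."""
    holder = bind_udp(0)  # port 0 lets the OS pick a free port
    port = holder.getsockname()[1]

    try:
        bind_udp(port).close()
        first_error = None  # unexpected: the port was not exclusive
    except OSError as exc:
        first_error = exc.errno

    # Once the holder releases the port, the same bind succeeds.
    holder.close()
    bind_udp(port).close()
    return first_error


if __name__ == "__main__":
    code = demo()
    print(code, errno.errorcode.get(code))
```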

(Optional) Possible solution

Upgrade KMS to version 6.14.0 or downgrade to version 6.13.0

INFO about Kurento Media Server

INFO about your environment

  - Using Docker version
  - Using Recorder Endpoint
  - Using TURN server (coturn) on the same server as the KMS container

RaistGH commented 4 years ago

This is related to this discussion.

@j1elo, I'm just mentioning you so that you can take this into account, and closing the issue.

j1elo commented 4 years ago

This was probably caused by a regression introduced in libnice at some point after 0.1.16. We updated libnice for one of the 6.13.1 or 6.13.2 releases, and then immediately started receiving reports about file descriptors being exhausted. After some investigation, we were able to find the issue in libnice.

We opened a bug report (https://gitlab.freedesktop.org/libnice/libnice/-/issues/110), which helped to get the problem resolved quickly. libnice fixed this issue in release 0.1.17, which is the version that ships with Kurento 6.14.0.
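
For anyone who wants to watch for this kind of file descriptor exhaustion, a rough sketch follows. It assumes a Linux system with /proc mounted and enough permissions to read the target process's fd directory; the function names are made up for illustration and are not part of Kurento.

```python
import os


def count_open_fds(pid):
    """Count all file descriptors currently open in process `pid`."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def count_sockets(pid):
    """Count the file descriptors of process `pid` that are sockets."""
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed while we were iterating
        # On Linux, socket fds are symlinks of the form "socket:[inode]".
        if target.startswith("socket:"):
            count += 1
    return count


if __name__ == "__main__":
    me = os.getpid()
    print("fds:", count_open_fds(me), "sockets:", count_sockets(me))
```

In the container shown in the netstat output above, KMS runs as PID 1, so calling count_sockets(1) periodically and logging the result would show whether the number of sockets keeps growing between sessions.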

RaistGH commented 2 years ago

It seems the issue is still occurring, though not as frequently, so we haven't been able to catch it yet. Every few months we need to restart KMS because of it. I couldn't take a close look at it until now, but this is definitely what is happening again: all ports are being exhausted. Unfortunately, I cannot reproduce it yet. We're using KMS 6.16, but I guess it was the same in 6.14 and we just didn't notice it fast enough. Do you have any suggestions for logging what's happening, so I can inspect the log and/or paste it here whenever it happens again?