Kurento / bugtracker

[ARCHIVED] Contents migrated to monorepo: https://github.com/Kurento/kurento

kurento is not accepting ws connections after some hours of work (restart is needed) #259

Open darchuk opened 6 years ago

darchuk commented 6 years ago

KMS Version:

kurento-media-server -v
Kurento Media Server version: 6.7.1
Found modules:
    'core' version 6.7.1
    'elements' version 6.7.1
    'filters' version 6.7.1
    'proctorplugin' version 0.0.1-dev

Other libraries versions: We are using https://github.com/Kurento/kms-opencv-plugin-sample and openCV 2.4.13

dpkg -l | egrep -i "kurento|gstreamer|nice"

ii  gir1.2-gst-plugins-base-1.0           1.8.3-1ubuntu0.2                           amd64        GObject introspection data for the GStreamer Plugins Base library
ii  gir1.2-gst-plugins-base-1.5           1.8.1.1.xenial-20170725154709.55.7b19cfd   amd64        Description: GObject introspection data for the GStreamer Plugins Base library
ii  gir1.2-gstreamer-1.0                  1.8.3-1-ubuntu0.1                          amd64        GObject introspection data for the GStreamer library
ii  gir1.2-gstreamer-1.5                  1.8.1.1.xenial-20170725152356.170.0d6031b  amd64        Description: GObject introspection data for the GStreamer library
ii  gstreamer1.0-clutter-3.0              3.0.18-1                                   amd64        Clutter PLugin for GStreamer 1.0
ii  gstreamer1.0-doc                      1.8.3-1-ubuntu0.1                          all          GStreamer core documentation and manuals
ii  gstreamer1.0-plugins-base:amd64       1.8.3-1ubuntu0.2                           amd64        GStreamer plugins from the "base" set
ii  gstreamer1.0-plugins-good:amd64       1.8.3-1ubuntu0.4                           amd64        GStreamer plugins from the "good" set
ii  gstreamer1.0-x:amd64                  1.8.3-1ubuntu0.2                           amd64        GStreamer plugins for X11 and Pango
ii  gstreamer1.5-libav:amd64              1.8.2.1.xenial-20170725171352.96.493eee4   amd64        libav plugin for GStreamer
ii  gstreamer1.5-nice:amd64               0.1.13.1.xenial-20170725160546.81.eebfdab  amd64        ICE library (GStreamer plugin)
ii  gstreamer1.5-plugins-bad:amd64        1.8.1.1.xenial-20170725164047.100.3db37b1  amd64        GStreamer plugins from the "bad" set
ii  gstreamer1.5-plugins-base:amd64       1.8.1.1.xenial-20170725154709.55.7b19cfd   amd64        GStreamer plugins from the "base" set
ii  gstreamer1.5-plugins-good:amd64       1.8.1.1.xenial-20170725161537.112.9ee4248  amd64        GStreamer plugins from the "good" set
ii  gstreamer1.5-plugins-ugly:amd64       1.8.1.1.xenial-20170725170621.89.2685b0f   amd64        GStreamer plugins from the "ugly" set
ii  gstreamer1.5-pulseaudio:amd64         1.8.1.1.xenial-20170725161537.112.9ee4248  amd64        GStreamer plugin for PulseAudio
ii  gstreamer1.5-x:amd64                  1.8.1.1.xenial-20170725154709.55.7b19cfd   amd64        GStreamer plugins for X11 and Pango
ii  kms-cmake-utils                       6.7.1.xenial-20180321182604.1.d04e4ce      all          Kurento CMake utils
ii  kms-core                              6.7.1.xenial.20180321183643.47e0370        amd64        Kurento Core module
rc  kms-core-6.0                          6.7.0.xenial-20171219004723.21.e100fbf     amd64        Kurento core module
ii  kms-core-dev                          6.7.1.xenial.20180321183643.47e0370        amd64        Kurento Core module
ii  kms-elements                          6.7.1.xenial.20180321184406.5aaf4a0        amd64        Kurento Elements module
rc  kms-elements-6.0                      6.7.0.xenial-20171228141917.27.1ffdf34     amd64        Kurento elements module
ii  kms-elements-dev                      6.7.1.xenial.20180321184406.5aaf4a0        amd64        Kurento Elements module
ii  kms-filters                           6.7.1.xenial.20180321185049.7f2d045        amd64        Kurento Filters module
ii  kms-filters-dev                       6.7.1.xenial.20180321185049.7f2d045        amd64        Kurento Filters module
ii  kms-jsonrpc                           6.7.1.xenial-20180321183209.1.637bf2a      amd64        Kurento JSON-RPC library
ii  kms-jsonrpc-dev                       6.7.1.xenial-20180321183209.1.637bf2a      amd64        Kurento JSON-RPC library
ii  kmsjsoncpp                            1.6.3.xenial.20170725151047.d78deb7        amd64        Kurento jsoncpp library
ii  kmsjsoncpp-dev                        1.6.3.xenial.20170725151047.d78deb7        amd64        Kurento jsonrpc library
ii  kurento-media-server                  6.7.1.xenial.20180321185711.dd49a2c        amd64        Kurento Media Server
ii  kurento-media-server-dev              6.7.1.xenial.20180321185711.dd49a2c        amd64        Kurento Media Server
ii  kurento-module-creator                6.7.1.xenial-20180321182858.7.23c2fb8      amd64        Kurento Module Creator
ii  libclutter-gst-3.0-0:amd64            3.0.18-1                                   amd64        Open GL based interactive canvas library GStreamer elements
ii  libgstreamer-plugins-bad1.5-0:amd64   1.8.1.1.xenial-20170725164047.100.3db37b1  amd64        GStreamer development files for libraries from the "bad" set
ii  libgstreamer-plugins-base1.0-0:amd64  1.8.3-1ubuntu0.2                           amd64        GStreamer libraries from the "base" set
ii  libgstreamer-plugins-base1.0-dev      1.8.3-1ubuntu0.2                           amd64        GStreamer development files for libraries from the "base" set
ii  libgstreamer-plugins-base1.5-0:amd64  1.8.1.1.xenial-20170725154709.55.7b19cfd   amd64        GStreamer libraries from the "base" set
ii  libgstreamer-plugins-base1.5-dev      1.8.1.1.xenial-20170725154709.55.7b19cfd   amd64        GStreamer development files for libraries from the "base" set
ii  libgstreamer-plugins-good1.0-0:amd64  1.8.3-1ubuntu0.4                           amd64        GStreamer development files for libraries from the "good" set
ii  libgstreamer1.0-0:amd64               1.8.3-1-ubuntu0.1                          amd64        Core GStreamer libraries and elements
ii  libgstreamer1.0-dev                   1.8.3-1-ubuntu0.1                          amd64        GStreamer core development files
ii  libgstreamer1.5-0:amd64               1.8.1.1.xenial-20170725152356.170.0d6031b  amd64        Core GStreamer libraries and elements
ii  libgstreamer1.5-dev                   1.8.1.1.xenial-20170725152356.170.0d6031b  amd64        GStreamer core development files
ii  libnice-dev                           0.1.13.1.xenial-20170725160546.81.eebfdab  amd64        ICE library (development files)
ii  libnice10:amd64                       0.1.13.1.xenial-20170725160546.81.eebfdab  amd64        ICE library (shared library)
ii  openh264-gst-plugins-bad-1.5:amd64    1.8.1.1.xenial-20170725164047.100.3db37b1  amd64        GStreamer plugins from openh264
ii  openwebrtc-gst-plugins                0.10.0.1.xenial-20170725173413.1.33ccf19   amd64        OpenWebRTC specific GStreamer plugins

Client libraries

Browsers tested

System description: Operating system where the client is running

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial

How is the system deployed? Are KMS and client in the same network? Are you using TURN/STUN?

We get the same result whether the client and KMS are in the same network or in different ones. We are using TURN.

Config file in /etc/kurento/kurento.conf.json

{
  "mediaServer" : {
    "resources": {
    //  //Resources usage limit for raising an exception when an object creation is attempted
    //  "exceptionLimit": "0.8",
    //  // Resources usage limit for restarting the server when no objects are alive
    //  "killLimit": "0.7",
        // Garbage collector period in seconds
        "garbageCollectorPeriod": 240
    },
    "net" : {
      "websocket": {
        "port": 8888,
        //"secure": {
        //  "port": 8433,
        //  "certificate": "defaultCertificate.pem",
        //  "password": ""
        //},
        //"registrar": {
        //  "address": "ws://localhost:9090",
        //  "localAddress": "localhost"
        //},
        "path": "kurento",
        "threads": 10
      }
    }
  }
}

We have this issue in production, where users run different versions of Chrome. Kurento starts up and everything works fine, but some time later Kurento is still running yet can't handle any connection.

We finally found a way to reproduce it. We run automated tests (the first user streams, a second user connects as a viewer, the stream runs for 10 minutes, then the process is stopped, the pipeline released, etc.). After some hours of running these tests we hit the described problem: Kurento is no longer able to accept connections.

What steps will reproduce the problem?

  1. Register user one
  2. Start streaming as user one
  3. Register user two
  4. User two is a viewer of user one stream
  5. View stream for 10 minutes
  6. Stop
  7. Repeat
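
The cycle above can be automated as a soak loop that records the cycle at which the server first stops accepting connections. This is only a sketch: `run_cycle` and `can_connect` are hypothetical callables standing in for the real test steps and for whatever connectivity check is used. They are not part of any Kurento API.

```python
import time
from typing import Callable, Optional


def soak(run_cycle: Callable[[int], None],
         can_connect: Callable[[], bool],
         max_cycles: int,
         pause_s: float = 0.0) -> Optional[int]:
    """Repeat the test cycle until the server stops accepting connections.

    Returns the 1-based number of the first cycle after which `can_connect`
    reported a failure, or None if all `max_cycles` cycles passed.
    """
    for cycle in range(1, max_cycles + 1):
        run_cycle(cycle)        # hypothetical: steps 1-6 (register, stream, view, stop)
        if not can_connect():   # hypothetical: health check after each cycle
            return cycle
        time.sleep(pause_s)     # optional pause before repeating (step 7)
    return None
```

Logging the returned cycle number together with a timestamp makes it easier to line the failure up with the KMS log.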

Expected result: Kurento keeps accepting connections.

Actual result: Kurento process is alive, but ws connection can't be established.

Last logs kurento has written:
2018-06-15 05:06:23,491615 756 [0x00007fd3fbfff700] debug KurentoMediaSet MediaSet.cpp:122 doGarbageCollection() Running garbage collector
2018-06-15 05:10:23,491924 756 [0x00007fd3fbfff700] debug KurentoMediaSet MediaSet.cpp:122 doGarbageCollection() Running garbage collector
2018-06-15 05:10:23,492051 756 [0x00007fd3fbfff700] warning KurentoMediaSet MediaSet.cpp:130 doGarbageCollection() Session timeout: 3d7c941c-ec4b-43ea-875f-370fc73672b0
2018-06-15 05:10:23,492354 756 [0x00007fd403fff700] debug KurentoMediaElementImpl MediaElementImpl.cpp:1013 disconnect() Disconnecting 15d6b0e4-0678-4a0c-924a-89b90ded59be_kurento.MediaPipeline/7016d3f4-754c-48c1-86d8-dcbc28f8395a_kurento.WebRtcEndpoint - 15d6b0e4-0678-4a0c-924a-89b90ded59be_kurento.MediaPipeline/29b3bbcd-a974-4fd5-a402-f9de1033a32b_proctorplugin.ProctorPlugin params AUDIO default default
2018-06-15 05:10:26,492446 756 [0x00007fd408ea4700] warning KurentoWorkerPool WorkerPool.cpp:155 operator()() Worker threads locked. Spawning a new one.

After that any ws connection to kurento will be refused.
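
To tell "process alive but port not accepting" apart from other failures, a plain TCP probe of the WebSocket port can help. The sketch below uses only the Python standard library; the host and the idea of polling it are assumptions, and it checks only that a TCP connection can be opened, not that the WebSocket handshake or a Kurento request succeeds.

```python
import socket


def port_accepts_tcp(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `port_accepts_tcp("127.0.0.1", 8888)` would probe the port configured in kurento.conf.json above, assuming the probe runs on the same machine as KMS.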

UP:

netstat -anpl | grep kurento | wc -l
464

root@ip-10-0-1-73:/home/ubuntu# free -m
              total        used        free      shared  buff/cache   available
Mem:          15038         996       11592          40        2449       13623
Swap:             0           0

lsof | grep kurento | wc -l
530025 (see attached file)

cat /proc/1681/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             60053                60053                processes
Max open files            766673               766673               files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       60053                60053                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

pstree -p 1681 | wc -l
572
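
The same counters can be sampled over time directly from /proc, which may be easier to compare against the limits above than `lsof` output (as far as I know, `lsof` can list a descriptor once per thread, which would inflate its line count). This is a minimal Linux-only sketch; the function names and the `proc_root` parameter are made up for illustration.

```python
import os


def count_entries(path: str) -> int:
    """Count directory entries, or return -1 if the directory is unreadable."""
    try:
        return len(os.listdir(path))
    except OSError:
        return -1


def process_usage(pid: int, proc_root: str = "/proc") -> dict:
    """Count open file descriptors and threads of one process."""
    base = os.path.join(proc_root, str(pid))
    return {
        "open_fds": count_entries(os.path.join(base, "fd")),   # one entry per descriptor
        "threads": count_entries(os.path.join(base, "task")),  # one entry per thread
    }
```

Calling `process_usage(1681)` periodically (as root or as the KMS user) and logging the result would show whether descriptors or threads grow steadily before the stall.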

xenyou commented 5 years ago

I have the same problem. Very critical!

xenyou commented 5 years ago

I don't know whether it's related, but the garbage collector often runs just before KMS stalls.

2018-07-12 15:00:06,045955 24956 [0x00007fed74fd8700] debug KurentoMediaSet MediaSet.cpp:122 doGarbageCollection() Running garbage collector

Does the GC do something evil?

PonyHugger commented 5 years ago

Same problem over here! We need to get this fixed!

Jaehdawg commented 5 years ago

We see this issue as well.

buzeandrei commented 5 years ago

We are having the same issue with a client in production. After a couple of hours of streaming, Kurento doesn't accept connections anymore and we need to restart the app. We think it might be due to insufficient CPU/RAM, since almost 100% of the memory is in use after it crashes.

micaelgallego commented 5 years ago

What kind of connections is KMS not accepting? WebSocket connections using KurentoClient?

darchuk commented 5 years ago

"What kind of connections is KMS not accepting? WebSocket connections using KurentoClient?"

Correct. By the way, I can't even open a WebSocket connection to Kurento directly, bypassing KurentoClient.

buzeandrei commented 5 years ago

In my case, Kurento seemed not to listen on the default WebSocket port at all.

micaelgallego commented 5 years ago

Usually the application server uses KurentoClient to connect to KMS and keeps the WebSocket connection open. Is this your case? Do requests time out? Or do you connect and disconnect KurentoClient from time to time?

We are preparing KMS load testing to try to solve these issues.

darchuk commented 5 years ago

Yes, we use KurentoClient to maintain WebSocket connections, and we keep KurentoClient connected for as long as the application server is running. At some point, requests to Kurento start failing with timeout exceptions (we use the default 30 s timeout).

xenyou commented 5 years ago

There are two cases in our production.

Case 1

  1. Boot kurento-media-server.
  2. Connect to the server using kurento-client (ws); the connection is established.
  3. Exchange some requests with KMS via kurento-client (Node module).
  4. At some point, kurento-client gets a timeout for a request.

Case 2

  1. Boot kurento-media-server.
  2. Try to connect to the server, but the connection fails. The kurento-client module prints this message repeatedly:
reconnect to server 1549 10000 dd6db11f-ab91-4fd5-b0d2-3bb80504905f
reconnect to server 1549 10000 a9a72208-52df-4e32-ac1f-8b96fac683e8
reconnect to server 1549 10000 b1ef16c9-72f3-4db5-b6f8-9ba5d3482cdf
reconnect to server 1549 10000 75c19c81-bc60-457d-99e2-5c122a349986

Note: In our environment, a Kurento ping message is sent to the server once per minute for monitoring.

https://doc-kurento.readthedocs.io/en/stable/features/kurento_protocol.html#ping
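
For reference, a ping in the Kurento protocol is a JSON-RPC 2.0 request with method `ping`, and the reply carries `"value": "pong"` in its result, as I understand the page linked above. The sketch below only builds the request and checks a reply; the helper names are hypothetical and the WebSocket transport is deliberately left out, since the Python standard library has no WebSocket client.

```python
import json


def build_ping(request_id: int, interval_ms: int) -> str:
    """Serialize a Kurento-style JSON-RPC ping request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "ping",
        "params": {"interval": interval_ms},  # how often the client intends to ping
    }
    return json.dumps(message)


def is_pong(raw_reply: str, request_id: int) -> bool:
    """Return True if raw_reply is a pong answering the given request id."""
    try:
        reply = json.loads(raw_reply)
    except ValueError:
        return False
    if not isinstance(reply, dict):
        return False
    result = reply.get("result")
    return (reply.get("id") == request_id
            and isinstance(result, dict)
            and result.get("value") == "pong")
```

A monitor that sends `build_ping(...)` once per minute and treats a missing or non-pong reply as a failure would detect the Case 1 timeouts; Case 2 would already fail when opening the connection.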

xenyou commented 5 years ago

Because of case 2, I don't think load is the main factor.

alexei-talankov-oxagile commented 5 years ago

Any updates on this?

j1elo commented 4 years ago

Please change this line to increase the number of threads from 1 to 10:

static WorkerPool workers (10);

Then build and test whether this improves the issue (instructions here to build from sources).
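
To illustrate why the thread count might matter: with a single worker, one handler that never returns keeps every later task waiting in the queue. The sketch below shows that effect with Python's `ThreadPoolExecutor` as a stand-in; it is not Kurento's `WorkerPool` and says nothing about what actually blocks inside KMS.

```python
import concurrent.futures as cf
import threading


def second_task_runs(pool_size: int, wait_s: float = 0.5) -> bool:
    """Submit one stuck task, then a quick one; report whether the quick one ran."""
    release = threading.Event()
    pool = cf.ThreadPoolExecutor(max_workers=pool_size)
    try:
        pool.submit(release.wait)            # stands in for a handler that is stuck
        quick = pool.submit(lambda: "done")  # stands in for the next queued event
        try:
            return quick.result(timeout=wait_s) == "done"
        except cf.TimeoutError:
            return False                     # starved: no free worker picked it up
    finally:
        release.set()                        # unblock the stuck task so the pool can exit
        pool.shutdown(wait=True)
```

Here `second_task_runs(1)` returns False and `second_task_runs(10)` returns True. More threads only help if fewer than that many handlers get stuck at once, so this change would reduce the symptom rather than explain it.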

j1elo commented 4 years ago

You can now test this with an experimental installation of Kurento, using this apt-get repository in your /etc/apt/sources.list.d/kurento.list:

deb http://ubuntu.openvidu.io/eventhandler-10-threads xenial kms6

Or with this Docker image https://hub.docker.com/r/kurento/kurento-media-server-exp

kurento/kurento-media-server-exp:eventhandler-10-threads

darchuk commented 4 years ago

Hi j1elo, thanks for the answer. Why should we prefer the experimental Kurento build over the regular one? What if we just fork the Kurento module and change it ourselves?

j1elo commented 4 years ago

It's just a more convenient method for some users that don't want or don't know how to build from sources; I had to do the images for some internal tests anyway, so I just told you about them here.

Of course, if you can do as I said in my previous comment https://github.com/Kurento/bugtracker/issues/259#issuecomment-523077731 then just do it and let me know if the change improves anything.

xenyou commented 4 years ago

Using KMS 6.10 with Ubuntu 18.04 might solve this issue.

We recently changed versions (KMS from 6.6 to 6.10, Ubuntu from 16.04 to 18.04), and the issue has not happened since.

But I'm not sure. I'll watch the situation for a while.

j1elo commented 4 years ago

Ok let me know if it happens... the eventhandler-10-threads branch (https://github.com/Kurento/kms-core/commit/5e9cb3f46f731f90bd9ee3e14464497231d7b6c4) is still waiting for feedback to know if it's a real improvement or not, for this issue.

If you don't see the problem with KMS 6.10 and Ubuntu 18.04, then it means the change in eventhandler-10-threads is irrelevant. But then we wouldn't know if it got solved by 6.10 or by Ubuntu 18.04 :-)

In any case, let us know of what you find out, thanks!

xenyou commented 4 years ago

It hasn't happened again since upgrading Ubuntu and KMS.

hdev72 commented 2 years ago

I have this problem too. Has anybody resolved this?