meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0

[1.x] crash when ws connected #3191

Closed nikohpng closed 1 year ago

nikohpng commented 1 year ago

Description

Two publishers were publishing streams. Suddenly one publisher went offline, and then it re-published its stream.

At that point janus-gateway crashed.

Environment

JanusLog

I was testing at the time with debug logging enabled, but nothing was logged when the crash happened.

CoreDump

#0  0x00007f3c383456bd in lws_buflist_next_segment_len () from /usr/local/lib/libwebsockets.so.19
#1  0x00007f3c383aa578 in rops_handle_POLLIN_ws () from /usr/local/lib/libwebsockets.so.19
#2  0x00007f3c3837e29f in lws_service_fd_tsi () from /usr/local/lib/libwebsockets.so.19
#3  0x00007f3c3833ab03 in _lws_plat_service_forced_tsi () from /usr/local/lib/libwebsockets.so.19
#4  0x00007f3c3833af59 in _lws_plat_service_tsi () from /usr/local/lib/libwebsockets.so.19
#5  0x00007f3c3833afbc in lws_plat_service () from /usr/local/lib/libwebsockets.so.19
#6  0x00007f3c3837e467 in lws_service () from /usr/local/lib/libwebsockets.so.19
#7  0x00007f3c385f1f17 in janus_websockets_thread (data=0x13bb9e0) at transports/janus_websockets.c:1119
#8  0x00007f3c8f4c2abd in g_thread_proxy (data=0x13b29e0) at ../subprojects/glib-2.64.2/glib/gthread.c:807
#9  0x00007f3c8dd60ea5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f3c8da898dd in clone () from /lib64/libc.so.6

atoppi commented 1 year ago

Worth mentioning that @nikohpng already opened an issue on lws too.

I'd wait for their feedback since the crash seems to happen deep in the library while handling a buffer.

lminiero commented 1 year ago

@nikohpng any reason why you closed this? It doesn't seem like this was addressed in the libwebsockets repo yet.

nikohpng commented 1 year ago

Sorry, I meant to close a different issue.

lminiero commented 1 year ago

Any update on this? Did it crash again? I don't see activity on the lws issue, and it doesn't look like something we need to fix in Janus from what I've understood. If so, I'll close this.

shivanshtalwar commented 1 year ago

Hey guys, I don't know if it's related to the issue I am facing, but I'll mention it just in case. @lminiero, I was trying to set up videoroom cascading in our in-house orchestrator server, and we are using WebSocket as the transport to connect to the Janus servers individually. As soon as more than 3 publishers publish successfully, Janus destroys the WebSocket connection to our orchestrator, with a different error at different times in websocket.on('error'):

  1. Invalid op code error
  2. Websocket RangeError: Invalid WebSocket frame: RSV2 and RSV3 must be clear

We also tried an HTTP polling-based approach. In that case the Janus REST service (8088) gives a timeout error as we add polling on each Janus session maintained in the orchestrator.

atoppi commented 1 year ago

@shivanshtalwar that sounds unrelated to the issue and to Janus code in general. Maybe something is broken in the way you are handling websocket connections and frames.
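
For reference, both of those errors come from checks on the first byte of a WebSocket frame. The following is only a rough sketch of those checks as described in RFC 6455 (section 5.2), not code from Janus, libwebsockets, or any client library. The function name `check_frame_header`, the `permessage_deflate` flag, and the returned strings are hypothetical and chosen for illustration.

```python
# Opcodes defined by RFC 6455, section 5.2; every other value is reserved.
KNOWN_OPCODES = {
    0x0: "continuation",
    0x1: "text",
    0x2: "binary",
    0x8: "close",
    0x9: "ping",
    0xA: "pong",
}


def check_frame_header(first_byte: int, permessage_deflate: bool = False) -> str:
    """Validate the first byte of a WebSocket frame (hypothetical helper).

    Bit layout of the first byte: FIN | RSV1 | RSV2 | RSV3 | opcode (4 bits).
    """
    rsv1 = bool(first_byte & 0x40)
    rsv2 = bool(first_byte & 0x20)
    rsv3 = bool(first_byte & 0x10)
    opcode = first_byte & 0x0F

    # RSV2 and RSV3 must be zero unless an extension defines them.
    if rsv2 or rsv3:
        return "RSV2 and RSV3 must be clear"
    # RSV1 is only meaningful when compression (RFC 7692) was negotiated.
    if rsv1 and not permessage_deflate:
        return "RSV1 must be clear"
    if opcode not in KNOWN_OPCODES:
        return "invalid opcode"
    return "ok"
```

A receiver that sees either failure is typically reading bytes that do not start at a frame boundary, for example because the stream was corrupted or written to concurrently, which is why the connection handling on the client side is the first thing to check.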

shivanshtalwar commented 1 year ago

@atoppi and what do you have to say about polling? Why does it start giving timeout errors? Is it resource-intensive from Janus's point of view?

atoppi commented 1 year ago

@shivanshtalwar this is not the right place to discuss how to manage the Janus API. Please stop submitting your questions in an unrelated issue and refer to our discourse group.

lminiero commented 1 year ago

Closing.