meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0
8.15k stars 2.47k forks source link

websocket transport disconnected occasionally #2081

Closed tomnotcat closed 4 years ago

tomnotcat commented 4 years ago

I use videoroom plugin with websocket transport, most time janus works well, but occasionally websocket will disconnect, the janus log file output: [libwebsockets][ERR] ** 0x7fe6cc0026e0: Sending new 187 (<81>~), pending truncated ... It's illegal to do an lws_write outside of the writable callback: fix your code

lminiero commented 4 years ago

That's exactly what we do, if you even bothered to check the code. But it you think that's not the case, I will consider "fixing my code" when you say "please".

tomnotcat commented 4 years ago

I'm sorry for your misunderstand, “fix your code” is part of libwebsockets log, Not added by me

lminiero commented 4 years ago

Sorry about that, I shouldn't read mails early in the morning... :blush: It's weird we get that error, though, because we definitely only do writes from the lws callbacks. I'll reopen and check on Monday.

tomnotcat commented 4 years ago

I'm appreciated for all your guys amazing work!

I googled and found this post, https://github.com/warmcat/libwebsockets/issues/852

It seems we should test lws_send_pipe_choked before call lws_write when receive LWS_CALLBACK_SERVER_WRITEABLE

But I'm not sure whether this will fix the bug, I'm going to test it for a while.

lminiero commented 4 years ago

Ah, thanks for the pointer! This comment makes sense indeed:

However two or more lws_write() inside the writable callback can also cause it, if the first one was unable to send everything.

So you should marshall what you want to write into a buffer and write it once (on lws_write() call) per writable callback.

as we do have a mechanism to send pending data. I'll have to check if we try to send more stuff in sequence, which is likely. I'll keep you posted.

lminiero commented 4 years ago

Mh, no, that's not our case: I verified and in the LWS_CALLBACK_CLIENT_WRITEABLE callback we only do a singe lws_write and then call lws_callback_on_writable again before returning, waiting for the callback to be invoked again. This means we never call lws_write more than once in the same callback.

tomnotcat commented 4 years ago

I tested for some days in production environment, After add lws_send_pipe_choked check, this error never happen again. I created a pull request that fix this bug: https://github.com/meetecho/janus-gateway/pull/2107

lminiero commented 4 years ago

Closing as the fix was merged.