Closed zevarito closed 1 year ago
Please test with #3247 too, since it addresses a couple of race conditions in the VideoRoom.
Thanks @lminiero, will do.
I know you guys are on vacation right now; I am just updating the issue to keep you in the loop for when you get back.
I've updated a few servers with the patch mentioned above, but one of them only stayed up for 2 days. It did not crash in the same way as the previous crashes, but it lasted much less than previous revisions did. I am attaching the full BT; let me know if you need anything else, thank you!
#0 0x00007f91b022d843 in janus_videoroom_handler (data=<optimized out>) at plugins/janus_videoroom.c:11609
iter = {dummy1 = 0x7f912c92e2a0, dummy2 = 0x0, dummy3 = 0x0, dummy4 = 8, dummy5 = 0, dummy6 = 0xd}
value = 0x0
audiocodec = <optimized out>
vp9_profile = <optimized out>
temp = <optimized out>
jsep = <optimized out>
videoroom = <optimized out>
error_str = '\000' <repeats 344 times>...
start = <optimized out>
count = <optimized out>
answer = <optimized out>
h264_profile = <optimized out>
https://gist.github.com/zevarito/d71a35b43d1e699f2791784a952dad2e
Thanks for the backtrace. Can you try with https://github.com/meetecho/janus-gateway/pull/3259?
@zevarito any update? Did you try the proposed patch?
@atoppi no servers have crashed so far with this patch, and the longest-running one has been up for about 10 days now. However, previous releases sometimes took 3 months to crash, so I am deploying carefully. Performance-wise, the patch doesn't seem to introduce any regressions.
@atoppi just in case, I was talking about #3259; unfortunately #3247 keeps crashing, as mentioned above.
Thanks @zevarito, I've marked #3259 as ready for review in that case.
Closing as I merged @tmatth's patch.
What version of Janus is this happening on? It happens on the latest commit, e1c7704, but it has been happening for a few months now.
Have you tested a more recent version of Janus too? Yes.
Was this working before? This same exception, and a similar one in the same function, have been around since May at least, I think.
Is there a gdb or libasan trace of the issue?
Additional context: I've also seen a related exception, for which I don't have the backtrace at hand, that fails in the same function but a little earlier; the core dump generated for it says `janus` instead of `hloop`. My guess is that it is some sort of race condition when that list is generated and a publisher leaves the room at the same time. It crashes after approximately 2 weeks on every server, but I have the sense that the latest build might crash earlier; I updated recently and will let you know.
Let me know if you need me to provide any extra info, and thank you very much for looking into this issue!
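To make the guess above concrete, here is a minimal sketch of the general pattern that avoids this kind of race: the code that builds a list of publishers and the code that removes a publisher take the same mutex. This is not the Janus code and not what the merged patch does; `room_t`, `room_add`, `room_remove` and `room_list` are hypothetical names made up for illustration, and a plain array stands in for whatever container the plugin really uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_PUBLISHERS 16

/* Hypothetical stand-in for a room: publisher ids guarded by one mutex. */
typedef struct {
    pthread_mutex_t mutex;
    int publishers[MAX_PUBLISHERS];
    size_t count;
} room_t;

void room_init(room_t *room) {
    assert(room != NULL);
    pthread_mutex_init(&room->mutex, NULL);
    room->count = 0;
}

/* Add a publisher; returns 0 on success, -1 if the room is full. */
int room_add(room_t *room, int id) {
    int ret = -1;
    pthread_mutex_lock(&room->mutex);
    if (room->count < MAX_PUBLISHERS) {
        room->publishers[room->count++] = id;
        ret = 0;
    }
    pthread_mutex_unlock(&room->mutex);
    return ret;
}

/* Remove a publisher (it "leaves the room"); returns 0 if found, -1 otherwise.
 * Holding the same mutex as room_list() means a listing in progress
 * never observes a half-updated array. */
int room_remove(room_t *room, int id) {
    int ret = -1;
    pthread_mutex_lock(&room->mutex);
    for (size_t i = 0; i < room->count; i++) {
        if (room->publishers[i] == id) {
            /* Fill the gap with the last entry and shrink the list. */
            room->publishers[i] = room->publishers[--room->count];
            ret = 0;
            break;
        }
    }
    pthread_mutex_unlock(&room->mutex);
    return ret;
}

/* Copy the current publisher list into out while holding the lock;
 * returns the number of entries copied (at most out_len). */
size_t room_list(room_t *room, int *out, size_t out_len) {
    pthread_mutex_lock(&room->mutex);
    size_t n = room->count < out_len ? room->count : out_len;
    for (size_t i = 0; i < n; i++)
        out[i] = room->publishers[i];
    pthread_mutex_unlock(&room->mutex);
    return n;
}
```

If the listing ran without the lock while another thread was inside `room_remove`, it could read an entry that is in the middle of being replaced, which is the kind of interleaving I suspect here. Build with `-pthread`.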