meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0

Possible memory leak in [1.2] #3408

Open do-not-set-2fa opened 2 months ago

do-not-set-2fa commented 2 months ago

What version of Janus is this happening on? master (from 2024-07-18) const char *janus_version_string = "1.2.4";

Have you tested a more recent version of Janus too? Yes

Was this working before? Yes

Is there a gdb or libasan trace of the issue? libasan report

Additional context: a simple peer connection via the Lua plugin, followed by cleanup of all sessions and shutdown of Janus.

lminiero commented 2 months ago

I see references to incoming requests, which makes me think there were still active connections to Janus when you shut it down.

do-not-set-2fa commented 2 months ago

OK, I will check it again, but first vacation ;)

do-not-set-2fa commented 2 months ago

I repeated the same steps and killed all sessions (I made sure that the session_list admin request returned an empty list), and got the same leaks from incoming requests. Could it be that the session recovery timer keeps them in memory until that time has passed and they are fully cleared?

atoppi commented 2 months ago

Just test by terminating after the session reclaim timeout has passed. E.g., if it's 30 seconds, wait 60 seconds after the last destroy, then kill the server.

do-not-set-2fa commented 2 months ago

I have a 120-second reclaim timeout and waited more than 3 minutes, but got the same libasan output.

lminiero commented 2 months ago

You can try uncommenting the REFCOUNT_DEBUG define in refcount.h and recompile, which should show more info on the references Janus still has at shutdown.
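As a rough illustration of what reference-count tracing provides, here is a minimal, self-contained sketch in C. It is not the code from Janus's refcount.h: the demo_refcount type, the DEMO_REFCOUNT_DEBUG switch and the demo_session struct are hypothetical names made up for this example. With the switch on, every increase and decrease logs its call site and the resulting count, so a reference that is taken but never released shows up as an increase with no matching decrease.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical debug switch for this sketch (NOT the Janus define) */
#define DEMO_REFCOUNT_DEBUG 1

typedef struct demo_refcount demo_refcount;
struct demo_refcount {
    int count;                         /* number of live references */
    void (*free_fn)(demo_refcount *);  /* called when count drops to zero */
};

static void demo_refcount_init(demo_refcount *ref, void (*free_fn)(demo_refcount *)) {
    ref->count = 1;  /* the creator owns the first reference */
    ref->free_fn = free_fn;
}

/* Log the call site and the count after the operation (debug builds only) */
static void demo_refcount_trace(const char *op, demo_refcount *ref, const char *file, int line) {
#if DEMO_REFCOUNT_DEBUG
    fprintf(stderr, "[%s:%d] %p %s -> %d\n", file, line, (void *)ref, op, ref->count);
#else
    (void)op; (void)ref; (void)file; (void)line;
#endif
}

static void demo_refcount_increase_at(demo_refcount *ref, const char *file, int line) {
    ref->count++;
    demo_refcount_trace("increase", ref, file, line);
}

static void demo_refcount_decrease_at(demo_refcount *ref, const char *file, int line) {
    assert(ref->count > 0);
    ref->count--;
    demo_refcount_trace("decrease", ref, file, line);
    if(ref->count == 0)
        ref->free_fn(ref);  /* last reference gone: release the object */
}

/* Macros capture the caller's file and line for the trace */
#define demo_refcount_increase(ref) demo_refcount_increase_at((ref), __FILE__, __LINE__)
#define demo_refcount_decrease(ref) demo_refcount_decrease_at((ref), __FILE__, __LINE__)

/* Hypothetical session object protected by the counter above */
typedef struct demo_session {
    demo_refcount ref;  /* first member, so a demo_refcount pointer can be cast back */
    int id;
} demo_session;

static int demo_sessions_freed = 0;

static void demo_session_free(demo_refcount *ref) {
    demo_session *session = (demo_session *)ref;
    demo_sessions_freed++;
    free(session);
}

static demo_session *demo_session_create(int id) {
    demo_session *session = malloc(sizeof(*session));
    if(session == NULL)
        return NULL;
    session->id = id;
    demo_refcount_init(&session->ref, demo_session_free);
    return session;
}
```

In this sketch, if some code path calls demo_refcount_increase(&session->ref) and forgets the matching decrease, the owner's final decrease leaves the count at 1 and demo_session_free never runs. The trace on stderr then points at the file and line of the unmatched increase.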

atoppi commented 2 months ago

I suspect that janus_lua_session_free is never being called because the lua session is the only thing getting malloc'ed while creating the plugin session.

atoppi commented 2 months ago

@do-not-set-2fa are you able to repro with one of the stock lua plugins?

atoppi commented 2 months ago

I was trying to repro with echotest.lua and bumped into a memory leak that seems different

==52251==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 640 byte(s) in 8 object(s) allocated from:
    #0 0x58f97aeac743 in malloc (/usr/local/janus/bin/janus+0x27b743) (BuildId: 1a85cdaa67c38c85f1006f74bd2cbbe7db8a5234)
    #1 0x7c5ae62e8af9 in g_malloc /usr/src/glib2.0-2.80.0-6ubuntu3.1/debian/build/deb/../../../glib/gmem.c:100:13
    #2 0x7c5ae63416f1 in g_system_thread_new /usr/src/glib2.0-2.80.0-6ubuntu3.1/debian/build/deb/../../../glib/gthread-posix.c:1265:12
    #3 0x7c5ae275c1fb in janus_lua_method_pushevent /home/atoppi/src/janus-gateway/src/plugins/janus_lua.c:554:3
    #4 0x7c5ae1c56f7d in luaD_precall /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/ldo.c:434:12
    #5 0x7c5ae1c617f8 in luaV_execute /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/lvm.c:1134:13
    #6 0x7c5ae1c4f888 in luaD_rawrunprotected /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/ldo.c:142:3
    #7 0x7c5ae1c522e7 in lua_resume /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/ldo.c:664:12
    #8 0x7c5ae1c6e97f in auxresume /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/lcorolib.c:39:12
    #9 0x7c5ae1c6eba9 in luaB_coresume /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/lcorolib.c:60:7
    #10 0x7c5ae1c56f7d in luaD_precall /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/ldo.c:434:12
    #11 0x7c5ae1c617f8 in luaV_execute /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/lvm.c:1134:13
    #12 0x7c5ae1c57237 in luaD_call /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/ldo.c:499:5
    #13 0x7c5ae1c572ab in luaD_callnoyield /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/ldo.c:509:3
    #14 0x7c5ae1c572ab in lua_callk /build/lua5.3-LmUA4D/lua5.3-5.3.6/src/lapi.c:925:5
    #15 0x7c5ae2783c5b in janus_lua_scheduler /home/atoppi/src/janus-gateway/src/plugins/janus_lua.c:2533:4
    #16 0x7c5ae6311c81 in g_thread_proxy /usr/src/glib2.0-2.80.0-6ubuntu3.1/debian/build/deb/../../../glib/gthread.c:835:20
    #17 0x58f97aeaa22c in asan_thread_start(void*) asan_interceptors.cpp.o

This is a leak for sure, since for any echotest session started and closed, the amount of leaked bytes grows. Wondering if the root cause might be similar.
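The trace above only shows that memory allocated while creating a thread was never released. It does not say what the eventual fix looks like. As a general illustration of this class of leak, here is a small sketch that uses plain POSIX threads rather than GLib. The names demo_event_worker and demo_push_event are hypothetical and unrelated to the Janus sources. The point is that a thread handle has to be released explicitly: a joinable thread that is never joined or detached keeps its bookkeeping allocated after it exits, so the leak grows with every thread spawned.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

static atomic_int demo_events_handled = 0;

/* Hypothetical worker: stands in for a short-lived thread that delivers one event */
static void *demo_event_worker(void *arg) {
    int *event_id = arg;
    atomic_fetch_add(&demo_events_handled, 1);
    free(event_id);  /* the worker owns its argument */
    return NULL;
}

/* Spawn a worker and release its handle; returns 0 on success.
 * wait_for_it != 0: block until the worker is done (join).
 * wait_for_it == 0: fire and forget (detach). */
static int demo_push_event(int event_id, int wait_for_it) {
    int *arg = malloc(sizeof(*arg));
    if(arg == NULL)
        return -1;
    *arg = event_id;
    pthread_t thread;
    if(pthread_create(&thread, NULL, demo_event_worker, arg) != 0) {
        free(arg);
        return -1;
    }
    /* Dropping `thread` here without one of the two calls below would leave
     * the thread's resources allocated after it terminates */
    if(wait_for_it)
        return pthread_join(thread, NULL);
    return pthread_detach(thread);
}
```

GLib has the same ownership rule in a different shape: g_thread_new() returns a reference that the caller is expected to release with g_thread_unref() or consume with g_thread_join().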

atoppi commented 2 months ago

#3409 should fix the leak mentioned in the message above.

atoppi commented 2 months ago

With #3409 I can't reproduce memory leaks anymore with echotest.lua

do-not-set-2fa commented 2 months ago

Great, thanks. I will test too in a week or so, after vacation.

Regards, Mirko

do-not-set-2fa commented 1 month ago

@atoppi what transport plugin did you use?

Have you tried with http?

atoppi commented 1 month ago

@do-not-set-2fa it's the same with both ws and http. No leak when using this patch and the official echotest.lua.

do-not-set-2fa commented 1 month ago

OK, this would explain the slow leak that I am experiencing over a long period of time (even though I can't seem to get the same libasan report as you do; I'm probably doing something wrong :)

I'll wait for the merge and new tag and then start to deploy it to see the effect over time.

Thanks a lot @atoppi and @lminiero

atoppi commented 3 weeks ago

@do-not-set-2fa the PR has been merged and will soon be backported to 0.x too.

do-not-set-2fa commented 3 weeks ago

Great, thanks.
