Closed vincentfretin closed 6 months ago
v0.14.2 bumps JANUS_PLUGIN_API_VERSION to 18 and v0.14.3 will bump it to 19, which requires updates in janus-plugin-rs, but for now the changes are not relevant to janus-plugin-sfu.
Memory doesn't seem to be properly freed when users leave a room with the latest version of libnice. Versions tested:
ok libnice 2021-02-21 (post 0.1.18) 36aa468c4916cfccd4363f0e27af19f2aeae8604
ok libnice 2021-04-19 21:05 (post 0.1.18) e1a841356d227a15a32477b18186df036dd3c479
ok libnice 2020-07-06 13:53 (post 0.1.18) 48dac0d702b134f7b11b92602c234ba1120cc75b agent: keep a track of the candidate refreshes being pruned
ok libnice 2020-12-06 19:09 (post 0.1.18) b353f30cfce498ffc2f1057d2d14aeb4b183e671 udp-turn: don't allocate large arrays on the stack
ko libnice 2020-12-06 19:12 (post 0.1.18) a3f535669de74ba707fbd11c268ac04e60564a65 agent: don't allocate large arrays on the stack
=> the two previous commits come from https://gitlab.freedesktop.org/libnice/libnice/-/merge_requests/171
ko libnice 2021-04-20 15:09 (post 0.1.18) 8ccac4b04759edaecead7470e28865aff8066921
ko libnice 2020-12-19 17:56 (post 0.1.18) caf9f1d8b1d4675c8c88f8f6fa04d3ff5e27f09a
ko libnice 2021-05-11 10:52 (post 0.1.18) a874865807cdd3b27f63e32c9d2675fe31878158
ko libnice 2021-08-19 15:12 (post 0.1.18) 4401e346a7517614d9ff51bdaec5adaaf55c2f5e
ko libnice 2022-05-03 (0.1.19) 64ef27e4847016568985f0f3c1fe4a4fb632e408
ko libnice 2023-01-06 (0.1.20) ca7e68f06ead448996da612b06eac5f2c5c5b713
ko libnice 2023-01-07 (0.1.21) 3d9cae16a5094aadb1651572644cb5786a8b4e2d
ko libnice 2024-03-04 (0.1.22) ae3eb16fd7d1237353bf64e899c612b8a63bca8a
Looking at the RSS column (memory) of the ps aux command with:
watch "ps aux |grep janus|grep ^nobody"
with recent libnice versions, the memory never decreases when users leave the room.
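The same RSS figure that ps aux prints can also be read programmatically on Linux from the VmRSS line of /proc/<pid>/status. The following is only a minimal sketch of that idea; the function names are hypothetical and are not part of Janus or janus-plugin-sfu.

```rust
use std::fs;

/// Extracts the VmRSS value (in kB) from the text of /proc/<pid>/status.
/// Returns None if the line is missing or cannot be parsed.
fn parse_vm_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|number| number.parse().ok())
}

/// Reads the resident set size of a process by pid (Linux only).
/// Hypothetical helper for illustration.
fn read_rss_kb(pid: u32) -> Option<u64> {
    let text = fs::read_to_string(format!("/proc/{pid}/status")).ok()?;
    parse_vm_rss_kb(&text)
}
```

Calling read_rss_kb with the Janus pid after each join/leave cycle would give the same series of numbers as watching the ps aux output by hand.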
48dac0d702b134f7b11b92602c234ba1120cc75b is the latest commit that works correctly.
BUT even with this commit, the memory isn't freed if the event_loops configuration is used in janus.jcfg, like this in .env:
EVENT_LOOPS=4
ALLOW_LOOP_INDICATION=true
Some info for the debug build:
ubuntu 24.04: remove cargo from the apt package list and add libasan8
janus:
RUN git clone -b 0.x https://github.com/meetecho/janus-gateway.git && \
cd janus-gateway && \
git checkout e933d2224b272af35afeb1f472d0fe2640b34ae6 && \
sh autogen.sh && \
CFLAGS="${CFLAGS} -O0 -g -ggdb3 -fno-omit-frame-pointer -fsanitize=address -fno-sanitize-recover=all -fsanitize-address-use-after-scope" LDFLAGS="-fsanitize=address" ./configure --prefix=/usr --disable-all-plugins --disable-all-handlers && \
make && make install && make configs && \
cd / && rm -rf janus-gateway
janus plugin:
RUN git clone -b master https://github.com/capticxyz/janus-plugin-sfu-captic.git janus-plugin-sfu && \
cd janus-plugin-sfu && \
git checkout 8a4969214e7b86219fdd2d7981a02fda005ed4c6 && \
rm Cargo.lock && \
echo version 2 increment this line to invalidate cache of this layer while iterating build during development && \
curl https://sh.rustup.rs -sSf | sh -s -- --default-toolchain nightly -y && \
. "$HOME/.cargo/env" && \
rustup component add rust-src --toolchain nightly-x86_64-unknown-linux-gnu && \
RUSTFLAGS=-Zsanitizer=address cargo build -Zbuild-std --target x86_64-unknown-linux-gnu && \
mkdir -p /usr/lib/janus/plugins && \
mkdir -p /usr/lib/janus/events && \
cp target/x86_64-unknown-linux-gnu/debug/libjanus_plugin_sfu.so /usr/lib/janus/plugins && \
cd / && rm -rf janus-plugin-sfu ~/.cargo
https://gist.github.com/arpu/93c980ba1ca5410cf3e3d8a8b17f4a9c
https://github.com/meetecho/janus-gateway/commit/7bd1a432547ef6e06715f6864d49f06f16b89da7#diff-c3788174f08849e2bace2994661bedc4bc858764da32f3eeb20815df8c2b8e4c and https://github.com/meetecho/janus-gateway/commit/122dd83c54a31e7fdd7cc2dcf5a218912fbe1f9e don't need any change in janus-plugin-rs. I merged your changes, @arpu, to update glib-sys from 0.10 to 0.19. bitflags could also be updated to a new major version, but I'm not sure what needs to be changed in the code.
I pushed the patch that's currently in https://github.com/meetecho/janus-gateway/pull/3371
I fixed the second memory leak:
Direct leak of 22 byte(s) in 10 object(s) allocated from:
#0 0x7f613f2e6b37 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
#1 0x7f613ee57af9 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x62af9) (BuildId: 9753724b85d60f97b5d5663181ef7f4e69a62131)
#2 0x7f613ee6d4c8 in g_strdup (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x784c8) (BuildId: 9753724b85d60f97b5d5663181ef7f4e69a62131)
#3 0x55df62920c7f in g_strdup_inline /usr/include/glib-2.0/glib/gstrfuncs.h:321
#4 0x55df62920c7f in janus_process_incoming_request /janus-gateway/janus.c:1758
#5 0x55df6292db37 in janus_transport_task /janus-gateway/janus.c:3478
#6 0x7f613ee86541 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x91541) (BuildId: 9753724b85d60f97b5d5663181ef7f4e69a62131)
#7 0x7f613ee80c81 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x8bc81) (BuildId: 9753724b85d60f97b5d5663181ef7f4e69a62131)
#8 0x7f613f249109 in asan_thread_start ../../../../src/libsanitizer/asan/asan_interceptors.cpp:234
SUMMARY: AddressSanitizer: 22 byte(s) leaked in 10 allocation(s).
That one was on the Rust side this time: g_free was not called for the transaction_text char* passed as a parameter to the Rust handle_message:
https://github.com/meetecho/janus-gateway/blob/a7767ad30b803d96e11b491547bcf5660cb7a937/janus.c#L1758-L1759
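To illustrate the kind of fix described here, the sketch below wraps a C string handed over by the caller in a guard that frees it when it goes out of scope. It is not the actual janus-plugin-sfu patch: OwnedCStr, handle_message_sketch, demo_strdup and demo_free are hypothetical names, and demo_strdup/demo_free only stand in for g_strdup/g_free so the example needs nothing beyond the standard library.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_void};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Signature shared by C deallocators such as g_free.
type FreeFn = unsafe extern "C" fn(*mut c_void);

/// Owns a C string received from the caller and releases it on drop.
struct OwnedCStr {
    ptr: *mut c_char,
    free_fn: FreeFn,
}

impl OwnedCStr {
    /// Safety: `ptr` must be null or a valid NUL-terminated string
    /// that `free_fn` is allowed to release exactly once.
    unsafe fn new(ptr: *mut c_char, free_fn: FreeFn) -> Self {
        OwnedCStr { ptr, free_fn }
    }

    /// Copies the text into a Rust String (None for a null pointer).
    fn to_owned_string(&self) -> Option<String> {
        if self.ptr.is_null() {
            return None;
        }
        let text = unsafe { CStr::from_ptr(self.ptr) };
        Some(text.to_string_lossy().into_owned())
    }
}

impl Drop for OwnedCStr {
    fn drop(&mut self) {
        if !self.ptr.is_null() {
            unsafe { (self.free_fn)(self.ptr as *mut c_void) };
        }
    }
}

/// Counts how many strings the demo deallocator released.
static FREED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for g_strdup (assumption: demo only).
fn demo_strdup(s: &str) -> *mut c_char {
    CString::new(s).expect("no interior NUL").into_raw()
}

/// Stand-in for g_free (assumption: demo only); pairs with demo_strdup.
unsafe extern "C" fn demo_free(p: *mut c_void) {
    unsafe { drop(CString::from_raw(p as *mut c_char)) };
    FREED.fetch_add(1, Ordering::SeqCst);
}

/// Hypothetical handler: takes ownership of `transaction` and frees it
/// on every return path because the guard is dropped at end of scope.
fn handle_message_sketch(transaction: *mut c_char) -> Option<String> {
    let owned = unsafe { OwnedCStr::new(transaction, demo_free) };
    owned.to_owned_string()
}
```

Tying the release to Drop means early returns and error paths free the string as well, which is where a manually placed free call is easy to forget.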
With libnice 48dac0d702b134f7b11b92602c234ba1120cc75b (post 0.1.18) I don't see any memory leak anymore.
I did a quick test with a release build:
docker run --net=host -e EVENT_LOOPS=4 -e MESSAGE_THREADS=1 janus:latest
with 4 users entering a room, leaving, entering again, leaving, and so on, while running
watch "ps aux |grep janus|grep ^nobody"
The RSS column gave:
39708 -> 39384 -> 40152 -> 39232 -> 40256 -> 39708 -> 40348 -> 39776 -> 40672 -> 40232
That seems right.
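One rough way to turn "that seems right" into a number is to compare the average RSS of the first half of the samples with the average of the second half. This is only a sketch of one possible heuristic; the function name is hypothetical, and any threshold for what counts as a leak is an assumption to tune per setup.

```rust
/// Returns the growth, in percent, of the second-half average over the
/// first-half average. None if there are fewer than two samples or the
/// first-half average is zero.
fn half_over_half_growth_pct(samples: &[u64]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let mid = samples.len() / 2;
    let avg = |s: &[u64]| s.iter().sum::<u64>() as f64 / s.len() as f64;
    let first = avg(&samples[..mid]);
    let second = avg(&samples[mid..]);
    if first == 0.0 {
        return None;
    }
    Some((second - first) / first * 100.0)
}
```

For the ten RSS values listed above, the two half averages are 39746.4 and 40147.2, so the growth is about 1%, consistent with memory being released between sessions rather than accumulating.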
I tested libnice 0.1.22 (ae3eb16fd7d1237353bf64e899c612b8a63bca8a) again: the memory only keeps growing. We have this leak:
Direct leak of 69840 byte(s) in 2 object(s) allocated from:
#0 0x725d9ca7e4d0 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x725d9c5ec7a1 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x637a1) (BuildId: 9753724b85d60f97b5d5663181ef7f4e69a62131)
#2 0x580861beb9a4 in janus_sctp_association_create /janus-gateway/sctp.c:189
#3 0x580861ad1e47 in janus_dtls_srtp_create_sctp /janus-gateway/dtls.c:689
#4 0x580861b748e6 in janus_process_incoming_request /janus-gateway/janus.c:1662
#5 0x580861b84b37 in janus_transport_task /janus-gateway/janus.c:3478
#6 0x725d9c61a541 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x91541) (BuildId: 9753724b85d60f97b5d5663181ef7f4e69a62131)
#7 0x725d9c614c81 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x8bc81) (BuildId: 9753724b85d60f97b5d5663181ef7f4e69a62131)
#8 0x725d9c9e1109 in asan_thread_start ../../../../src/libsanitizer/asan/asan_interceptors.cpp:234
That one I really have no clue.
Image size reduced a lot! From 1.2GB to 155MB (amd64).
And for Raspberry:
arm64: 181MB (Raspberry Pi 400 based on ubuntu:24.04)
armhf: 186MB (Raspberry Pi 2 based on debian:bookworm)
Update docker image to ubuntu:24.04, libwebsockets 4.3.3, libsrtp 2.6.0, libnice, usrsctp master, janus-gateway v0.14.2+
Refs https://github.com/networked-aframe/naf-janus-adapter/issues/62