Glimesh / janus-ftl-plugin

A plugin for the Janus WebRTC gateway to enable relaying of audio/video streams utilizing Mixer's FTL (Faster-Than-Light) protocol.
https://hayden.fyi/posts/2020-08-03-Faster-Than-Light-protocol-engineering-notes.html
GNU Affero General Public License v3.0

Possible file descriptor leak? #18

Closed haydenmc closed 3 years ago

haydenmc commented 4 years ago

We've seen instances of the service failing because it had too many file descriptors open, even though we don't expect many to be open at once. Valgrind or a similar tool should shed some light on this.

danstiner commented 3 years ago
Can confirm.

I tested by running Janus under valgrind: valgrind --track-fds=yes /opt/janus/bin/janus

If I connected once, disconnected, and then stopped the server, there would be 11 open sockets.

If I connected and disconnected nine times and then stopped the server, there would be 19 open sockets. As the log below shows, each time an FTL client starts a new stream, one AF_INET socket on port 8084 is leaked. The trace points at the accept() call in startListenThread, so the leaked descriptors are the per-connection sockets accepted on the main listener: https://github.com/Glimesh/janus-ftl-plugin/blob/2b1319ecd50bde6d67a1920ee1cead306eccb961/IngestServer.cpp#L102

Log snippet from valgrind:

Bye!
==15672== 
==15672== FILE DESCRIPTORS: 18 open at exit.
==15672== Open AF_INET socket 26: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 25: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 24: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 23: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 22: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 21: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 20: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 19: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open AF_INET socket 7: 192.168.40.29:8084 <-> unbound
==15672==    at 0x506A49F: accept (accept.c:26)
==15672==    by 0x7DCBF0E: IngestServer::startListenThread() (IngestServer.cpp:102)
==15672==    by 0x7DD0CE8: void std::__invoke_impl<void, void (IngestServer::*)(), IngestServer*>(std::__invoke_memfun_deref, void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:73)
==15672==    by 0x7DD0C02: std::__invoke_result<void (IngestServer::*)(), IngestServer*>::type std::__invoke<void (IngestServer::*)(), IngestServer*>(void (IngestServer::*&&)(), IngestServer*&&) (invoke.h:95)
==15672==    by 0x7DD0B52: void std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) (thread:244)
==15672==    by 0x7DD0A1E: std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> >::operator()() (thread:251)
==15672==    by 0x7DD09A7: std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (IngestServer::*)(), IngestServer*> > >::_M_run() (thread:195)
==15672==    by 0x178DDD83: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==15672==    by 0x505F608: start_thread (pthread_create.c:477)
==15672==    by 0x519B292: clone (clone.S:95)
==15672== 
==15672== Open file descriptor 4:
==15672==    at 0x519B48B: eventfd (syscall-template.S:78)
==15672==    by 0x4B7F77C: ??? (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x4B33EC9: g_main_context_new (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x4B33FC4: g_main_context_default (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x4B373AC: g_main_loop_new (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x122A3A: main (janus.c:5311)
==15672== 
==15672== Open file descriptor 15:
==15672==    at 0x519B48B: eventfd (syscall-template.S:78)
==15672==    by 0x4B7F77C: ??? (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x4B33EC9: g_main_context_new (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x3FD7ED27: janus_http_init (janus_http.c:589)
==15672==    by 0x121BF2: main (janus.c:5248)
==15672== 
==15672== Open AF_UNIX socket 14: <unknown>
==15672==    at 0x519C93E: socketpair (syscall-template.S:78)
==15672==    by 0x3FD7A9AE: ??? (in /opt/janus/lib/janus/transports/libjanus_http.so.0.0.0)
==15672==    by 0x121BF2: main (janus.c:5248)
==15672== 
==15672== Open AF_UNIX socket 13: <unknown>
==15672==    at 0x519C93E: socketpair (syscall-template.S:78)
==15672==    by 0x3FD7A9AE: ??? (in /opt/janus/lib/janus/transports/libjanus_http.so.0.0.0)
==15672==    by 0x121BF2: main (janus.c:5248)
==15672== 
==15672== Open AF_INET socket 6: 0.0.0.0:8084 <-> unbound
==15672==    at 0x519C90B: socket (syscall-template.S:78)
==15672==    by 0x7DCB95F: IngestServer::Start() (IngestServer.cpp:41)
==15672==    by 0x7DD2939: JanusFtl::Init(janus_callbacks*, char const*) (JanusFtl.cpp:50)
==15672==    by 0x7DD0FA5: Init(janus_callbacks*, char const*) (janus_ftl.cpp:105)
==15672==    by 0x123685: main (janus.c:5129)
==15672== 
==15672== Open file descriptor 5:
==15672==    at 0x519B48B: eventfd (syscall-template.S:78)
==15672==    by 0x4B7F77C: ??? (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x4B33EC9: g_main_context_new (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6400.3)
==15672==    by 0x7B47F9F: janus_videoroom_init (janus_videoroom.c:2402)
==15672==    by 0x123685: main (janus.c:5129)
==15672== 
==15672== Open file descriptor 2: /home/dan/projects/glimesh/janus-ftl-plugin/output-6.log
==15672==    <inherited from parent>
==15672== 
==15672== Open file descriptor 1: /home/dan/projects/glimesh/janus-ftl-plugin/output-6.log
==15672==    <inherited from parent>
==15672== 
==15672== Open file descriptor 0: /dev/pts/2
==15672==    <inherited from parent>
==15672== 
==15672== 
==15672== Events    : Ir
==15672== Collected : 6120144545
==15672== 
==15672== I   refs:      6,120,144,545

local-valgrind-output-startstops-file-leak.log

danstiner commented 3 years ago

My first guess would be this is just a missing call to close() on the listen socket after the listener has entirely stopped.

I see a call to shutdown() but not one to close(). Maybe adding a close() after line 83 would be enough, but I want to think it through more. https://github.com/Glimesh/janus-ftl-plugin/blob/2b1319ecd50bde6d67a1920ee1cead306eccb961/IngestServer.cpp#L75-L84

This guide explains the difference between shutdown and close: https://beej.us/guide/bgnet/html/#close-and-shutdownget-outta-my-face
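The distinction matters here because shutdown() only stops traffic on a socket; the descriptor stays allocated until close() is called. The sketch below illustrates that with two hypothetical helpers (fdIsOpen and shutdownAndClose are illustrative names, not functions from the plugin):

```cpp
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: true while fd still occupies a slot in the
// process's descriptor table (fcntl fails with EBADF once it is closed).
bool fdIsOpen(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

// Hypothetical helper: end the conversation, then release the descriptor.
void shutdownAndClose(int fd)
{
    shutdown(fd, SHUT_RDWR); // stops reads and writes; the fd is still allocated
    close(fd);               // returns the fd to the operating system
}
```

Calling only shutdown() on a connected socket leaves fdIsOpen() returning true, which matches the sockets Valgrind reports as still open at exit; after shutdownAndClose() it returns false.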

haydenmc commented 3 years ago

D'oh

The disconnect sequencing needs some love in general - there are other things that need improvement here (like gracefully letting the client know they're about to be disconnected).