CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz

nat-helper segfaults when running on public server #258

Closed: reduzent closed this issue 1 year ago

reduzent commented 1 year ago

I compiled nat-helper on Debian Bullseye (amd64) and intend to run it as a service on a server with a public IP address. It seems to crash after running for a few days, and I haven't found a way to trigger the crash deliberately. It may well be that random endpoints that don't speak the protocol connect and send arbitrary messages that make it crash.

I'm not too familiar with cmake. If there is a simple way to create a debug build, please let me know and I'll try to get a backtrace from such a crash.
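
For reference, a debug build with CMake generally comes down to setting `CMAKE_BUILD_TYPE=Debug` at configure time. The sketch below wraps that, plus a non-interactive gdb run, in two shell functions. The function names and the source and build directory arguments are hypothetical; only the `cmake` and `gdb` options are standard, and `-S`/`-B` assume CMake 3.13 or newer.

```shell
# Configure and build with debug symbols.
# $1: directory containing CMakeLists.txt (assumed layout)
# $2: build directory to create or reuse (assumed name)
build_debug() {
    cmake -S "$1" -B "$2" -DCMAKE_BUILD_TYPE=Debug &&
    cmake --build "$2"
}

# Run a program under gdb without an interactive prompt and print a
# backtrace once it stops, for example on SIGSEGV.
# All arguments are passed through as the program and its arguments.
backtrace_on_crash() {
    gdb -batch -ex run -ex backtrace --args "$@"
}
```

Something like `build_debug . build-debug` followed by `backtrace_on_crash build-debug/nat-helper -p 17990` would then be the usual pattern; where the binary ends up inside the build directory is an assumption here.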

This is what I get from the service log:

Sep 25 13:36:13 tpf-server nat-helper[487]: Error reading client description
Sep 26 14:57:50 tpf-server nat-helper[487]: Error reading client description
Sep 26 14:57:50 tpf-server nat-helper[487]: Error reading client description
Sep 26 14:57:50 tpf-server nat-helper[487]: Error reading client description
Sep 26 14:57:50 tpf-server nat-helper[487]: Moving client  to room
Sep 26 14:57:50 tpf-server nat-helper[487]: Removing empty room testi_audio
Sep 26 14:57:50 tpf-server nat-helper[487]: Removing empty room testi_video
Sep 26 14:57:50 tpf-server nat-helper[487]: Creating room
Sep 26 14:57:50 tpf-server systemd[1]: nat-helper.service: Main process exited, code=killed, status=11/SEGV
Sep 26 14:57:50 tpf-server systemd[1]: nat-helper.service: Failed with result 'signal'.
reduzent commented 1 year ago

Making it crash is apparently as simple as sending it two lines of garbage:

roman@nl-11852:~$ nc -v tpf-server.zhdk.ch 17990
Connection to tpf-server.zhdk.ch (86.119.43.229) 17990 port [tcp/*] succeeded!
jklgsfd
dsdfjlkdsfg
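
The same reproduction can be scripted so it is easy to repeat against a test instance. This is a minimal sketch: the function name, the `NAT_HELPER_HOST` and `NAT_HELPER_PORT` variables and the `send` argument are made up for illustration, the host defaults to a local placeholder, and only the two payload lines and the port come from the session above.

```shell
#!/bin/sh
# Placeholder target; override via the environment (hypothetical variable names).
HOST="${NAT_HELPER_HOST:-127.0.0.1}"
PORT="${NAT_HELPER_PORT:-17990}"

# Emit the two garbage lines from the report, one per line.
garbage_payload() {
    printf '%s\n' 'jklgsfd' 'dsdfjlkdsfg'
}

# Only connect when explicitly asked to, so that sourcing this file
# has no side effects. -w 5 makes nc give up after 5 seconds.
if [ "${1:-}" = "send" ]; then
    garbage_payload | nc -v -w 5 "$HOST" "$PORT"
fi
```

Running it with `send` as the first argument pipes both lines to the server in one go instead of typing them interactively.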

Here is the backtrace. I hope it is useful, although gdb reports that the binary has no debugging symbols:

(No debugging symbols found in /usr/local/bin/nat-helper)
Starting program: /usr/local/bin/nat-helper -p 17990
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Running
Error reading client description
Error reading client description
Error reading client description
Moving client  to room 
Creating room 

Program received signal SIGSEGV, Segmentation fault.
0x0000555555568cce in Client::readCandidate(std::function<void (Client&, bool)>) ()
(gdb) backtrae
Undefined command: "backtrae".  Try "help".
(gdb) backtrace
#0  0x0000555555568cce in Client::readCandidate(std::function<void (Client&, bool)>) ()
#1  0x0000555555571d7f in Room::addClient(std::shared_ptr<Client>&&) ()
#2  0x000055555555de21 in NatHelper::onClientDesc(Client&, bool) ()
#3  0x0000555555567e4b in Client::readDescComplete(std::function<void (Client&, bool)>, bool) ()
#4  0x00005555555692b9 in std::_Function_handler<void (bool), std::_Bind<void (Client::*(std::shared_ptr<Client>, std::function<void (Client&, bool)>, std::_Placeholder<1>))(std::function<void (Client&, bool)>, bool)> >::_M_invoke(std::_Any_data const&, bool&&) ()
#5  0x000055555556a2d8 in Message::async_readBodyComplete(std::function<void (bool)>, std::error_code const&, unsigned long) ()
#6  0x000055555556f06f in asio::detail::reactive_socket_recv_op<asio::mutable_buffers_1, asio::detail::read_op<asio::basic_stream_socket<asio::ip::tcp, asio::execution::any_executor<asio::execution::context_as_t<asio::execution_context&>, asio::execution::detail::blocking::never_t<0>, asio::execution::prefer_only<asio::execution::detail::blocking::possibly_t<0> >, asio::execution::prefer_only<asio::execution::detail::outstanding_work::tracked_t<0> >, asio::execution::prefer_only<asio::execution::detail::outstanding_work::untracked_t<0> >, asio::execution::prefer_only<asio::execution::detail::relationship::fork_t<0> >, asio::execution::prefer_only<asio::execution::detail::relationship::continuation_t<0> > > >, asio::mutable_buffers_1, asio::mutable_buffer const*, asio::detail::transfer_all_t, std::_Bind<void (Message::*(Message*, std::function<void (bool)>, std::_Placeholder<1>, std::_Placeholder<2>))(std::function<void (bool)>, std::error_code const&, unsigned long)> >, asio::execution::any_executor<asio::execution::context_as_t<asio::execution_context&>, asio::execution::detail::blocking::never_t<0>, asio::execution::prefer_only<asio::execution::detail::blocking::possibly_t<0> >, asio::execution::prefer_only<asio::execution::detail::outstanding_work::tracked_t<0> >, asio::execution::prefer_only<asio::execution::detail::outstanding_work::untracked_t<0> >, asio::execution::prefer_only<asio::execution::detail::relationship::fork_t<0> >, asio::execution::prefer_only<asio::execution::detail::relationship::continuation_t<0> > > >::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) ()
#7  0x000055555555d0b1 in asio::detail::scheduler::run(std::error_code&) [clone .isra.0] ()
#8  0x000055555555d5ed in NatHelper::worker() ()
#9  0x000055555555ed70 in NatHelper::run(bool) ()
#10 0x000055555555a636 in main ()
(gdb) 
mpiatka commented 1 year ago

Thanks for reporting. It should now work correctly.

reduzent commented 1 year ago

I confirm the issue is fixed. When a client sends garbage, nat-helper now closes the connection instead of crashing. Thanks!