htfy96 / c10k-server

[Abandoned] A toy asynchronous server, written in C++14
Apache License 2.0
19 stars 4 forks source link

Test failed: Read when socket is closed #3

Closed htfy96 closed 7 years ago

htfy96 commented 7 years ago

https://travis-ci.org/htfy96/c10k-server/jobs/195788297

5/5 Test #5: worker_thread_test ...............***Exception: SegFault  3.23 sec

[2017-01-27 07:48:57.243] [Server] [info] A new worker thread created

[2017-01-27 07:48:57.243] [DEBUG] [info] Main thread detached

[2017-01-27 07:49:00.243] [DEBUG] [info] Start to create clientWorker

[2017-01-27 07:49:00.243] [Client] [info] A new worker thread created

[2017-01-27 07:49:00.243] [DEBUG] [info] ClientWorker created

[2017-01-27 07:49:00.244] [DEBUG] [info] New fd added 9: 0/500

[2017-01-27 07:49:00.244] [Server] [info] new fd created = 7

[2017-01-27 07:49:00.244] [Server] [info] new fd created = 11

[2017-01-27 07:49:00.244] [DEBUG] [info] New fd added 12: 1/500

[2017-01-27 07:49:00.245] [Server] [info] new fd created = 13

[2017-01-27 07:49:00.245] [DEBUG] [info] New fd added 14: 2/500

[2017-01-27 07:49:00.245] [Server] [info] new fd created = 15

[2017-01-27 07:49:00.245] [DEBUG] [info] New fd added 7: 3/500

[2017-01-27 07:49:00.246] [Server] [info] new fd created = 9

[2017-01-27 07:49:00.247] [DEBUG] [info] New fd added 11: 4/500

[2017-01-27 07:49:00.247] [Server] [info] new fd created = 12

[2017-01-27 07:49:00.247] [DEBUG] [info] New fd added 13: 5/500

[2017-01-27 07:49:00.248] [Server] [info] new fd created = 14

[2017-01-27 07:49:00.248] [DEBUG] [info] New fd added 7: 6/500

[2017-01-27 07:49:00.249] [DEBUG] [info] New fd added 11: 7/500

[2017-01-27 07:49:00.249] [Server] [info] new fd created = 9

[2017-01-27 07:49:00.250] [DEBUG] [info] New fd added 7: 8/500

[2017-01-27 07:49:00.250] [Server] [info] new fd created = 12

[2017-01-27 07:49:00.251] [Server] [info] new fd created = 13

[2017-01-27 07:49:00.251] [DEBUG] [info] New fd added 9: 9/500

[2017-01-27 07:49:00.253] [Server] [info] new fd created = 11

[2017-01-27 07:49:00.253] [DEBUG] [info] New fd added 7: 10/500

[2017-01-27 07:49:00.254] [Server] [info] new fd created = 12

[2017-01-27 07:49:00.254] [DEBUG] [info] New fd added 9: 11/500

[2017-01-27 07:49:00.255] [Server] [info] new fd created = 13

[2017-01-27 07:49:00.255] [DEBUG] [info] New fd added 7: 12/500

[2017-01-27 07:49:00.256] [Server] [info] new fd created = 11

[2017-01-27 07:49:00.256] [DEBUG] [info] New fd added 9: 13/500

[2017-01-27 07:49:00.257] [Server] [info] new fd created = 12

[2017-01-27 07:49:00.257] [DEBUG] [info] New fd added 7: 14/500

[2017-01-27 07:49:00.258] [Server] [info] new fd created = 13

[2017-01-27 07:49:00.258] [DEBUG] [info] New fd added 9: 15/500

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

worker_thread-test is a Catch v1.6.1 host application.

Run with -? for options

-------------------------------------------------------------------------------

multiple WorkerThread test

-------------------------------------------------------------------------------

/data/test/worker_thread_test.cpp:98

...............................................................................

/data/test/worker_thread_test.cpp:98: FAILED:

due to unexpected exception with message:

  Read when socket is closed

===============================================================================

test casesworker_thread-test is a Catch v1.6.1 host application.

Run with -? for options

: 1 | 1 failed

assertions: 2 | 2 failed

80% tests passed, 1 tests failed out of 5

Total Test time (real) =   5.59 sec

The following tests FAILED:

      5 - worker_thread_test (SEGFAULT)
htfy96 commented 7 years ago

https://travis-ci.org/htfy96/c10k-server/jobs/195895315

htfy96 commented 7 years ago

The two bugs are probably caused by data race in eventloop::handle_event: Event e { data->fd } and eventloop:modify_event which changes data->fd in another thread. Current workaround is to force handle_event to acquire map_mutex before reading data->fd

https://github.com/htfy96/c10k-server/commit/7ce2d49d912b1af59b4d5c1bd439abe6eaaf4f47

htfy96 commented 7 years ago

Still broken in https://travis-ci.org/htfy96/c10k-server/jobs/195955076 7ce2d49.

1x SEGV 2x unhandled exception read when socket is closed

No DR was reported

htfy96 commented 7 years ago

With logbt we acquired backtrace of SEGV: https://travis-ci.org/htfy96/c10k-server/jobs/196185054

worker_thread-test: ../nptl/pthread_mutex_lock.c:81: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

Core was generated by `/data/build/worker_thread-test'.

Program terminated with signal SIGABRT, Aborted.

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
58  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.

[Current thread is 1 (Thread 0x7ff0ced02700 (LWP 48))]

Thread 4 (Thread 0x7ff0cfd04700 (LWP 46)):
#0  0x00007ff0d0105ced in write () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007ff0d0088927 in _IO_new_file_write (f=0x7ff0d03d0600 <_IO_2_1_stdout_>, data=0x891af0, n=707) at fileops.c:1271
#2  0x00007ff0d008a492 in new_do_write (to_do=707, data=0x891af0 "\n\n", '~' <repeats 158 times>, "\n\nworker_thread-test is a Catch v1.6.1 h"..., fp=0x7ff0d03d0600 <_IO_2_1_stdout_>) at fileops.c:526
#3  _IO_new_do_write (fp=0x7ff0d03d0600 <_IO_2_1_stdout_>, data=0x891af0 "\n\n", '~' <repeats 158 times>, "\n\nworker_thread-test is a Catch v1.6.1 h"..., to_do=707) at fileops.c:502
#4  0x00007ff0d0087ec8 in _IO_new_file_sync (fp=0x7ff0d03d0600 <_IO_2_1_stdout_>) at fileops.c:882
#5  0x00007ff0d007ca1c in __GI__IO_fflush (fp=0x7ff0d03d0600 <_IO_2_1_stdout_>) at iofflush.c:40
#6  0x00007ff0d06fd98e in std::ostream::flush() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x0000000000467d14 in Catch::ConsoleReporter::printTestCaseAndSectionHeader() ()
#8  0x0000000000467efc in Catch::ConsoleReporter::lazyPrint() ()
#9  0x00000000004683dc in Catch::ConsoleReporter::assertionEnded(Catch::AssertionStats const&) ()
#10 0x000000000045f881 in Catch::RunContext::assertionEnded(Catch::AssertionResult const&) ()
#11 0x0000000000448f6d in Catch::ResultBuilder::handleResult(Catch::AssertionResult const&) ()
#12 0x000000000044a292 in Catch::ResultBuilder::captureExpression() ()
#13 0x000000000046b9c1 in Catch::RunContext::handleFatalErrorCondition(std::string const&) ()
#14 0x00000000004579e5 in Catch::fatal(std::string const&, int) ()
#15 0x0000000000457a8c in Catch::FatalConditionHandler::handleSignal(int) ()
#16 <signal handler called>
#17 0x00007ff0c8000078 in ?? ()
#18 0x000000000047dc35 in c10k::EventLoop::loop() ()
#19 0x0000000000445201 in std::thread::_Impl<std::_Bind_simple<std::reference_wrapper<c10k::detail::WorkerThread<ServerHandler> > ()> >::_M_run() ()
#20 0x00007ff0d06a78f0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#21 0x00007ff0d097b6ca in start_thread (arg=0x7ff0cfd04700) at pthread_create.c:333
#22 0x00007ff0d01160af in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Thread 3 (Thread 0x7ff0d0da9740 (LWP 45)):
#0  __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x00007ff0d007d992 in __GI__IO_fwrite (buf=0x7ff0d03d1760 <_IO_stdfile_1_lock>, size=1, count=26, fp=0x7ff0d012407c <__lll_lock_wait_private+28>) at iofwrite.c:37
#2  0x00007ff0d06fe9f6 in std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x000000000046794b in Catch::ConsoleReporter::printOpenHeader(std::string const&) ()
#4  0x0000000000467ae5 in Catch::ConsoleReporter::printTestCaseAndSectionHeader() ()
#5  0x0000000000467efc in Catch::ConsoleReporter::lazyPrint() ()
#6  0x00000000004683dc in Catch::ConsoleReporter::assertionEnded(Catch::AssertionStats const&) ()
#7  0x000000000045f881 in Catch::RunContext::assertionEnded(Catch::AssertionResult const&) ()
#8  0x0000000000448f6d in Catch::ResultBuilder::handleResult(Catch::AssertionResult const&) ()
#9  0x000000000044a292 in Catch::ResultBuilder::captureExpression() ()
#10 0x000000000044e158 in Catch::ResultBuilder::useActiveException(Catch::ResultDisposition::Flags) ()
#11 0x0000000000452e02 in Catch::runTests(Catch::Ptr<Catch::Config> const&) ()
#12 0x000000000043099a in main ()

Thread 2 (Thread 0x7ff0cf503700 (LWP 47)):
#0  0x00007ff0d09848fd in accept () at ../sysdeps/unix/syscall-template.S:84
#1  0x0000000000432954 in std::thread::_Impl<std::_Bind_simple<____C_A_T_C_H____T_E_S_T____0()::{lambda()#1} ()> >::_M_run() ()
#2  0x00007ff0d06a78f0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ff0d097b6ca in start_thread (arg=0x7ff0cf503700) at pthread_create.c:333
#4  0x00007ff0d01160af in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Thread 1 (Thread 0x7ff0ced02700 (LWP 48)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
#1  0x00007ff0d00453ea in __GI_abort () at abort.c:89
#2  0x00007ff0d003bbb7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x7ff0d09873c2 "mutex->__data.__owner == 0", file=file@entry=0x7ff0d09873a5 "../nptl/pthread_mutex_lock.c", line=line@entry=81, function=function@entry=0x7ff0d0987530 <__PRETTY_FUNCTION__.8476> "__pthread_mutex_lock") at assert.c:92
#3  0x00007ff0d003bc62 in __GI___assert_fail (assertion=assertion@entry=0x7ff0d09873c2 "mutex->__data.__owner == 0", file=file@entry=0x7ff0d09873a5 "../nptl/pthread_mutex_lock.c", line=line@entry=81, function=function@entry=0x7ff0d0987530 <__PRETTY_FUNCTION__.8476> "__pthread_mutex_lock") at assert.c:101
#4  0x00007ff0d097e068 in __GI___pthread_mutex_lock (mutex=<optimized out>) at ../nptl/pthread_mutex_lock.c:81
#5  0x000000000047dc62 in c10k::EventLoop::loop() ()
#6  0x0000000000445171 in std::thread::_Impl<std::_Bind_simple<std::reference_wrapper<c10k::detail::WorkerThread<ClientHandler> > ()> >::_M_run() ()

#7  0x00007ff0d06a78f0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6

#8  0x00007ff0d097b6ca in start_thread (arg=0x7ff0ced02700) at pthread_create.c:333
#9  0x00007ff0d01160af in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
htfy96 commented 7 years ago

Here is the most detailed backtrace until now from travis CI:

Thread 3 (Thread 0x7f49e28c2700 (LWP 45)):
#0  0x00007f49e324d323 in std::locale::locale() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007f49e324a9c8 in std::ios_base::_M_init() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007f49e32a1c51 in std::basic_ios<char, std::char_traits<char> >::init(std::basic_streambuf<char, std::char_traits<char> >*) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f49e32c3d26 in std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >::basic_ostringstream(std::_Ios_Openmode) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x0000559c47a0744e in Catch::ConsoleReporter::SummaryColumn::addRow (this=0x7f49e28bd900, count=2) at /data/vendor/Catch/include/reporters/catch_reporter_console.hpp:329
#5  0x0000559c47a079ed in Catch::ConsoleReporter::printTotals (this=0x559c48c54fa0, totals=...) at /data/vendor/Catch/include/reporters/catch_reporter_console.hpp:364
#6  0x0000559c47a05ea3 in Catch::ConsoleReporter::testRunEnded (this=0x559c48c54fa0, _testRunStats=...) at /data/vendor/Catch/include/reporters/catch_reporter_console.hpp:97
#7  0x0000559c479fa5e4 in Catch::RunContext::handleFatalErrorCondition (this=0x7ffe07827440, message="SIGSEGV - Segmentation violation signal") at /data/vendor/Catch/include/internal/catch_run_context.hpp:255
#8  0x0000559c479f8312 in Catch::fatal (message="SIGSEGV - Segmentation violation signal", exitCode=-11) at /data/vendor/Catch/include/internal/catch_fatal_condition.hpp:19
#9  0x0000559c479f83d7 in Catch::FatalConditionHandler::handleSignal (sig=11) at /data/vendor/Catch/include/internal/catch_fatal_condition.hpp:59
#10 <signal handler called>
#11 0x00007f49dc0008b0 in ?? ()
#12 0x0000559c47a5c052 in std::_Mem_fn_base<void (c10k::Connection::*)(c10k::Event const&), true>::_M_call<std::shared_ptr<c10k::Connection>&, c10k::Event const&> (this=0x7f49dc0069b0, __ptr=std::shared_ptr (empty) 0x0) at /usr/include/c++/5/functional:634
#13 0x0000559c47a5be0d in std::_Mem_fn_base<void (c10k::Connection::*)(c10k::Event const&), true>::operator()<std::shared_ptr<c10k::Connection>&, c10k::Event const&, void> (this=0x7f49dc0069b0, __object=std::shared_ptr (empty) 0x0) at /usr/include/c++/5/functional:610 // why do we have an empty shared_ptr here?
#14 0x0000559c47a5b8ab in std::_Bind<std::_Mem_fn<void (c10k::Connection::*)(c10k::Event const&)> (std::shared_ptr<c10k::Connection>, std::_Placeholder<1>)>::__call<void, c10k::Event const&, 0ul, 1ul>(std::tuple<c10k::Event const&>&&, std::_Index_tuple<0ul, 1ul>) (this=0x7f49dc0069b0, __args=<unknown type in /data/build/worker_thread-test, CU 0x17316a, DIE 0x1b7cca>) at /usr/include/c++/5/functional:1074
#15 0x0000559c47a5aa28 in std::_Bind<std::_Mem_fn<void (c10k::Connection::*)(c10k::Event const&)> (std::shared_ptr<c10k::Connection>, std::_Placeholder<1>)>::operator()<c10k::Event const&, void>(c10k::Event const&) (this=0x7f49dc0069b0) at /usr/include/c++/5/functional:1133
#16 0x0000559c47a5903c in std::_Function_handler<void (c10k::Event const&), std::_Bind<std::_Mem_fn<void (c10k::Connection::*)(c10k::Event const&)> (std::shared_ptr<c10k::Connection>, std::_Placeholder<1>)> >::_M_invoke(std::_Any_data const&, c10k::Event const&) (__functor=..., __args#0=...) at /usr/include/c++/5/functional:1871
#17 0x0000559c47a4ff28 in std::function<void (c10k::Event const&)>::operator()(c10k::Event const&) const (this=0x7f49d4000f28, __args#0=...) at /usr/include/c++/5/functional:2267
#18 0x0000559c47a4d745 in c10k::EventLoop::<lambda(const epoll_event&)>::operator()(const epoll_event &) const (__closure=0x7f49e28bec60, ev=...) at /data/src/c10k/event_loop.cpp:64
#19 0x0000559c47a4e004 in std::for_each<epoll_event*, c10k::EventLoop::handle_events(epoll_event*, epoll_event*)::<lambda(const epoll_event&)> >(epoll_event *, epoll_event *, c10k::EventLoop::<lambda(const epoll_event&)>) (__first=0x7f49e28bed6c, __last=0x7f49e28bed54, __f=...) at /usr/include/c++/5/bits/stl_algo.h:3767
#20 0x0000559c47a4d79e in c10k::EventLoop::handle_events (this=0x7ffe078264c0, st=0x7f49e28bed60, ed=0x7f49e28bed54) at /data/src/c10k/event_loop.cpp:65
#21 0x0000559c47a4d5da in c10k::EventLoop::loop (this=0x7ffe078264c0) at /data/src/c10k/event_loop.cpp:49
#22 0x0000559c479de0ce in c10k::detail::WorkerThread<ServerHandler>::operator() (this=0x7ffe078264c0) at /data/src/c10k/worker_thread.hpp:42
#23 0x0000559c479ddf59 in std::__invoke<c10k::detail::WorkerThread<ServerHandler>>(c10k::detail::WorkerThread<ServerHandler>&) (__f=...) at /usr/include/c++/5/functional:201
#24 0x0000559c479ddd73 in std::reference_wrapper<c10k::detail::WorkerThread<ServerHandler> >::operator()<>() const (this=0x559c48c59528) at /usr/include/c++/5/functional:428
#25 0x0000559c479ddb30 in std::_Bind_simple<std::reference_wrapper<c10k::detail::WorkerThread<ServerHandler> > ()>::_M_invoke<>(std::_Index_tuple<>) (this=0x559c48c59528) at /usr/include/c++/5/functional:1531
#26 0x0000559c479dd30a in std::_Bind_simple<std::reference_wrapper<c10k::detail::WorkerThread<ServerHandler> > ()>::operator()() (this=0x559c48c59528) at /usr/include/c++/5/functional:1520
#27 0x0000559c479dc92f in std::thread::_Impl<std::_Bind_simple<std::reference_wrapper<c10k::detail::WorkerThread<ServerHandler> > ()> >::_M_run() (this=0x559c48c59510) at /usr/include/c++/5/thread:115

#28 0x00007f49e32658f0 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6

#29 0x00007f49e35396ca in start_thread (arg=0x7f49e28c2700) at pthread_create.c:333
#30 0x00007f49e2cd40af in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
htfy96 commented 7 years ago
#1  0x0000562d556e639c in std::_Mem_fn<void (c10k::Connection::*)(c10k::Event const&)>::_M_call<std::shared_ptr<c10k::Connection>&, c10k::Event const&> (this=0x7fbb70003b30, __ptr=std::shared_ptr (count 1744830584, weak 32698) 0x7fbb68002770) at /usr/include/c++/4.9/functional:526

Since shared_cnt block is placed at the end of Handler, I guess there exists overflow in this program :(

htfy96 commented 7 years ago
#9  0x000055ab953e3c3e in c10k::EventLoop::handle_events (this=0x7ffdde173f30, st=0x7f953002fd60, ed=0x7f953002fd54) at /data/src/c10k/event_loop.cpp:65

WTF

htfy96 commented 7 years ago

Finally fixed in https://github.com/htfy96/c10k-server/commit/d0a931efce8ef1b222128addee25b3a656ae9b1f https://github.com/htfy96/c10k-server/commit/458d1772e22819f30133a5ad859a8dd16131b06c https://github.com/htfy96/c10k-server/commit/7ce2d49d912b1af59b4d5c1bd439abe6eaaf4f47 https://github.com/htfy96/c10k-server/commit/8d3abb5833412634ed944e04dbb607c30d5fbad6 https://github.com/htfy96/c10k-server/commit/75928e8be75e60edd377952ca28b84ec035b6c10 https://github.com/htfy96/c10k-server/commit/29daa0739ca506a07b3ce80830fb6329c5ba8bb5