Open bstarynk opened 4 years ago
It is probably related to signal(7) and multi-threading (see pthreads(7), nptl(7) and clone(2)...), a topic known to be delicate.
An explanation or workaround could be inspired by Qt5 and Unix signals; see also, of course, signal-safety(7) and sigevent(7).
Asked on Stack Overflow: https://stackoverflow.com/q/61370635/841108. Still buggy in commit b616defc5e54ba86980f4d9031.
Still buggy in commit 2ae9549245a5ac67055 using https://stackoverflow.com/a/61374592/841108
Perhaps we need to use pipes.
See https://doc.qt.io/qt-5/unix-signals.html for an explanation or inspiration, and also pipe(7) and signal-safety(7).
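For reference, the pipe idea mentioned above is usually called the "self-pipe trick". The following is only a minimal sketch of it, not code from HelpCovid: the names selfpipe, on_signal, install_self_pipe and wait_for_signal are hypothetical. The handler does nothing except write(2) one byte, which is async-signal-safe according to signal-safety(7), and ordinary code picks the byte up with poll(2).

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

// Hypothetical names; nothing here comes from the HelpCovid sources.
static int selfpipe[2] = {-1, -1};

// Signal handler: only write(2), which is async-signal-safe.
extern "C" void on_signal(int signo) {
    int saved_errno = errno;
    unsigned char byte = static_cast<unsigned char>(signo);
    ssize_t r = write(selfpipe[1], &byte, 1);
    (void)r;                       // nothing safe to do on failure here
    errno = saved_errno;
}

// Create the pipe and install the handler; returns 0 on success.
int install_self_pipe(int signo) {
    if (pipe(selfpipe) != 0) return -1;
    for (int fd : selfpipe) {      // non-blocking and close-on-exec ends
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
        fcntl(fd, F_SETFD, FD_CLOEXEC);
    }
    struct sigaction sa{};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, nullptr);
}

// Wait up to timeout_ms milliseconds.
// Returns the signal number, 0 on timeout, -1 on error.
int wait_for_signal(int timeout_ms) {
    assert(selfpipe[0] >= 0 && "call install_self_pipe first");
    struct pollfd pfd{selfpipe[0], POLLIN, 0};
    int n;
    do { n = poll(&pfd, 1, timeout_ms); } while (n < 0 && errno == EINTR);
    if (n <= 0) return n;
    unsigned char byte = 0;
    return read(selfpipe[0], &byte, 1) == 1 ? byte : -1;
}
```

A loop that already polls other file descriptors could add the read end of the pipe to its pollfd array instead of calling wait_for_signal separately.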
Partly fixed in commit 400aa4a5cb3e4b557fca3d63c7fc1 thanks to https://stackoverflow.com/a/61374592/841108
In commit cb8aabd2dde705819acf86d7e we still have this reproducible issue. It seems related to httplib.
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:560▪ 02.79 s‣ hcv_initialize_database: fullreqbuf=DO $$BEGIN RAISE LOG 'starting HelpCovid git cb8aabd2dde705819ac+ (built Sat 25 Apr 2020 12:11:47 PM MEST, md5 76d2c78eea7c95996904...) cleared on rimski pid 3018484'; END;$$;
/home/basile/helpcovid/helpcovid[3018484]: hcv_database.cc:568 - !! hcv_initialize_database got PostGreSQL version PostgreSQL 12.2 (Debian 12.2-4) on x86_64-pc-linux-gnu, compiled by gcc (Debian 9.3.0-8) 9.3.0, 64-bit(server version 12.2 (Debian 12.2-4))
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:353▪ 02.85 s‣ sql_register_helpcovid_instance starting
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:416▪ 02.87 s‣ sql_register_helpcovid_instance before insertion nowt=1587809561, timelong=1587809507, myuid=12752, myefuid=12752, mygid=4200, myefgid=4200, curexepath=/home/basile/helpcovid/helpcovid, cwdpath=/home/basile/helpcovid
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:464▪ 02.87 s‣ sql_register_helpcovid_instance osqlstr::
INSERT INTO tb_helpcovidinstance
(hcvinst_host, hcvinst_pid, hcvinst_execelf,
hcvinst_startime, hcvinst_buildtime,
hcvinst_cwd, hcvinst_topdir,
hcvinst_gitid, hcvinst_lastgitcommit,
hcvinst_md5sum,
hcvinst_linuxuid, hcvinst_linuxeuid,
hcvinst_linuxuser, hcvinst_linuxeffuser,
hcvinst_linuxgid, hcvinst_linuxegid,
hcvinst_compiler_version)
VALUES ('rimski', 3018484, '/home/basile/helpcovid/helpcovid',
--- @hcv_database.cc:444
to_timestamp(1587809561), to_timestamp(1587809507), '/home/basile/helpcovid', '/home/basile/helpcovid', -- @! hcv_database.cc:448
'cb8aabd2dde705819acf86d7e36bd34bac59e76a+',
'cb8aabd2dde7 testing nbs in hcv_initialize_templates',
'76d2c78eea7c95996904290e893a15cb',
--- @hcv_database.cc:452
12752, 12752, '', '',
--- @hcv_database.cc:457
4200, 4200, 'g++ (Debian 9.3.0-10) 9.3.0' )
!!!end osqlstr
/home/basile/helpcovid/helpcovid[3018484]: hcv_database.cc:471 - !! sql_register_helpcovid_instance completed, serial#1
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:588▪ 02.87 s‣ hcv_initialize_database before preparing statements in dbname=helpcovid_db user=helpcovid_usr password=passwd1234 hostaddr=127.0.0.1 port=5432
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:618▪ 02.87 s‣ preparing find_user_by_email_pstm
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:645▪ 02.87 s‣ Registering prepared SQL statement user_create_pstm
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:645▪ 02.87 s‣ Registering prepared SQL statement user_get_password_by_email_pstm
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_plugins.cc:215▪ 02.87 s‣ hcv_initialize_plugins_for_database starting with 0 plugins
/home/basile/helpcovid/helpcovid[3018484]: hcv_database.cc:591 - !! PostGreSQL database dbname=helpcovid_db user=helpcovid_usr password=passwd1234 hostaddr=127.0.0.1 port=5432 successfully initialized
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_main.cc:1017▪ 02.87 s‣ start of hcv_initialize_curlpp
/home/basile/helpcovid/helpcovid[3018484]: hcv_main.cc:1020 - !! initialized curlpp version libcurl/7.68.0 OpenSSL/1.1.1g zlib/1.2.11 brotli/1.0.7 libidn2/2.3.0 libpsl/0.21.0 (+libidn2/2.3.0) libssh2/1.8.0 nghttp2/1.40.0 librtmp/2.3 - see www.curlpp.org for more.
-: No such file or directory
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:237▪ 02.87 s‣ hcv_start_background_thread sigprocmask done
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:241▪ 02.87 s‣ hcv_start_background_thread hcv_bg_signal_fd=6
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:247▪ 02.87 s‣ hcv_start_background_thread hcv_bg_timer_fd=7
[New Thread 0x7ffff66b9700 (LWP 3019275)]
/home/basile/helpcovid/helpcovid[3018484]: hcv_background.cc:65 - !! hcv_background_thread_body starting thread hcovibg3018484
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:83▪ 02.87 s‣ hcv_background_thread_body signal mask set: #1=Hangup; #13=Broken pipe; #15=Terminated; #24=CPU time limit exceeded;
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:96▪ 02.87 s‣ hcv_background_thread_body before poll
Thread 1 "helpcovid" hit Breakpoint 3, std::thread::~thread (this=0x7fffffffdf20, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:138
138 if (joinable())
(gdb) c
Continuing.
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:253▪ 05.97 s‣ hcv_start_background_thread did start hcv_bgthread of id 140737327634176
/home/basile/helpcovid/helpcovid[3018484]: hcv_web.cc:603 - !! Starting HelpCovid web server hcv_webserver@0x5555557589f0 with hcv_weburl=http://localhost:8089/ and 2 threads and 16777216 maximal payload
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_web.cc:616▪ 05.97 s‣ helpcovid HTTP webhost='localhost' webport=8089
/home/basile/helpcovid/helpcovid[3018484]: hcv_web.cc:617 - !! weburl=http://localhost:8089/ listening on webhost=localhost webport=8089
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_web.cc:649▪ 05.97 s‣ hcv_webserver_run with webport=8089 weburl= 'http://localhost:8089/'..
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_plugins.cc:189▪ 05.97 s‣ hcv_initialize_plugins_for_web starting with 0 plugins
[New Thread 0x7ffff5eb8700 (LWP 3020004)]
[New Thread 0x7ffff56b7700 (LWP 3020005)]
Thread 1 "helpcovid" hit Breakpoint 3, std::thread::~thread (this=0x55555576d410, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:138
138 if (joinable())
(gdb) c
Continuing.
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:120▪ 12.72 s‣ hcv_background_thread_body: after poll nbfd:1
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:140▪ 12.72 s‣ hcv_background_thread_body pollable hcv_bg_signal_fd=6
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:152▪ 12.72 s‣ hcv_background_thread_body: got signalinfo #15 from hcv_bg_signal_fd=6
/home/basile/helpcovid/helpcovid[3018484]: hcv_background.cc:156 - !! hcv_background_thread_body got SIGTERM at 12.7174 elapsed seconds
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_background.cc:277▪ 12.72 s‣ start of hcv_process_SIGTERM_signal
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_web.cc:129▪ 12.72 s‣ start of hcv_stop_web
/home/basile/helpcovid/helpcovid[3018484]: hcv_web.cc:134 - !! hcv_stop_web stopped the web service
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:721▪ 12.72 s‣ hcv_close_database start
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:725▪ 12.72 s‣ hcv_close_database dbnamestr=helpcovid_db
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:480▪ 12.72 s‣ sql_unregister_helpcovid_instance starting serial#1
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:485▪ 12.72 s‣ sql_unregister_helpcovid_instance sqlstr:
DELETE FROM tb_helpcovidinstance WHERE hcvinst_id = 1
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:488▪ 12.72 s‣ sql_unregister_helpcovid_instance deleted serial#1
/home/basile/helpcovid/helpcovid[3018484]: ΔBG!hcv_database.cc:731▪ 12.72 s‣ hcv_close_database before resetting database connection
/home/basile/helpcovid/helpcovid[3018484]: hcv_database.cc:733 - !! closed database helpcovid_db
/home/basile/helpcovid/helpcovid[3018484]: hcv_background.cc:280 - !! HelpCovid terminating on rimski process 3018484 built Sat 25 Apr 2020 12:11:47 PM MEST
... md5sum 76d2c78eea7c95996904290e893a15cb lastgitcommit cb8aabd2dde7 testing nbs in hcv_initialize_templates
/home/basile/helpcovid/helpcovid[3018484]: hcv_background.cc:211 - !! hcv_background_thread_body ending thread hcovibg3018484
[Thread 0x7ffff66b9700 (LWP 3019275) exited]
[Thread 0x7ffff56b7700 (LWP 3020005) exited]
[Thread 0x7ffff5eb8700 (LWP 3020004) exited]
Thread 1 "helpcovid" hit Breakpoint 3, std::thread::~thread (this=0x555555785750, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:138
138 if (joinable())
(gdb) bt
#0 std::thread::~thread() (this=0x555555785750, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:138
#1 0x000055555568fd49 in std::_Destroy<std::thread>(std::thread*) (__pointer=0x555555785750) at /usr/include/c++/9/bits/stl_construct.h:98
#2 0x000055555568f2b4 in std::_Destroy_aux<false>::__destroy<std::thread*>(std::thread*, std::thread*)
(__first=0x555555785750, __last=0x555555785760) at /usr/include/c++/9/bits/stl_construct.h:108
#3 0x000055555568e624 in std::_Destroy<std::thread*>(std::thread*, std::thread*) (__first=0x555555785750, __last=0x555555785760)
at /usr/include/c++/9/bits/stl_construct.h:137
#4 0x000055555568d165 in std::_Destroy<std::thread*, std::thread>(std::thread*, std::thread*, std::allocator<std::thread>&)
(__first=0x555555785750, __last=0x555555785760) at /usr/include/c++/9/bits/stl_construct.h:206
#5 0x000055555568c0b1 in std::vector<std::thread, std::allocator<std::thread> >::~vector() (this=0x55555576eb18, __in_chrg=<optimized out>)
at /usr/include/c++/9/bits/stl_vector.h:677
#6 0x0000555555690ecc in httplib::ThreadPool::~ThreadPool() (this=0x55555576eb10, __in_chrg=<optimized out>) at httplib.h:388
#7 0x0000555555690ef4 in httplib::ThreadPool::~ThreadPool() (this=0x55555576eb10, __in_chrg=<optimized out>) at httplib.h:388
#8 0x000055555568de26 in std::default_delete<httplib::TaskQueue>::operator()(httplib::TaskQueue*) const
(this=0x7fffffffd8e8, __ptr=0x55555576eb10) at /usr/include/c++/9/bits/unique_ptr.h:81
#9 0x000055555568c9c8 in std::unique_ptr<httplib::TaskQueue, std::default_delete<httplib::TaskQueue> >::~unique_ptr()
(this=0x7fffffffd8e8, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/unique_ptr.h:292
#10 0x000055555568b753 in httplib::Server::listen_internal() (this=0x5555557589f0) at httplib.h:3481
#11 0x000055555568b290 in httplib::Server::listen(char const*, int, int)
(this=0x5555557589f0, host=0x7fffffffdb60 "localhost", port=8089, socket_flags=0) at httplib.h:3126
#12 0x00005555556873ad in hcv_webserver_run() () at hcv_web.cc:847
#13 0x00005555556643ed in main(int, char**) (argc=5, argv=0x7fffffffe438) at hcv_main.cc:1212
(gdb) c
Continuing.
Thread 1 "helpcovid" hit Breakpoint 3, std::thread::~thread (this=0x555555785758, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:138
138 if (joinable())
(gdb) bt
#0 std::thread::~thread() (this=0x555555785758, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:138
#1 0x000055555568fd49 in std::_Destroy<std::thread>(std::thread*) (__pointer=0x555555785758) at /usr/include/c++/9/bits/stl_construct.h:98
#2 0x000055555568f2b4 in std::_Destroy_aux<false>::__destroy<std::thread*>(std::thread*, std::thread*)
(__first=0x555555785758, __last=0x555555785760) at /usr/include/c++/9/bits/stl_construct.h:108
#3 0x000055555568e624 in std::_Destroy<std::thread*>(std::thread*, std::thread*) (__first=0x555555785750, __last=0x555555785760)
at /usr/include/c++/9/bits/stl_construct.h:137
#4 0x000055555568d165 in std::_Destroy<std::thread*, std::thread>(std::thread*, std::thread*, std::allocator<std::thread>&)
(__first=0x555555785750, __last=0x555555785760) at /usr/include/c++/9/bits/stl_construct.h:206
#5 0x000055555568c0b1 in std::vector<std::thread, std::allocator<std::thread> >::~vector() (this=0x55555576eb18, __in_chrg=<optimized out>)
at /usr/include/c++/9/bits/stl_vector.h:677
#6 0x0000555555690ecc in httplib::ThreadPool::~ThreadPool() (this=0x55555576eb10, __in_chrg=<optimized out>) at httplib.h:388
#7 0x0000555555690ef4 in httplib::ThreadPool::~ThreadPool() (this=0x55555576eb10, __in_chrg=<optimized out>) at httplib.h:388
#8 0x000055555568de26 in std::default_delete<httplib::TaskQueue>::operator()(httplib::TaskQueue*) const
(this=0x7fffffffd8e8, __ptr=0x55555576eb10) at /usr/include/c++/9/bits/unique_ptr.h:81
#9 0x000055555568c9c8 in std::unique_ptr<httplib::TaskQueue, std::default_delete<httplib::TaskQueue> >::~unique_ptr()
(this=0x7fffffffd8e8, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/unique_ptr.h:292
#10 0x000055555568b753 in httplib::Server::listen_internal() (this=0x5555557589f0) at httplib.h:3481
#11 0x000055555568b290 in httplib::Server::listen(char const*, int, int)
(this=0x5555557589f0, host=0x7fffffffdb60 "localhost", port=8089, socket_flags=0) at httplib.h:3126
#12 0x00005555556873ad in hcv_webserver_run() () at hcv_web.cc:847
#13 0x00005555556643ed in main(int, char**) (argc=5, argv=0x7fffffffe438) at hcv_main.cc:1212
(gdb) c
Continuing.
/home/basile/helpcovid/helpcovid[3018484]: hcv_web.cc:848 - !! end hcv_webserver_run webhost=localhost webport=8089
-: Invalid argument
/home/basile/helpcovid/helpcovid[3018484]: hcv_main.cc:1216 - !! normal end of /home/basile/helpcovid/helpcovid
Thread 1 "helpcovid" hit Breakpoint 3, std::thread::~thread (this=0x5555556fd1a0 <hcv_bgthread>, __in_chrg=<optimized out>)
at /usr/include/c++/9/thread:138
138 if (joinable())
(gdb) bt
#0 std::thread::~thread() (this=0x5555556fd1a0 <hcv_bgthread>, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:138
#1 0x00007ffff73fce27 in __run_exit_handlers
(status=0, listp=0x7ffff757b718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
#2 0x00007ffff73fcfda in __GI_exit (status=<optimized out>) at exit.c:139
#3 0x00007ffff73e5e12 in __libc_start_main (main=
0x5555556632a7 <main(int, char**)>, argc=5, argv=0x7fffffffe438, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe428) at ../csu/libc-start.c:342
#4 0x000055555560c73a in _start ()
(gdb) c
Continuing.
terminate called without an active exception
Thread 1 "helpcovid" hit Breakpoint 1, __GI_abort () at abort.c:49
49 abort.c: No such file or directory.
(gdb) bt
#0 __GI_abort () at abort.c:49
#1 0x00007ffff779c80c in () at /lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00007ffff77a78f6 in () at /lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007ffff77a7961 in () at /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x000055555568a359 in std::thread::~thread() (this=0x5555556fd1a0 <hcv_bgthread>, __in_chrg=<optimized out>) at /usr/include/c++/9/thread:139
#5 0x00007ffff73fce27 in __run_exit_handlers
(status=0, listp=0x7ffff757b718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
#6 0x00007ffff73fcfda in __GI_exit (status=<optimized out>) at exit.c:139
#7 0x00007ffff73e5e12 in __libc_start_main (main=
0x5555556632a7 <main(int, char**)>, argc=5, argv=0x7fffffffe438, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe428) at ../csu/libc-start.c:342
#8 0x000055555560c73a in _start ()
(gdb) c
Continuing.
Thread 1 "helpcovid" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
I am very tempted to give up using httplib.h; for our case, it seems unreliable.
This problem happens because you don't call join for hcv_bgthread. That's why the log shows the following:
138 if (joinable())
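To illustrate the point about join: the destructor of a std::thread that is still joinable calls std::terminate, which matches the "terminate called without an active exception" line in the gdb session. A thread whose function has already returned is still joinable until join or detach is called on it. Below is a minimal sketch of one possible workaround, a small wrapper that joins in its destructor. The names JoinOnExit, work_done and run_background_and_join are hypothetical and are not taken from HelpCovid.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Hypothetical RAII wrapper: joins the owned thread on destruction,
// so ~std::thread never sees a joinable thread.
class JoinOnExit {
public:
    JoinOnExit() = default;
    explicit JoinOnExit(std::thread t) : thr_(std::move(t)) {}
    JoinOnExit(const JoinOnExit&) = delete;
    JoinOnExit& operator=(const JoinOnExit&) = delete;
    ~JoinOnExit() {
        if (thr_.joinable()) thr_.join();
    }
private:
    std::thread thr_;
};

// Hypothetical stand-in for the work done by a background thread.
std::atomic<int> work_done{0};

// Starts a short-lived thread, lets the guard join it, and
// returns the counter value observed after the join.
int run_background_and_join() {
    {
        JoinOnExit guard{std::thread([] { work_done.fetch_add(1); })};
    }   // guard is destroyed here and joins the thread
    return work_done.load();
}
```

For a global such as hcv_bgthread, an explicit join before main returns may be simpler than a wrapper. Either way, the join blocks until the thread function has returned, so the thread needs a way to be told to stop.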
In commit f5a5e814be41922173f3a77fab8c6 the SIGTERM signal is not handled as it should be by the signalfd(2) facilities of the C++ file hcv_background.cc.
To exercise the bug after suitable configuration, run
then (in some other terminal)
kill $(cat /tmp/helpcovid.pid)
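For comparison with hcv_background.cc, here is a minimal sketch of the usual signalfd(2) pattern. It is not the HelpCovid code, and the names make_signal_fd and read_signal are hypothetical. The assumption that matters in a multi-threaded program is that the signals get blocked before any other thread is created, so every thread inherits the mask. Otherwise the kernel may deliver SIGTERM to a thread that has not blocked it, and the default action kills the process before the descriptor becomes readable.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <poll.h>
#include <sys/signalfd.h>
#include <unistd.h>

// Block the signals in the calling thread, then open a signalfd for them.
// Call this before starting other threads so they inherit the mask.
// Returns the file descriptor, or -1 on error.
int make_signal_fd(const sigset_t& set) {
    if (pthread_sigmask(SIG_BLOCK, &set, nullptr) != 0) return -1;
    return signalfd(-1, &set, SFD_CLOEXEC | SFD_NONBLOCK);
}

// Wait up to timeout_ms milliseconds for a signal on sfd.
// Returns the signal number, 0 on timeout, -1 on error.
int read_signal(int sfd, int timeout_ms) {
    assert(sfd >= 0);
    struct pollfd pfd{sfd, POLLIN, 0};
    int n;
    do { n = poll(&pfd, 1, timeout_ms); } while (n < 0 && errno == EINTR);
    if (n <= 0) return n;
    struct signalfd_siginfo si{};
    ssize_t r = read(sfd, &si, sizeof si);
    if (r != static_cast<ssize_t>(sizeof si)) return -1;
    return static_cast<int>(si.ssi_signo);
}
```

In a program shaped like this one, the thread that polls the descriptor would only request shutdown, and the main thread would then join it before the static destructors run.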