Expensify / Bedrock

Rock solid distributed database specializing in active/active automatic failover and WAN replication
https://bedrockdb.com
GNU Lesser General Public License v3.0
1.12k stars 85 forks source link

Crash when sysbench disconnects #69

Open willfong opened 7 years ago

willfong commented 7 years ago

Hi!

I've been testing bedrock with sysbench (just want to see MySQL compatibility), and noticed it would always crash when I Ctrl+C sysbench.

Bedrock version: 43234709c216b19ae7ce7ea7a5219467bf50e773

Test case:

root@ubuntu2:~# service bedrock start
 * Starting Expensify Bedrock Server bedrock                                                                                                                      [ OK ] 
root@ubuntu2:~# ps aux|grep bedrock
root      4278  0.0  0.0 399492  7604 ?        Ssl  15:25   0:00 /usr/sbin/bedrock -fork -nodeName bedrock -db /var/lib/bedrock/bedrock.db -serverHost 0.0.0.0:8888 -nodeHost 0.0.0.0:8889 -priority 200 -pidfile /var/run/bedrock.pid -quorumCheckpoint 100 -readThreads 4 -plugins status,db,jobs,cache,mysql -v -cache 10001
root      4287  0.0  0.0  11748  2128 pts/2    S+   15:25   0:00 grep --color=auto bedrock
root@ubuntu2:~# bedrock -version
43234709c216b19ae7ce7ea7a5219467bf50e773
root@ubuntu2:~# timeout 2 sysbench/sysbench/sysbench --test=sysbench/sysbench/tests/db/oltp.lua --mysql-user=root --mysql-password= --mysql-db=test --mysql-table-engine=innodb --mysql-ignore-duplicates=on --oltp-read-only=off --oltp-dist-type=uniform --oltp-skip-trx=on --oltp-auto-inc=off --init-rng=on --oltp-test-mode=complex --max-requests=0 --report-interval=1 --num-threads=8 --mysql-host=127.0.0.1 --mysql-port=3306 --oltp-table-size=5000000 --oltp-tables-count=4 --max-time=600 run > bench.log 2>&1
root@ubuntu2:~# ps aux|grep bedrock
root      4302  0.0  0.0  11748  2240 pts/2    S+   15:26   0:00 grep --color=auto bedrock
root@ubuntu2:~#

Stacktrace:

Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SSignal.cpp:62) SSendSignal [main] [eror] Got SIGSEGV, logging stack trace.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/sbin/bedrock(_Z14SLogStackTracev+0x24) [0x59ec29]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/sbin/bedrock(_Z11SSendSignali+0x48e) [0x59dc86]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /lib/x86_64-linux-gnu/libc.so.6(+0x36cb0) [0x7f45837dccb0]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZNSs6appendERKSs+0x1c) [0x7f4583e3bf2c]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/sbin/bedrock(_ZN11STCPManager6Socket4sendERKSs+0x27) [0x5a51bd]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/sbin/bedrock(_ZN19BedrockPlugin_MySQL21onPortRequestCompleteERK5SDataPN11STCPManager6SocketE+0xae7) [0x5c10fd]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/sbin/bedrock(_ZN13BedrockServer10postSelectERSt3mapIi6pollfdSt4lessIiESaISt4pairIKiS1_EEERm+0x5608) [0x5328ea]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/sbin/bedrock(main+0x2b01) [0x5ae993]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f45837c7f45]
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SLog.cpp:16) SLogStackTrace [main] [warn] /usr/sbin/bedrock() [0x475915]

Right before the stack trace, minus some test data:

Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (STCPManager.cpp:255) closeSocket [main] [dbug] Closing socket '127.0.0.1:38330'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (STCPManager.cpp:255) closeSocket [main] [dbug] Closing socket '127.0.0.1:38332'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:38) BedrockServer_WorkerThread_ProcessDirectMessages [read1] [info] Processing direct message 'CANCEL_REQUEST'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (STCPManager.cpp:255) closeSocket [main] [dbug] Closing socket '127.0.0.1:38334'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:38) BedrockServer_WorkerThread_ProcessDirectMessages [read0] [info] Processing direct message 'CANCEL_REQUEST'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:38) BedrockServer_WorkerThread_ProcessDirectMessages [read2] [info] Processing direct message 'CANCEL_REQUEST'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:38) BedrockServer_WorkerThread_ProcessDirectMessages [write0] [info] Processing direct message 'CANCEL_REQUEST'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:54) BedrockServer_WorkerThread_ProcessDirectMessages [read2] [info] No need to cancel request #3381 because not queued, ignoring.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:54) BedrockServer_WorkerThread_ProcessDirectMessages [write0] [info] No need to cancel request #3381 because not queued, ignoring.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:54) BedrockServer_WorkerThread_ProcessDirectMessages [read1] [info] No need to cancel request #3381 because not queued, ignoring.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:54) BedrockServer_WorkerThread_ProcessDirectMessages [read0] [info] No need to cancel request #3381 because not queued, ignoring.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockNode.cpp:97) _peekCommand [read3] [info] Plugin 'DB' peeked command 'Query'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockNode.cpp:116) _peekCommand [read3] [info] Responding '200 OK' to read-only 'Query'.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SQLiteNode.cpp:292) openCommand [read3] [info] {bedrock/MASTERING} Processed peekable command 'Query' (bedrock#619)
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SQLiteNode.cpp:342) getProcessedCommand [read3] [info] {bedrock/MASTERING} Returning processed command 'Query' (bedrock#619)
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:137) BedrockServer_WorkerThread [read3] [info] Peek successful. Putting command 'bedrock#619' on processed list.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (SQLiteNode.cpp:405) closeCommand [read3] [dbug] {bedrock/MASTERING} Closing command 'bedrock#619'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:38) BedrockServer_WorkerThread_ProcessDirectMessages [read3] [info] Processing direct message 'CANCEL_REQUEST'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:54) BedrockServer_WorkerThread_ProcessDirectMessages [read3] [info] No need to cancel request #3381 because not queued, ignoring.
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:634) postSelect [main] [info] Processed command 'Query' #3381 (result '200 OK') from 'internal' in 133=0+133 ms
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (BedrockServer.cpp:656) postSelect [main] [info] Plugin 'MySQL' handling response '200 OK' to request 'Query'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (libstuff.cpp:1380) _SParseJSONObject [main] [dbug] Parsed: 'headers':'["c"]'
Dec 22 15:26:10 ubuntu2 bedrock: xxxxx (libstuff.cpp:1380) _SParseJSONObject [main] [dbug] Parsed: 'rows':'[["01744209700-28851195920-33965673137-73349934359-0917788...<snip>...

The full log (13MB compressed to 700KB), minus the sysbench data, can be found here.

Thanks, -will

quinthar commented 7 years ago

Tips for fixing:

Any volunteers?

quinthar commented 7 years ago

FYI, added an Upwork project here if anybody wants to fix this and earn a quick $500: https://www.upwork.com/jobs/~0117b61ddf80095260

quinthar commented 7 years ago

This should be fixed in HEAD; reassigning to @righdforsa until deployed.