Closed slfritchie closed 10 years ago
"LOG" file contents:
2014/03/28-17:34:55.454901 b1873000 Options.comparator: leveldb.InternalKeyComparator
2014/03/28-17:34:55.454930 b1873000 Options.create_if_missing: 1
2014/03/28-17:34:55.454935 b1873000 Options.error_if_exists: 0
2014/03/28-17:34:55.454939 b1873000 Options.paranoid_checks: 0
2014/03/28-17:34:55.454942 b1873000 Options.verify_compactions: 1
2014/03/28-17:34:55.454946 b1873000 Options.env: 0x7fe23b8009c0
2014/03/28-17:34:55.454949 b1873000 Options.info_log: 0x7fe23b804da0
2014/03/28-17:34:55.454953 b1873000 Options.write_buffer_size: 16777216
2014/03/28-17:34:55.454956 b1873000 Options.max_open_files: 1000
2014/03/28-17:34:55.454959 b1873000 Options.block_cache: 0x7fe23c006e00
2014/03/28-17:34:55.454962 b1873000 Options.block_size: 4096
2014/03/28-17:34:55.454966 b1873000 Options.block_size_steps: 16
2014/03/28-17:34:55.454969 b1873000 Options.block_restart_interval: 16
2014/03/28-17:34:55.454972 b1873000 Options.compression: 0
2014/03/28-17:34:55.454975 b1873000 Options.filter_policy: NULL
2014/03/28-17:34:55.454979 b1873000 Options.is_repair: false
2014/03/28-17:34:55.454982 b1873000 Options.is_internal_db: false
2014/03/28-17:34:55.454985 b1873000 Options.total_leveldb_mem: 0
2014/03/28-17:34:55.454988 b1873000 Options.block_cache_threshold: 33554432
2014/03/28-17:34:55.454992 b1873000 Options.limited_developer_mem: false
2014/03/28-17:34:55.454995 b1873000 Options.delete_threshold: 1000
2014/03/28-17:34:55.454998 b1873000 Options.fadvise_willneed: false
2014/03/28-17:34:55.455001 b1873000 crc32c: hardware
2014/03/28-17:34:55.455005 b1873000 File cache size: 278921216
2014/03/28-17:34:55.455008 b1873000 Block cache size: 281018368
2014/03/28-17:34:55.465354 b1873000 Delete type=3 #1
2014/03/28-17:34:55.465457 b1873000 UpdatePenalty: 0
2014/03/28-17:34:55.493983 b1873000 UpdatePenalty: 0
2014/03/28-17:34:55.494453 b3abc000 Level-0 table #5: started
2014/03/28-17:34:55.530660 b1873000 waiting 2...
2014/03/28-17:34:55.530681 b1873000 UpdatePenalty: 0
2014/03/28-17:34:55.591396 b3abc000 Level-0 table #5: 16847612 bytes, 170 keys IO error: /tmp/generic.qc/sst_0/000005.sst: No space left on device
2014/03/28-17:34:55.591427 b3abc000 Waiting after background imm compaction error: IO error: /tmp/generic.qc/sst_0/000005.sst: No space left on device
2014/03/28-17:34:55.591462 b1873000 running 2...
2014/03/28-17:34:55.591477 b1873000 waiting 2...
2014/03/28-17:34:55.591483 b1873000 UpdatePenalty: 0
2014/03/28-17:34:56.592507 b3abc000 UpdatePenalty: 0
2014/03/28-17:34:56.592592 b1873000 running 2...
2014/03/28-17:34:56.592617 b1873000 waiting 2...
2014/03/28-17:34:56.592635 b1873000 UpdatePenalty: 0
If I remove an exit() call on the Erlang side, I see a deadlock involving only 1 thread: at the same place as above, the leveldb::DBImpl::MakeRoomForWrite (this=0x7fd9a3100a80, force=<value temporarily unavailable, due to optimizations>) at db/db_impl.cc:1836
thread.
Does @matthewvon How do you reproduce the issue @slfritchie ?
I'm fighting QuickCheck to get a somewhat-deterministic case.
Don't waste your time making a deterministic case. I see what is happening … shutdown with pending writes and full disk. Compaction knows to stop attempting compactions, but write thread does not know to quit waiting upon compactions.
Add:
if (!shuttingdown.Acquire_Load())
before each of the bgcv.Wait() calls in MakeRoomForWrite() … and all will be fine.
Matthew, I suspect that there's a larger problem, and the 2-thread case above got caught in it?
The larger problem appears to be that a eleveldb_thread_pool
thread appears to get stuck in leveldb::DBImpl::Write
and leveldb::DBImpl::MakeRoomForWrite
while all other threads appear sitting in their main condvar wait loops? {shrug} It's quite easy and quick to reproduce a case of this, where close()
is not a factor.
I've added a new command to the QuickCheck model that will write 10s/100s/1000s of keys sequentially (without other model operations interleaved). The new command is called 'put_filler'. An example test case looks like:
[{set,{var,1},
{call,generic_qc_fsm,set_keys,
[[<<249,57,0,0,0,0>>],{var,parameter_test_dir}]}},
{set,{var,2},
{call,generic_qc_fsm,open,
[{var,parameter_test_dir},
[read_write,
{open_timeout,0},
{max_file_size,100},
{sync_strategy,none},
{create_if_missing,true},
{limited_developer_mem,true},
{compression,false},
{write_buffer_size,16000000}]]}},
{set,{var,3},
{call,generic_qc_fsm,put_filler,[{var,2},{1273,<<"]+">>},34084]}},
{set,{var,4},
{call,generic_qc_fsm,put,[{var,2},<<249,57,0,0,0,0>>,<<144,2>>]}},
{set,{var,5},
{call,generic_qc_fsm,put_filler,
[{var,2},{2462,<<17,106>>},113163]}},
{set,{var,6},
{call,generic_qc_fsm,put,[{var,2},<<249,57,0,0,0,0>>,<<"@">>]}},
{set,{var,7},{call,generic_qc_fsm,get,[{var,2},<<249,57,0,0,0,0>>]}},
{set,{var,8},
{call,generic_qc_fsm,put_filler,[{var,2},{2853,<<23>>},33185]}},
{set,{var,9},{call,generic_qc_fsm,delete,[{var,2},<<249,57,0,0,0,0>>]}}]
Without fault injection, this kind of test case has no problem: I can run for hours without a single problem. However, it's impossible to go for 1 minute without a problem caused by the put_filler
command getting stuck. In the above case, the first filler command writes 1273 keys of with values of 34084 bytes. That command worked just fine. The 2nd filler tries to write 2462 keys with values of 113163 bytes and gets stuck.
Stack traces from all threads is available at https://gist.github.com/slfritchie/fb238f5c48ff57788507
The "LOG" file looks like this after getting stuck:
2014/04/01-12:03:13.225374 b1873000 Options.comparator: leveldb.InternalKeyComparator
2014/04/01-12:03:13.225404 b1873000 Options.create_if_missing: 1
2014/04/01-12:03:13.225409 b1873000 Options.error_if_exists: 0
2014/04/01-12:03:13.225413 b1873000 Options.paranoid_checks: 0
2014/04/01-12:03:13.225416 b1873000 Options.verify_compactions: 1
2014/04/01-12:03:13.225420 b1873000 Options.env: 0x7ff70bb00460
2014/04/01-12:03:13.225423 b1873000 Options.info_log: 0x7ff70ba00740
2014/04/01-12:03:13.225427 b1873000 Options.write_buffer_size: 16000000
2014/04/01-12:03:13.225430 b1873000 Options.max_open_files: 1000
2014/04/01-12:03:13.225434 b1873000 Options.block_cache: 0x7ff711800e00
2014/04/01-12:03:13.225437 b1873000 Options.block_size: 4096
2014/04/01-12:03:13.225440 b1873000 Options.block_size_steps: 16
2014/04/01-12:03:13.225444 b1873000 Options.block_restart_interval: 16
2014/04/01-12:03:13.225447 b1873000 Options.compression: 0
2014/04/01-12:03:13.225450 b1873000 Options.filter_policy: NULL
2014/04/01-12:03:13.225454 b1873000 Options.is_repair: false
2014/04/01-12:03:13.225457 b1873000 Options.is_internal_db: false
2014/04/01-12:03:13.225460 b1873000 Options.total_leveldb_mem: 0
2014/04/01-12:03:13.225463 b1873000 Options.block_cache_threshold: 33554432
2014/04/01-12:03:13.225467 b1873000 Options.limited_developer_mem: false
2014/04/01-12:03:13.225470 b1873000 Options.delete_threshold: 1000
2014/04/01-12:03:13.225473 b1873000 Options.fadvise_willneed: false
2014/04/01-12:03:13.225477 b1873000 crc32c: hardware
2014/04/01-12:03:13.225480 b1873000 File cache size: 280475648
2014/04/01-12:03:13.225484 b1873000 Block cache size: 282572800
2014/04/01-12:03:13.228286 b1873000 Delete type=3 #1
2014/04/01-12:03:13.228381 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.256861 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.256888 b3abc000 Level-0 table #5: started
2014/04/01-12:03:13.293151 b1873000 waiting 2...
2014/04/01-12:03:13.293172 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.357334 b3abc000 Level-0 table #5: 16013598 bytes, 469 keys OK
2014/04/01-12:03:13.385372 b3abc000 Delete type=0 #3
2014/04/01-12:03:13.385515 b3abc000 UpdatePenalty: 0
2014/04/01-12:03:13.385562 b1873000 running 2...
2014/04/01-12:03:13.385825 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.385888 b3abc000 Level-0 table #7: started
2014/04/01-12:03:13.421676 b1873000 waiting 2...
2014/04/01-12:03:13.421701 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.524520 b3abc000 Level-0 table #7: 16013598 bytes, 469 keys OK
2014/04/01-12:03:13.526535 b3abc000 Delete type=0 #4
2014/04/01-12:03:13.526658 b3abc000 UpdatePenalty: 0
2014/04/01-12:03:13.526731 b1873000 running 2...
2014/04/01-12:03:13.527142 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.527162 b3abc000 Level-0 table #11: started
2014/04/01-12:03:13.556371 b1873000 waiting 2...
2014/04/01-12:03:13.556392 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.612630 b3abc000 Level-0 table #11: 16080534 bytes, 377 keys OK
2014/04/01-12:03:13.612948 b3abc000 Delete type=0 #6
2014/04/01-12:03:13.613117 b3abc000 UpdatePenalty: 0
2014/04/01-12:03:13.613141 b1873000 running 2...
2014/04/01-12:03:13.613274 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.613290 b3abc000 Level-0 table #13: started
2014/04/01-12:03:13.642876 b1873000 waiting 2...
2014/04/01-12:03:13.642898 b1873000 UpdatePenalty: 0
2014/04/01-12:03:13.710699 b3abc000 Level-0 table #13: 16077771 bytes, 142 keys IO error: /tmp/generic.qc.1396.321393.224161/sst_0/000013.sst: Input/output error
2014/04/01-12:03:13.710725 b3abc000 Waiting after background imm compaction error: IO error: /tmp/generic.qc.1396.321393.224161/sst_0/000013.sst: Input/output error
2014/04/01-12:03:13.710759 b1873000 running 2...
2014/04/01-12:03:13.710774 b1873000 waiting 2...
2014/04/01-12:03:13.710781 b1873000 UpdatePenalty: 0
2014/04/01-12:03:14.711505 b3abc000 UpdatePenalty: 0
2014/04/01-12:03:14.711628 b1873000 running 2...
2014/04/01-12:03:14.711643 b1873000 waiting 2...
2014/04/01-12:03:14.711650 b1873000 UpdatePenalty: 0
Drat, I forgot to mention that the comment immediately above also has your if
patch suggestion.
diff --git a/db/db_impl.cc b/db/db_impl.cc
index a67b1b3..23c9ffc 100644
--- a/db/db_impl.cc
+++ b/db/db_impl.cc
@@ -1833,14 +1833,14 @@ Status DBImpl::MakeRoomForWrite(bool force) {
Log(options_.info_log, "waiting 2...\n");
gPerfCounters->Inc(ePerfWriteWaitImm);
MaybeScheduleCompaction();
- bg_cv_.Wait();
+if (!shutting_down_.Acquire_Load()) bg_cv_.Wait();
Log(options_.info_log, "running 2...\n");
} else if (versions_->NumLevelFiles(0) >= config::kL0_StopWritesTrigger) {
// There are too many level-0 files.
Log(options_.info_log, "waiting...\n");
gPerfCounters->Inc(ePerfWriteWaitLevel0);
MaybeScheduleCompaction();
- bg_cv_.Wait();
+if (!shutting_down_.Acquire_Load()) bg_cv_.Wait();
Log(options_.info_log, "running...\n");
} else {
// Attempt to switch to a new memtable and trigger compaction of old
If I inject faults only into pread(2) at a 7%, I can get the same hang inside leveldb::DBImpl::MakeRoomForWrite
. Ditto for only ftruncate(2)
at 25% random rate. Using 50% rate for rename(2)
and unlink(2)
and unlinkat(2)
seem to have no effect (or waiting 5 minutes isn't long enough).
Ditto for only open(2)
at 10% random rate with errno=ENOSPC.
Failing access(2)
and stat(2)
and flock(2)
and fcntl(2)
at 50% appears to have no effect.
To recreate on OS X:
mkdir /tmp/e
cd /tmp/e
git clone git@github.com:basho/eleveldb.git
cd eleveldb
git checkout -f 850668ffce63e96d3ee1e4350c1a994711611fae
make ; rebar skip_deps=true eunit suites=NONE ; erlc -o deps/faulterl/ebin -I deps/faulterl/include priv/scenario/*erl ; deps/faulterl/ebin/make_intercept_c.escript trigger_commonpaths yo ; env `deps/faulterl/ebin/example_environment.sh $PWD/yo` erl -pz .eunit deps/*/ebin
Then:
eqc:quickcheck(eqc:testing_time(1, generic_qc_fsm:prop(false, false))).
eqc:quickcheck(eqc:testing_time(3*60, generic_qc_fsm:prop(true, false))).
The string <f
marks the start of a filler operation and >
marks its completion. If you don't see a completion, and disk I/O rates have dropped to zero, you're stuck.
Did you wait a full 60 seconds to see if errors unblocked? Level db holds the failed compaction thread for60 seconds. That holds writes on conditions too.
Wow, that's a cool feature. Yes, I'm pretty certain that I've gone to lunch and hand the hang remain when I returned. I haven't yet learned fully to eat at Tokyo Speed(™), so I believe that I've exceeded the 60 second wow-that's-cool-latency-waiting-for-the-referee-to-ring-the-bell-for-the-next-round period.
Yup, more than 60 seconds.
2> time().
{20,12,21}
8:12:21 pm
3> eqc:quickcheck(eqc:testing_time(3*60, generic_qc_fsm:prop(true, false))).
.{.{<f (hang here......)
(press control-g)
User switch command
--> s
--> c
Eshell V5.10.3 (abort with ^G)
1> time().
{20,21,26}
8:21:26pm
leveldb branch mv-imm-retry
+1, many thanks.
eleveldb revision: 639f69c96757cc5ddaf2bbacc0ef10a6e405c7bf leveldb revision: d553b6c4bf525c0301b3d2f128be0a63df930ee4
I see a deadlock that immobilizes the entire scheduler thread. The first suspicious stacktrace is:
And the second: