chrsmithdemos / leveldb

Automatically exported from code.google.com/p/leveldb
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Background compaction may cause unnecessary 1s sleep during shutdown #125

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
If background compaction has already started when the database is shutting down 
(i.e. the DBImpl object destructor is called), 

Here's the sequence of events:

* Somewhere: bg_compaction_scheduled_ is set
* BG Thread: BackgroundCall is invoked
* BG Thread: BackgroundCall acquires mutex
* BG Thread: BackgroundCall proceeds as shutting_down_ is false, invokes 
BackgroundCompaction
* Main thread: ~DBImpl invoked
* Main thread: ~DBImpl blocks on mutex
* BG Thread: DoCompactionWork releases mutex ("Release mutex while we're 
actually doing the compaction work")
* Main thread: ~DBImpl proceeds, sets shutting_down_
* Main thread: ~DBImpl blocks on bg_cv_ since bg_compaction_scheduled_
* BG Thread: DoCompactionWork tests shutting_down_, sets status to IOError
* BG Thread: BackgroundCall sees !s.ok()
* BG Thread: BackgroundCall signals bg_cv_
* Main thread: ~DBImpl wakes up, but since bg_compaction_scheduled_ blocks on 
bg_cv_ again
* BG Thread: BackgroundCall sleeps for 1000000 microseconds (fearing transient 
I/O error)
* BG Thread: BackgroundCall clears bg_compaction_scheduled_, calls 
MaybeScheduleCompaction
* BG Thread: MaybeScheduleCompaction exits since shutting_down_ is set
* Main thread: ~DBImpl completes

Possible patch:

+++ b/db/db_impl.cc
@@ -609,7 +609,7 @@ void DBImpl::BackgroundCall() {
   assert(bg_compaction_scheduled_);
   if (!shutting_down_.Acquire_Load()) {
     Status s = BackgroundCompaction();
-    if (!s.ok()) {
+    if (!s.ok() && !shutting_down_.Acquire_Load()) {
       // Wait a little bit before retrying background compaction in
       // case this is an environmental problem and we do not want to
       // chew up resources for failed compactions for the duration of

Original issue reported on code.google.com by jsbell@chromium.org on 8 Oct 2012 at 4:14

GoogleCodeExporter commented 9 years ago
This is causing test timeouts in Chrome and WebKit:

https://bugs.webkit.org/show_bug.cgi?id=65862

Original comment by jsbell@chromium.org on 8 Oct 2012 at 4:18

GoogleCodeExporter commented 9 years ago
Fixed in 1.6.0.

Original comment by san...@google.com on 12 Oct 2012 at 7:53