facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0
28.52k stars 6.3k forks source link

coredump when multithread call delete range and use allow_concurrent_memtable_write = true #5977

Closed T0mmyliu closed 3 years ago

T0mmyliu commented 4 years ago

I have two threads which can call delete_range, and I use allow_concurrent_memtable_write = true, and coredump at post_process_info->num_entries (memtable line 551), the post_process_info is nullptr. I change allow_concurrent_memtable_write to fasle or use lock to limit only one thread can delete range, tt won't core.

I use tag: v6.1.2

T0mmyliu commented 4 years ago

You can use this code to reproduce: case not_allow_concurrent_memtable_write will succ case allow_concurrent_memtable_write will coredump

TEST(Basic, not_allow_concurrent_memtable_write)
{
  rocksdb::DB* db;
  rocksdb::Options options;
  options.create_if_missing = true;
  options.allow_concurrent_memtable_write = false;

  char* tmpfile;
  tmpfile = tmpnam(nullptr);

  rocksdb::DB::Open(options, tmpfile, &db);

  std::thread t1(DeleteRange, db);
  std::thread t2(DeleteRange, db);

  sleep(3);
  stop = true;
  t1.join();
  t2.join();
  stop = false;
}

TEST(Basic, allow_concurrent_memtable_write)
{
  rocksdb::DB* db;
  rocksdb::Options options;
  options.create_if_missing = true;
  options.allow_concurrent_memtable_write = true;

  char* tmpfile;
  tmpfile = tmpnam(nullptr);

  rocksdb::DB::Open(options, tmpfile, &db);

  std::thread t1(DeleteRange, db);
  std::thread t2(DeleteRange, db);

  sleep(3);
  stop = true;
  t1.join();
  t2.join();
  stop = false;
}
T0mmyliu commented 4 years ago
coredump info:
#0  0x00000000004d2316 in rocksdb::MemTable::Add (this=this@entry=0x15fed50, s=2, type=type@entry=rocksdb::kTypeRangeDeletion, key=..., value=..., oplog_sn=1, allow_concurrent=true, post_process_info=0x0) at db/memtable.cc:551
#1  0x00000000005183c2 in rocksdb::MemTableInserter::DeleteImpl (delete_type=rocksdb::kTypeRangeDeletion, value=..., key=..., this=0x7f4fefffde10) at db/write_batch.cc:1082
#2  rocksdb::MemTableInserter::DeleteRangeCF (this=0x7f4fefffde10, column_family_id=0, begin_key=..., end_key=...) at db/write_batch.cc:1459
#3  0x000000000050fb94 in rocksdb::WriteBatch::Iterate (Python Exception <type 'exceptions.NameError'> Installation error: gdb.execute_unwinders function is missing:
this=0x7f4fefffe5c0, handler=0x7f4fefffde10) at db/write_batch.cc:481
#4  0x0000000000513176 in rocksdb::WriteBatchInternal::InsertInto (Python Exception <type 'exceptions.NameError'> Installation error: gdb.execute_unwinders function is missing:
Python Exception <type 'exceptions.NameError'> Installation error: gdb.execute_unwinders function is missing:
writer=writer@entry=0x7f4fefffe1b0, sequence=<optimized out>, memtables=memtables@entry=0x7f4fefffe290, flush_scheduler=flush_scheduler@entry=0x15ee768,
    ignore_missing_column_families=<optimized out>, log_number=0, db=0x15eda50, concurrent_memtable_writes=true, lsn=0, seq_per_batch=false, batch_cnt=0, batch_per_txn=true) at db/write_batch.cc:1803
#5  0x00000000004c59e4 in rocksdb::DBImpl::WriteImpl (this=0x15eda50, write_options=..., my_batch=0x7f4fefffe5c0, callback=<optimized out>, log_used=0x0, log_ref=0, disable_memtable=false, seq_used=0x0, batch_cnt=0,
    pre_release_callback=0x0) at db/db_impl_write.cc:139
#6  0x00000000004c7b41 in rocksdb::DBImpl::Write (Python Exception <type 'exceptions.NameError'> Installation error: gdb.execute_unwinders function is missing:
this=<optimized out>, write_options=..., my_batch=<optimized out>) at db/db_impl_write.cc:55
#7  0x00000000004c7ea5 in rocksdb::DB::DeleteRange (Python Exception <type 'exceptions.NameError'> Installation error: gdb.execute_unwinders function is missing:
this=0x15eda50, opt=..., column_family=0x15f9710, begin_key=..., end_key=...) at db/db_impl_write.cc:1616
#8  0x000000000046eb4f in DeleteRange (Python Exception <type 'exceptions.NameError'> Installation error: gdb.execute_unwinders function is missing:
db=0x15eda50) at data_server/test/rocksdb_test.cpp:22
T0mmyliu commented 4 years ago

This code is not thread safe

  MemPostInfoMap& GetPostMap() {
    assert(concurrent_memtable_writes_);
    if(!post_info_created_) {
      new (&mem_post_info_map_) MemPostInfoMap();
      post_info_created_ = true;
    }
    return *reinterpret_cast<MemPostInfoMap*>(&mem_post_info_map_);
  }
riversand963 commented 4 years ago

This code is not thread safe

  MemPostInfoMap& GetPostMap() {
    assert(concurrent_memtable_writes_);
    if(!post_info_created_) {
      new (&mem_post_info_map_) MemPostInfoMap();
      post_info_created_ = true;
    }
    return *reinterpret_cast<MemPostInfoMap*>(&mem_post_info_map_);
  }

GetPostMap() is a non-public member function of MemTableInserter. There are several places where we construct MemTableInserter objects.

In these places, I cannot find multiple threads accessing the same MemTableInserter object.

I think the parallelism is in the WriteImpl where multiple threads can call InsertInto concurrently. However, when it goes down into InsertInto, each thread has its own MemTableInserter object.

Based on the above, could you elaborate a scenario in which GetPostMap is thread-unsafe?

Furthermore, the code example you post seems to be missing some important details. What does DeleteRange do for each thread? What is the purpose of sleep? Where is stop defined? Is it a shared variable accessed by your DeleteRange? Once stop is true, the thread exits and returns? What is the initial data in your db? Looks like empty?

I used the following code snippet, but cannot reproduce the crash you reported.

diff --git a/db/db_test2.cc b/db/db_test2.cc
index 368ebe936..7e2e21adb 100644
--- a/db/db_test2.cc
+++ b/db/db_test2.cc
@@ -4177,6 +4177,21 @@ TEST_F(DBTest2, CrashInRecoveryMultipleCF) {
                                           options));
   }
 }
+
+TEST_F(DBTest2, DeleteRangeRace) {
+  Options options = CurrentOptions();
+  options.create_if_missing = true;
+  options.allow_concurrent_memtable_write = true;
+  DestroyAndReopen(options);
+  const auto delete_range_func = [&](const Slice& begin_key, const Slice& end_key) {
+    WriteOptions write_opts;
+    ASSERT_OK(dbfull()->DeleteRange(write_opts, dbfull()->DefaultColumnFamily(), begin_key, end_key));
+  };
+  port::Thread thr1(delete_range_func, "a", "c");
+  port::Thread thr2(delete_range_func, "b", "d");
+  thr1.join();
+  thr2.join();
+}
 }  // namespace rocksdb

This does not crash on current master or 6.1.fb.

T0mmyliu commented 4 years ago

This code is not thread safe

  MemPostInfoMap& GetPostMap() {
    assert(concurrent_memtable_writes_);
    if(!post_info_created_) {
      new (&mem_post_info_map_) MemPostInfoMap();
      post_info_created_ = true;
    }
    return *reinterpret_cast<MemPostInfoMap*>(&mem_post_info_map_);
  }

GetPostMap() is a non-public member function of MemTableInserter. There are several places where we construct MemTableInserter objects.

In these places, I cannot find multiple threads accessing the same MemTableInserter object.

I think the parallelism is in the WriteImpl where multiple threads can call InsertInto concurrently. However, when it goes down into InsertInto, each thread has its own MemTableInserter object.

Based on the above, could you elaborate a scenario in which GetPostMap is thread-unsafe?

Furthermore, the code example you post seems to be missing some important details. What does DeleteRange do for each thread? What is the purpose of sleep? Where is stop defined? Is it a shared variable accessed by your DeleteRange? Once stop is true, the thread exits and returns? What is the initial data in your db? Looks like empty?

I used the following code snippet, but cannot reproduce the crash you reported.

diff --git a/db/db_test2.cc b/db/db_test2.cc
index 368ebe936..7e2e21adb 100644
--- a/db/db_test2.cc
+++ b/db/db_test2.cc
@@ -4177,6 +4177,21 @@ TEST_F(DBTest2, CrashInRecoveryMultipleCF) {
                                           options));
   }
 }
+
+TEST_F(DBTest2, DeleteRangeRace) {
+  Options options = CurrentOptions();
+  options.create_if_missing = true;
+  options.allow_concurrent_memtable_write = true;
+  DestroyAndReopen(options);
+  const auto delete_range_func = [&](const Slice& begin_key, const Slice& end_key) {
+    WriteOptions write_opts;
+    ASSERT_OK(dbfull()->DeleteRange(write_opts, dbfull()->DefaultColumnFamily(), begin_key, end_key));
+  };
+  port::Thread thr1(delete_range_func, "a", "c");
+  port::Thread thr2(delete_range_func, "b", "d");
+  thr1.join();
+  thr2.join();
+}
 }  // namespace rocksdb

This does not crash on current master or 6.1.fb.

all my test code is

#include <gtest/gtest.h>

#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/snapshot.h"
#include <thread>

using namespace std;

bool stop = false;
void DeleteRange(rocksdb::DB* db)
{
  while(!stop) {
    rocksdb::WriteOptions delete_range_option;
    db->DeleteRange(delete_range_option, db->DefaultColumnFamily(), "1", "2");
  }
}

TEST(Basic, SingleThread)
{
  rocksdb::DB* db;
  rocksdb::Options options;
  options.create_if_missing = true;
  options.allow_concurrent_memtable_write = false;

  char* tmpfile;
  tmpfile = tmpnam(nullptr);

  rocksdb::DB::Open(options, tmpfile, &db);

  std::thread t1(DeleteRange, db);
  std::thread t2(DeleteRange, db);

  sleep(3);
  stop = true;
  t1.join();
  t2.join();
  stop = false;
}

TEST(Basic, MultiThread)
{
  rocksdb::DB* db;
  rocksdb::Options options;
  options.create_if_missing = true;
  options.allow_concurrent_memtable_write = true;

  char* tmpfile;
  tmpfile = tmpnam(nullptr);

  rocksdb::DB::Open(options, tmpfile, &db);

  std::thread t1(DeleteRange, db);
  std::thread t2(DeleteRange, db);

  sleep(3);
  stop = true;
  t1.join();
  t2.join();
  stop = false;
}

int main(int argc, char* argv[])
{
  return TestMain(argc, argv);
}
T0mmyliu commented 4 years ago

This code is not thread safe

  MemPostInfoMap& GetPostMap() {
    assert(concurrent_memtable_writes_);
    if(!post_info_created_) {
      new (&mem_post_info_map_) MemPostInfoMap();
      post_info_created_ = true;
    }
    return *reinterpret_cast<MemPostInfoMap*>(&mem_post_info_map_);
  }

GetPostMap() is a non-public member function of MemTableInserter. There are several places where we construct MemTableInserter objects.

In these places, I cannot find multiple threads accessing the same MemTableInserter object.

I think the parallelism is in the WriteImpl where multiple threads can call InsertInto concurrently. However, when it goes down into InsertInto, each thread has its own MemTableInserter object.

Based on the above, could you elaborate a scenario in which GetPostMap is thread-unsafe?

Furthermore, the code example you post seems to be missing some important details. What does DeleteRange do for each thread? What is the purpose of sleep? Where is stop defined? Is it a shared variable accessed by your DeleteRange? Once stop is true, the thread exits and returns? What is the initial data in your db? Looks like empty?

I used the following code snippet, but cannot reproduce the crash you reported.

diff --git a/db/db_test2.cc b/db/db_test2.cc
index 368ebe936..7e2e21adb 100644
--- a/db/db_test2.cc
+++ b/db/db_test2.cc
@@ -4177,6 +4177,21 @@ TEST_F(DBTest2, CrashInRecoveryMultipleCF) {
                                           options));
   }
 }
+
+TEST_F(DBTest2, DeleteRangeRace) {
+  Options options = CurrentOptions();
+  options.create_if_missing = true;
+  options.allow_concurrent_memtable_write = true;
+  DestroyAndReopen(options);
+  const auto delete_range_func = [&](const Slice& begin_key, const Slice& end_key) {
+    WriteOptions write_opts;
+    ASSERT_OK(dbfull()->DeleteRange(write_opts, dbfull()->DefaultColumnFamily(), begin_key, end_key));
+  };
+  port::Thread thr1(delete_range_func, "a", "c");
+  port::Thread thr2(delete_range_func, "b", "d");
+  thr1.join();
+  thr2.join();
+}
 }  // namespace rocksdb

This does not crash on current master or 6.1.fb.

thank you for reply, I will try to reproduce it at 6.1.fb.

T0mmyliu commented 4 years ago

This code is not thread safe

  MemPostInfoMap& GetPostMap() {
    assert(concurrent_memtable_writes_);
    if(!post_info_created_) {
      new (&mem_post_info_map_) MemPostInfoMap();
      post_info_created_ = true;
    }
    return *reinterpret_cast<MemPostInfoMap*>(&mem_post_info_map_);
  }

GetPostMap() is a non-public member function of MemTableInserter. There are several places where we construct MemTableInserter objects.

In these places, I cannot find multiple threads accessing the same MemTableInserter object.

I think the parallelism is in the WriteImpl where multiple threads can call InsertInto concurrently. However, when it goes down into InsertInto, each thread has its own MemTableInserter object.

Based on the above, could you elaborate a scenario in which GetPostMap is thread-unsafe?

Furthermore, the code example you post seems to be missing some important details. What does DeleteRange do for each thread? What is the purpose of sleep? Where is stop defined? Is it a shared variable accessed by your DeleteRange? Once stop is true, the thread exits and returns? What is the initial data in your db? Looks like empty?

I used the following code snippet, but cannot reproduce the crash you reported.

diff --git a/db/db_test2.cc b/db/db_test2.cc
index 368ebe936..7e2e21adb 100644
--- a/db/db_test2.cc
+++ b/db/db_test2.cc
@@ -4177,6 +4177,21 @@ TEST_F(DBTest2, CrashInRecoveryMultipleCF) {
                                           options));
   }
 }
+
+TEST_F(DBTest2, DeleteRangeRace) {
+  Options options = CurrentOptions();
+  options.create_if_missing = true;
+  options.allow_concurrent_memtable_write = true;
+  DestroyAndReopen(options);
+  const auto delete_range_func = [&](const Slice& begin_key, const Slice& end_key) {
+    WriteOptions write_opts;
+    ASSERT_OK(dbfull()->DeleteRange(write_opts, dbfull()->DefaultColumnFamily(), begin_key, end_key));
+  };
+  port::Thread thr1(delete_range_func, "a", "c");
+  port::Thread thr2(delete_range_func, "b", "d");
+  thr1.join();
+  thr2.join();
+}
 }  // namespace rocksdb

This does not crash on current master or 6.1.fb.

I reproduce crash at 6.1.fb

Try

const auto delete_range_func = [&](const Slice& begin_key, const Slice& end_key) {
    while(true){
    WriteOptions write_opts;
    ASSERT_OK(dbfull()->DeleteRange(write_opts, dbfull()->DefaultColumnFamily(), begin_key, end_key));
}
};
riversand963 commented 4 years ago

It's weird. I ported your test to our unit test framework, and it seems to be running OK. I also tried changing the thread func to execute the aforementioned infinite loop, but could not see the crash.

yanqin@myhost:~/rocksdb  (6.1.fb)$ ./db_test2 --gtest_filter=DBTest2.DeleteRangeRace
Note: Google Test filter = DBTest2.DeleteRangeRace
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DBTest2
[ RUN      ] DBTest2.DeleteRangeRace
delete range 525263 times
[       OK ] DBTest2.DeleteRangeRace (3068 ms)
[----------] 1 test from DBTest2 (3068 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (3068 ms total)
[  PASSED  ] 1 test.
yanqin@myhost:~/rocksdb  (6.1.fb)$ git diff
diff --git a/db/db_test2.cc b/db/db_test2.cc
index 6a00300eb..6c1abb079 100644
--- a/db/db_test2.cc
+++ b/db/db_test2.cc
@@ -3685,6 +3685,33 @@ class DummyOldStats : public Statistics {
   int num_rt = 0;
   int num_mt = 0;
 };
+
+TEST_F(DBTest2, DeleteRangeRace) {
+  //Options options = CurrentOptions();
+  Options options;
+  options.create_if_missing = true;
+  options.allow_concurrent_memtable_write = true;
+  DestroyAndReopen(options);
+  std::atomic<int> stop{0};
+  std::atomic<int64_t> count{0};
+  const auto delete_range_func = [&](const Slice& begin_key,
+                                     const Slice& end_key) {
+    while (stop.load() == 0) {
+      WriteOptions write_opts;
+      ASSERT_OK(dbfull()->DeleteRange(
+          write_opts, dbfull()->DefaultColumnFamily(), begin_key, end_key));
+      ++count;
+    }
+  };
+  port::Thread thr1(delete_range_func, "1", "2");
+  port::Thread thr2(delete_range_func, "1", "2");
+  options.env->SleepForMicroseconds(3 * 1000000);
+  stop.store(1);
+  thr1.join();
+  thr2.join();
+  stop.store(0);
+  fprintf(stdout, "delete range %" PRIi64 " times\n", count.load());
+}
 }  // namespace

 TEST_F(DBTest2, OldStatsInterface) {
riversand963 commented 3 years ago

Closing after long silence on this thread.