facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0

coredump with DeleteRange + IngestFile #2398

Closed siddontang closed 6 years ago

siddontang commented 7 years ago

Hi @ajkr

We are trying to use DeleteRange + IngestFile, but we hit a core dump. Any idea?

#0  rocksdb::LRUCache::GetHash (this=0x7f8bc1c1d5b8, handle=0x0) at cache/lru_cache.cc:471
#1  0x00007f8bc60232ef in rocksdb::ShardedCache::Release (this=0x7f8bc1c1d5b8, handle=0x0, force_erase=<optimized out>) at cache/sharded_cache.cc:69
#2  0x00007f8bc619d19d in DoCleanup (this=<optimized out>) at ./include/rocksdb/cleanable.h:62
#3  rocksdb::Cleanable::~Cleanable (this=<optimized out>, __in_chrg=<optimized out>) at table/iterator.cc:24
#4  0x00007f8bc6178694 in ~InternalIterator (this=0x7f8bae264000, __in_chrg=<optimized out>) at ./table/internal_iterator.h:21
#5  rocksdb::BlockIter::~BlockIter (this=0x7f8bae264000, __in_chrg=<optimized out>) at ./table/block.h:201
#6  0x00007f8bc60bfffe in reset (iter=0x0, this=<synthetic pointer>) at ./table/scoped_arena_iterator.h:18
#7  ~ScopedArenaIterator (this=<synthetic pointer>, __in_chrg=<optimized out>) at ./table/scoped_arena_iterator.h:55
#8  rocksdb::ExternalSstFileIngestionJob::IngestedFileOverlapWithLevel (this=this@entry=0x7f8b6c7f8fe0, sv=sv@entry=0x7f8bb44d3480, file_to_ingest=file_to_ingest@entry=0x7f8b6c7f9018, lvl=lvl@entry=0,
    overlap_with_level=overlap_with_level@entry=0x7f8b6c7f843f) at db/external_sst_file_ingestion_job.cc:643
#9  0x00007f8bc60c02a4 in rocksdb::ExternalSstFileIngestionJob::AssignLevelAndSeqnoForIngestedFile (this=this@entry=0x7f8b6c7f8fe0, sv=sv@entry=0x7f8bb44d3480, force_global_seqno=force_global_seqno@entry=false,
    compaction_style=rocksdb::kCompactionStyleLevel, file_to_ingest=file_to_ingest@entry=0x7f8b6c7f9018, assigned_seqno=assigned_seqno@entry=0x7f8b6c7f8de8) at db/external_sst_file_ingestion_job.cc:438
#10 0x00007f8bc60c0a83 in rocksdb::ExternalSstFileIngestionJob::Run (this=this@entry=0x7f8b6c7f8fe0) at db/external_sst_file_ingestion_job.cc:177
#11 0x00007f8bc60733e4 in rocksdb::DBImpl::IngestExternalFile (this=0x7f8bb8f6b800, column_family=<optimized out>, external_files=..., ingestion_options=...) at db/db_impl.cc:2642
#12 0x00007f8bc7b82db5 in crocksdb_ingest_external_file_cf (db=0x7f8bc1a0e9f0, handle=0x7f8bc1a0e9e8, file_list=<optimized out>, list_len=<optimized out>, opt=0x7f8bae2266e8, errptr=0x7f8b6c7fa048) at crocksdb/c.cc:2617
#13 0x00007f8bc7b63e54 in rocksdb::rocksdb::{{impl}}::ingest_external_file_cf (self=0x7f8bc1c5eb10, cf=0x7f8bc1caa588, opt=0x7f8b6c7fa748, files=...)
    at /home/pingcap/.cargo/git/checkouts/rust-rocksdb-82ef6e5337b3fbe6/6b0e5e1/src/rocksdb.rs:991

ajkr commented 7 years ago

Do you see it every time? #2399 fixes an issue with improperly cleaning up range deletion iterators. For me, though, that issue only caused an ASAN failure, not a core dump, so I'm not sure yet whether it's the same issue. Want to try out that patch?

siddontang commented 7 years ago

Hi @ajkr

Unfortunately, I updated to the latest master but still hit the core dump.

It happens every time I run our TiKV tests.

ajkr commented 7 years ago

The stack trace must be different, though, right? Because my fix makes ScopedArenaIterator no longer used for range deletions.

siddontang commented 7 years ago

Yes, I think #2399 doesn't fix it 😢

ajkr commented 7 years ago

Can you post the updated stack trace?

siddontang commented 7 years ago

rocksdb::LRUCache::GetHash (this=0x7ffff46bbfd8, handle=0x0) at cache/lru_cache.cc:471
471   return reinterpret_cast<const LRUHandle*>(handle)->hash;
(gdb) bt
#0  rocksdb::LRUCache::GetHash (this=0x7ffff46bbfd8, handle=0x0) at cache/lru_cache.cc:471
#1  0x00007ffff75a1a1f in rocksdb::ShardedCache::Release (this=0x7ffff46bbfd8, handle=0x0, force_erase=<optimized out>) at cache/sharded_cache.cc:69
#2  0x00007ffff771bced in DoCleanup (this=<optimized out>) at ./include/rocksdb/cleanable.h:62
#3  rocksdb::Cleanable::~Cleanable (this=<optimized out>, __in_chrg=<optimized out>) at table/iterator.cc:24
#4  0x00007ffff76f7414 in ~InternalIterator (this=0x7ffff556db00, __in_chrg=<optimized out>) at ./table/internal_iterator.h:21
#5  ~BlockIter (this=0x7ffff556db00, __in_chrg=<optimized out>) at ./table/block.h:201
#6  rocksdb::BlockIter::~BlockIter (this=0x7ffff556db00, __in_chrg=<optimized out>) at ./table/block.h:201
#7  0x00007ffff763e628 in operator() (this=<optimized out>, __ptr=0x7ffff556db00) at /usr/include/c++/4.8/bits/unique_ptr.h:67
#8  ~unique_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/unique_ptr.h:184
#9  rocksdb::ExternalSstFileIngestionJob::IngestedFileOverlapWithLevel (this=this@entry=0x7fffb41f8220, sv=sv@entry=0x7ffff1029480,
    file_to_ingest=file_to_ingest@entry=0x7fffb41f8258, lvl=lvl@entry=0, overlap_with_level=overlap_with_level@entry=0x7fffb41f767f)
    at db/external_sst_file_ingestion_job.cc:648
#10 0x00007ffff763e8fc in rocksdb::ExternalSstFileIngestionJob::AssignLevelAndSeqnoForIngestedFile (this=this@entry=0x7fffb41f8220, sv=sv@entry=0x7ffff1029480,
    force_global_seqno=force_global_seqno@entry=false, compaction_style=rocksdb::kCompactionStyleLevel, file_to_ingest=file_to_ingest@entry=0x7fffb41f8258,
    assigned_seqno=assigned_seqno@entry=0x7fffb41f8028) at db/external_sst_file_ingestion_job.cc:441
#11 0x00007ffff763f0e3 in rocksdb::ExternalSstFileIngestionJob::Run (this=this@entry=0x7fffb41f8220) at db/external_sst_file_ingestion_job.cc:177
#12 0x00007ffff75f1c14 in rocksdb::DBImpl::IngestExternalFile (this=0x7fffe82b2000, column_family=<optimized out>, external_files=..., ingestion_options=...)
    at db/db_impl.cc:2642
#13 0x0000555556657db5 in crocksdb_ingest_external_file_cf (db=0x7ffff460e5d0, handle=0x7ffff460e5c8, file_list=<optimized out>, list_len=<optimized out>,
    opt=0x7ffff5643e58, errptr=0x7fffb41f9288) at crocksdb/c.cc:2617
#14 0x0000555556638e54 in rocksdb::rocksdb::{{impl}}::ingest_external_file_cf (self=0x7ffff47833b0, cf=0x7fffe8214b08, opt=0x7fffb41f9988, files=...)
    at /home/pingcap/.cargo/git/checkouts/rust-rocksdb-82ef6e5337b3fbe6/6b0e5e1/src/rocksdb.rs:991
#15 0x00005555561d2b2e in tikv::raftstore::store::snap::v2::{{impl}}::apply (self=0x7ffff55a6a80, options=...) at src/raftstore/store/snap.rs:1277

It still seems to be the same as the original core dump stack.
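
For readers following the trace: frames #0 through #3 show a cleanup callback running from the Cleanable destructor and calling Release with handle=0x0, which GetHash then dereferences. The block below is a minimal sketch of that failure mode, assuming a much simplified model. ToyCache, ToyCleanable, Handle, and RunIteratorLifetime are hypothetical names invented for illustration, not RocksDB classes, and the null check is only there to make the bad call observable. It is not the fix that RocksDB applied.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cache handle (not RocksDB's LRUHandle).
struct Handle {
  unsigned hash;
  int refs;
};

// Hypothetical stand-in for a cache. Release() counts null handles instead of
// dereferencing them; dereferencing is what crashes in frame #0 of the trace.
class ToyCache {
 public:
  bool Release(Handle* handle) {
    if (handle == nullptr) {
      ++null_releases_;  // The real code reads handle->hash here and segfaults.
      return false;
    }
    --handle->refs;
    return true;
  }
  int null_releases() const { return null_releases_; }

 private:
  int null_releases_ = 0;
};

// Hypothetical stand-in for a Cleanable-style base class: it runs every
// registered callback when the object is destroyed.
class ToyCleanable {
 public:
  ~ToyCleanable() {
    for (auto& fn : cleanups_) fn();
  }
  void RegisterCleanup(std::function<void()> fn) {
    cleanups_.push_back(std::move(fn));
  }

 private:
  std::vector<std::function<void()>> cleanups_;
};

// Simulates an iterator's lifetime: registers a cleanup that releases `handle`,
// then destroys the iterator. Returns the cache's running count of null releases.
int RunIteratorLifetime(ToyCache& cache, Handle* handle) {
  {
    ToyCleanable iter;
    iter.RegisterCleanup([&cache, handle] { cache.Release(handle); });
  }  // iter is destroyed here, so the cleanup runs.
  return cache.null_releases();
}
```

Calling RunIteratorLifetime with a valid handle drops its reference count, while calling it with nullptr reaches Release with a null handle at destruction time, which matches the shape of the backtrace above. Which code path registers the cleanup with a null handle in the real library is not established in this thread.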

ajkr commented 7 years ago

Sorry, I couldn't repro, and the stack trace doesn't make sense to me (the line shown in frame 9 shouldn't call the unique_ptr destructor). Can you try make clean && DEBUG_LEVEL=2 make ... and send it again?

#8  ~unique_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/unique_ptr.h:184
#9  rocksdb::ExternalSstFileIngestionJob::IngestedFileOverlapWithLevel (this=this@entry=0x7fffb41f8220, sv=sv@entry=0x7ffff1029480,
    file_to_ingest=file_to_ingest@entry=0x7fffb41f8258, lvl=lvl@entry=0, overlap_with_level=overlap_with_level@entry=0x7fffb41f767f)
    at db/external_sst_file_ingestion_job.cc:648

siddontang commented 7 years ago

Hi @ajkr

I followed your instructions but still get the same stack:

#7  0x00007ffff763e628 in operator() (this=<optimized out>, __ptr=0x7fffd29e9200) at /usr/include/c++/4.8/bits/unique_ptr.h:67
#8  ~unique_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /usr/include/c++/4.8/bits/unique_ptr.h:184
#9  rocksdb::ExternalSstFileIngestionJob::IngestedFileOverlapWithLevel (this=this@entry=0x7fff977f7220, sv=sv@entry=0x7fffe6434480,
    file_to_ingest=file_to_ingest@entry=0x7fff977f7258, lvl=lvl@entry=0, overlap_with_level=overlap_with_level@entry=0x7fff977f667f)
    at db/external_sst_file_ingestion_job.cc:648
#10 0x00007ffff763e8fc in rocksdb::ExternalSstFileIngestionJob::AssignLevelAndSeqnoForIngestedFile (this=this@entry=0x7fff977f7220, sv=sv@entry=0x7fffe6434480,
    force_global_seqno=force_global_seqno@entry=false, compaction_style=rocksdb::kCompactionStyleLevel, file_to_ingest=file_to_ingest@entry=0x7fff977f7258,
    assigned_seqno=assigned_seqno@entry=0x7fff977f7028) at db/external_sst_file_ingestion_job.cc:441

siddontang commented 7 years ago

Hi @ajkr

I can reproduce it with a simple test, though it is written in Rust (in our rust-rocksdb project) rather than C++.

Before the ingest at https://github.com/pingcap/rust-rocksdb/blob/busyjay/delete-range-fix/tests/test_ingest_external_file.rs#L79, I add db.delete_range_cf(handle, b"k1", b"k3").unwrap(); and the subsequent ingest then panics.

ngaut commented 7 years ago

Any update? Please let me know if you need help to reproduce the issue. @ajkr Thanks.

gfosco commented 6 years ago

Closing this via automation due to lack of activity. If discussion is still needed here, please re-open or create a new/updated issue.