facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0
27.83k stars 6.2k forks source link

Fix possible crash in failure to sync some WALs #12789

Closed pdillinger closed 1 week ago

pdillinger commented 1 week ago

Summary: I believe this was possible with recyclable logs before recent work like #12734, but this cleans up a couple of possible crashes revealed by the crash test. A WAL with a nullptr file writer (already closed) can persist in logs_ if a later WAL fails to sync. In case of any WAL sync failures, we don't record WAL syncs to the manifest. Thus, even if a WAL is fully synced and closed, we might need to keep it on the logs_ list so that we know to record its sync to the manifest if there should be a successful sync next time. (However, I believe that's future-looking because currently any failure in WAL sync is considered non-recoverable.)

I don't believe this was likely enough before recent changes to warrant a release note (if it was possible).

Test Plan: A unit test that would reveal the crashes, now fixed

facebook-github-bot commented 1 week ago

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 week ago

@pdillinger has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot commented 1 week ago

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 week ago

@pdillinger has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot commented 1 week ago

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 week ago

@pdillinger merged this pull request in facebook/rocksdb@3ee4d5a11a882056b341a9a1694a71371a39f664.

hx235 commented 1 week ago

I believe that's future-looking because currently any failure in WAL sync is considered non-recoverable

@pdillinger : for WAL sync failure under WriteOptions::sync=true, it seems that we will kick off auto-resume in RocksDB by the following stack. So it's recoverable?

rocksdb::ErrorHandler::StartRecoverFromRetryableBGIOError(rocksdb::IOStatus const&) (/data/users/huixiao/rocksdb/db/error_handler.cc:665)
rocksdb::ErrorHandler::SetBGError(rocksdb::Status const&, rocksdb::BackgroundErrorReason) (/data/users/huixiao/rocksdb/db/error_handler.cc:472)
rocksdb::DBImpl::IOStatusCheck(rocksdb::IOStatus const&) (/data/users/huixiao/rocksdb/db/db_impl/db_impl_write.cc:1190)
rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, rocksdb::UserWriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*, rocksdb::PostMemTableCallback*) (/data/users/huixiao/rocksdb/db/db_impl/db_impl_write.cc:631)
rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) (/data/users/huixiao/rocksdb/db/db_impl/db_impl_write.cc:158)
rocksdb::DB::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&) (/data/users/huixiao/rocksdb/db/db_impl/db_impl_write.cc:2453)rocksdb::DBImpl::Put(rocksdb::WriteOptions const&, rocksdb::ColumnFamilyHandle*, rocksdb::Slice const&, rocksdb::Slice const&) (/data/users/huixiao/rocksdb/db/db_impl/db_impl_write.cc:29)