facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0
27.9k stars 6.21k forks source link

Refactor SyncWAL and SyncClosedLogs for code sharing #12707

Closed pdillinger closed 1 month ago

pdillinger commented 1 month ago

Summary: These functions were very similar and did not make sense for maintaining separately. This is not a pure refactor but I think bringing the behaviors closer together should reduce long term risk of unintentionally divergent behavior. This change is motivated by some forthcoming WAL handling fixes for Checkpoint and Backups.

Cosmetic changes:

Recommended follow-up:

Test Plan: existing tests

facebook-github-bot commented 1 month ago

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

anand1976 commented 1 month ago

Sync() is always used on closed WALs, like the old SyncClosedWals. SyncWithoutFlush() is only used on the active (maybe) WAL.

Flush() is not safe to call concurrently. That's probably why SyncWAL() calls SyncWithoutFlush() and SyncClosedLogs() calls Sync().

pdillinger commented 1 month ago

Looks good. I think we need to call SyncWithoutFlush() in the SyncWAL() path. The rest looks good.

DBImpl::SwitchMemtable() guarantees that WALs before the current are flushed (cur_log_writer->WriteBuffer(write_options)). I don't see the problem with using Sync on such WALs, as SyncClosedLogs() already did before this change.

anand1976 commented 1 month ago

DBImpl::SwitchMemtable() guarantees that WALs before the current are flushed (cur_log_writer->WriteBuffer(write_options)).

Indeed it does. Thanks for digging into it. So calling Sync concurrently shouldn't be a problem in practice, although it still makes me a little uncomfortable.

I don't see the problem with using Sync on such WALs, as SyncClosedLogs() already did before this change.

Can SyncClosedLogs() be called concurrently? I believe not, since its only called during Flush. But I'm not 100% sure.

Also, why not always call SyncWithoutFlush() to be on the safe side if there's nothing to be flushed?

anand1976 commented 1 month ago

Can SyncClosedLogs() be called concurrently? I believe not, since its only called during Flush. But I'm not 100% sure.

Never mind. Its called from the background thread, so I guess it can be.

facebook-github-bot commented 1 month ago

@pdillinger merged this pull request in facebook/rocksdb@7127119ae9021fc1ebe8ec7b45c2b9c7c825f102.