facebook / rocksdb

A library that provides an embeddable, persistent key-value store for fast storage.
http://rocksdb.org
GNU General Public License v2.0
28.4k stars 6.29k forks source link

Documentation is unclear about manual_wal_flush with WriteOptions::sync and without two_write_queues #12287

Open cvijdea-bd opened 8 months ago

cvijdea-bd commented 8 months ago

When using manual_wal_flush=true without two_write_queues (i.e. two_write_queues=false), and without using transactions (just atomic_flush=true), I expect that writes with WriteOptions::sync=true should both flush the buffer and sync the WAL file, i.e. equivalent of FlushWAL(true).

However nothing in the documentation explicitly promises this behaviour, and looking at the code I'm unsure if it is even actually true.

It seems to me that AddRecord which is called by Write will never call Flush if manual_flush is true: https://github.com/facebook/rocksdb/blob/59f4cbef8c3ada628c47ff7868a93c7d01e02fd7/db/log_writer.cc#L170

Also, the Write code itself will only call Sync directly on the file, but not directly flush the log writer into the file beforehand: https://github.com/facebook/rocksdb/blob/59f4cbef8c3ada628c47ff7868a93c7d01e02fd7/db/db_impl/db_impl_write.cc#L1433

Only in case of two_write_queues=true does Write trigger a call to FlushWAL(true).

Expected behavior

WriteOptions::sync=true should ensure data persistence when combined with manual_wal_flush and this should be explicitly documented.

cvijdea-bd commented 8 months ago

Update: after further digging, it seems that WritableFileWriter::Sync will itself call Flush: https://github.com/facebook/rocksdb/blob/928aca835f2a048913a2d07701fe3b8434849939/file/writable_file_writer.cc#L460

So it seems my initial assumption holds, and writing WriteOptions::sync=true is semantically equivalent to manually following the write with a FlushWAL(true).
However I still think this should be explicitly spelled out in the documentation for either manual_wal_flush or WriteOptions::sync.