Open archang19 opened 3 days ago
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@archang19 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Summary
After https://github.com/facebook/rocksdb/pull/13118 was merged, I did some investigation to see whether the file system buffer reuse code was actually being used.
The good news is that I was able to see from the CPU profiling results that my code is getting invoked through the warm storage stress tests.
The bad news is that most of the time, the optimization is not being used, so we end up going through the regular old
RandomAccessFileReader::Read
path.Here is the entire function call chain up to
FilePrefetchBuffer::Read
At this point, we split into
rocksdb::RandomAccessFileReader::Read
androcksdb::FilePrefetchBuffer::FSBufferDirectRead
.FSBufferDirectRead
gets called <3% of the time.I think the root cause is that the
FileSystem* fs
parameter is not getting passed into theFilePrefetchBuffer
constructor. Whenfs
isnullptr
,UseFSBuffer()
will always returnfalse
and we do not end up callingFSBufferDirectRead
.Luckily, it does not seem like there are too many places I need to change.
BlockBasedTable
resets itsprefetch_buffer
in 3 separate places. When it disables the prefetch buffer (2/3 of the instances), we don't care about whether thefs
parameter is there. This PR is addressing the third instance, where it is not trying to disable the buffer.Note that there is another method,
PrefetchBufferCollection::GetOrCreatePrefetchBuffer
that creates newFilePrefetchBuffer
s without thefs
parameter. This method gets called bycompaction_iterator
andmerge_helper
. I think we can address this in a subsequent PR:rocksdb::FilePrefetchBuffer::FSBufferDirectRead
was invoked.PrefetchBufferCollection
Test Plan
The existing unit test coverage guards against obvious bugs. I ran another set of performance tests to confirm there were no regressions in CPU utilization.