Closed pdillinger closed 3 months ago
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@pdillinger has updated the pull request. You must reimport the pull request before landing.
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
I'm good with the partitioned_index_reader fix but want to discuss more on the EnforceReadOpts and thread-local context. So feel free to separate this into two PRs to unblock stress test.
For EnforceReadOpts::UsedIO()
, my understanding is we call this to verify we are allowed to do IO before we do IO. Can we move this check to lower-level like file reader or file system level?
For EnforceReadOpts enforce(ropts)
, do we envision adding more of this in BlockBasedTableReader layer?
@pdillinger has updated the pull request. You must reimport the pull request before landing.
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
OK, PR is split (next one with EnforceReadOpts to be created)
@pdillinger merged this pull request in facebook/rocksdb@d64eac28d32a025770cba641ea04e697f475cdd6.
Summary: The crash test revealed a case in which the uncache functionality in ~BlockBasedTableReader could initiate an block read (IO), despite setting ReadOptions::read_tier = kBlockCacheTier.
The root cause is a place in the code where many people have over time decided to opt-in propagating ReadOptions and no one took the initiative to propagate ReadOptions by default (opt out / override only as needed). The fix is in partitioned_index_reader.cc. Here, ReadOptions::readahead_size is opted-out to avoid churn in prefetch_test that is not clearly an improvement or regression. It's hard to tell given the poor state of relevant documentation #12756. The affected unit test was added in #10602.
Test Plan: (Now postponed to a follow-up diff) I have added some new infrastructure to DEBUG builds to catch this specific kind of violation in unit tests and in the stress/crash test.
EnforceReadOpts
establishes a thread-local context under which we assert no IOs are performed if ReadOptions said it should be forbidden. With this new checking, the Uncache unit test would catch the critical step toward a violation (inner ReadOptions allowing IO, even if no IO is actually performed), which is fixed with the production code change.