Open sumeerbhola opened 2 weeks ago
Should we make the load block semaphore shared across DBs, like the block cache? The primary motivation was preventing excessive memory utilization from many concurrent reads, right? And memory is a shared resource across DBs.
That would have been easier to implement, but there was concern that one store that is broken and has very slow IOs will block out all other stores. https://github.com/cockroachdb/cockroach/blob/3644f0d3fe77c03c20d3b603c1c0eb6f335e7e15/pkg/storage/pebble.go#L997
Concurrent reads of the same block have been observed to cause very high memory usage, and are wasteful of disk bandwidth. We now coordinate across multiple concurrent attempts to read the same block via a readEntry, which makes the readers take turns until one succeeds.
The readEntries are embedded in a map that is part of a readShard, where there is a readShard for each cache.Shard. See the long comment in the readShard declaration for motivation. The Options.LoadBlockSema is integrated into the readEntry, to simplify the waiting logic in the caller.
Callers interact with this new behavior via Cache.GetWithReadHandle, which is only for callers that intend to do a read and then populate the cache. If this method returns a ReadHandle, the caller must first wait for permission to do a read. See the ReadHandle comment for details of the contract.
Fixes #4138