MrCroxx closed 4 months ago
Yugabyte introduced a method to keep large scans from polluting the cache. Worth reading: https://github.com/YugaByte/yugabyte-db/commit/0c6a3f018ac90724ac1106ff248c051afbdd6979
Considering that operators such as aggregation and hash join only require point lookups, not range scans, shall we consider a KV cache independent of the block cache?
Both RocksDB and Apache Cassandra have KV caches.
Papers such as "AC-Key: Adaptive Caching for LSM-based Key-Value Stores" and "A Low-cost Disk Solution Enabling LSM-tree to Achieve High Performance for Mixed Read/Write Workloads" confirm, directly or indirectly, the effectiveness of a KV cache.
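To make the proposal concrete, here is a minimal sketch of a KV cache that sits beside the block cache. It is not Hummock's, RocksDB's, or Yugabyte's implementation; `LruCache`, `Store`, and all parameters are hypothetical names made up for illustration. Point lookups fill the KV cache, while scans only pass through the block cache, so a large scan can evict blocks but not the hot keys.

```python
import bisect
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache; a stand-in for a real cache implementation."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


class Store:
    """Hypothetical sorted store split into fixed-size blocks."""

    def __init__(self, rows, block_size, block_cap, kv_cap):
        keys = sorted(rows)
        self.blocks = [
            {k: rows[k] for k in keys[i:i + block_size]}
            for i in range(0, len(keys), block_size)
        ]
        self.first_keys = [next(iter(b)) for b in self.blocks]
        self.block_cache = LruCache(block_cap)
        self.kv_cache = LruCache(kv_cap)
        self.block_loads = 0  # counts block cache misses

    def _load_block(self, idx):
        block = self.block_cache.get(idx)
        if block is None:
            self.block_loads += 1  # simulated read from storage
            block = self.blocks[idx]
            self.block_cache.put(idx, block)
        return block

    def get(self, key):
        """Point lookup: served from the KV cache when possible."""
        value = self.kv_cache.get(key)
        if value is not None:
            return value
        idx = bisect.bisect_right(self.first_keys, key) - 1
        if idx < 0:
            return None
        value = self._load_block(idx).get(key)
        if value is not None:
            self.kv_cache.put(key, value)
        return value

    def scan(self, start, end):
        """Range scan over [start, end): never touches the KV cache."""
        out = []
        idx = max(bisect.bisect_right(self.first_keys, start) - 1, 0)
        while idx < len(self.blocks) and self.first_keys[idx] < end:
            for k, v in self._load_block(idx).items():
                if start <= k < end:
                    out.append((k, v))
            idx += 1
        return out
```

With a small block cache, a full scan after a point lookup evicts the block that held the key, yet a repeated lookup of that key is still answered from the KV cache without another block load.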
@lmatz Looks amazing. I have thought about this before as well, and I think ScyllaDB is also a good example of maintaining a row-based cache.
With a secondary cache, I think we can support a row-based cache in memory and use a block-based cache as the secondary cache on disk.
This is a tracking issue for the Hummock File Cache System.
The Hummock File Cache System serves as a new cache tier that uses spare disk space as a block cache.
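As a rough illustration of a disk tier under the in-memory block cache, the sketch below spills blocks evicted from memory into files and checks those files before going back to the remote store. It is only a toy model under stated assumptions, not the actual Hummock design: `TieredBlockCache`, `fetch_remote`, and the file naming are hypothetical, and disk-capacity limits, eviction from disk, and concurrency are all omitted.

```python
import os
import tempfile
from collections import OrderedDict


class TieredBlockCache:
    """Memory LRU backed by a file cache directory (hypothetical sketch)."""

    def __init__(self, mem_capacity, cache_dir, fetch_remote):
        self.mem_capacity = mem_capacity
        self.cache_dir = cache_dir
        self.fetch_remote = fetch_remote  # callable: block_id -> bytes
        self.mem = OrderedDict()
        self.stats = {"mem": 0, "disk": 0, "remote": 0}

    def _path(self, block_id):
        return os.path.join(self.cache_dir, f"block-{block_id}.bin")

    def _admit(self, block_id, data):
        # Insert into memory; demote LRU victims to the disk tier.
        self.mem[block_id] = data
        self.mem.move_to_end(block_id)
        while len(self.mem) > self.mem_capacity:
            victim, victim_data = self.mem.popitem(last=False)
            with open(self._path(victim), "wb") as f:
                f.write(victim_data)

    def get(self, block_id):
        # Tier 1: memory.
        if block_id in self.mem:
            self.stats["mem"] += 1
            self.mem.move_to_end(block_id)
            return self.mem[block_id]
        # Tier 2: local disk, then fall back to the remote store.
        path = self._path(block_id)
        if os.path.exists(path):
            self.stats["disk"] += 1
            with open(path, "rb") as f:
                data = f.read()
        else:
            self.stats["remote"] += 1
            data = self.fetch_remote(block_id)
        self._admit(block_id, data)
        return data
```

A cache could be created with `TieredBlockCache(2, tempfile.mkdtemp(), fetch)` where `fetch` is any function returning the bytes of a block; the `stats` counters then show which tier served each request.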