Hummock: File Cache System

risingwavelabs / risingwave

Hummock: File Cache System #198

Closed MrCroxx closed 4 months ago

MrCroxx commented 2 years ago

This is a tracking issue for Hummock File Cache System.

The Hummock File Cache System serves as a new cache tier that uses spare disk space as a block cache.

[x] Design doc.
[x] S3 benchmark.
[x] EBS DIO benchmark.
[x] #3556
[ ] #3889
[ ] ...

twocode commented 2 years ago

Yugabyte introduced a methodology to avoid large scans polluting the cache. Worth reading: https://github.com/YugaByte/yugabyte-db/commit/0c6a3f018ac90724ac1106ff248c051afbdd6979
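For readers unfamiliar with scan resistance, here is a minimal sketch of one common approach, a two-segment LRU. It is not taken from the linked commit and may differ from what Yugabyte does; the class name `SegmentedLRU` and its capacities are hypothetical. New keys land in a small probation segment and are promoted only on a second access, so a one-pass scan can only evict other probation entries.

```python
from collections import OrderedDict


class SegmentedLRU:
    """Hypothetical two-segment LRU: entries touched once stay in probation,
    entries touched again move to a protected segment."""

    def __init__(self, probation_cap, protected_cap):
        self.probation_cap = probation_cap
        self.protected_cap = protected_cap
        self.probation = OrderedDict()  # keys seen once (scan traffic lands here)
        self.protected = OrderedDict()  # keys hit at least once after insertion

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)  # refresh recency
            return self.protected[key]
        if key in self.probation:
            # Second touch: promote out of probation.
            value = self.probation.pop(key)
            self._insert_protected(key, value)
            return value
        return None

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
            return
        self.probation[key] = value
        self.probation.move_to_end(key)
        if len(self.probation) > self.probation_cap:
            self.probation.popitem(last=False)  # evict coldest probation entry

    def _insert_protected(self, key, value):
        self.protected[key] = value
        if len(self.protected) > self.protected_cap:
            # Demote the coldest protected entry instead of dropping it outright.
            old_key, old_value = self.protected.popitem(last=False)
            self.probation[old_key] = old_value
            if len(self.probation) > self.probation_cap:
                self.probation.popitem(last=False)
```

With this layout, a block that has been read twice survives an arbitrarily long scan of blocks that are each read once, because the scan never reaches the protected segment.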

lmatz commented 2 years ago

Considering that operators such as aggregation and hash join only require single-point lookups without range scans, shall we consider a KV cache independent of the block cache?

Both RocksDB and Apache Cassandra have KV caches.

Papers such as "AC-Key: Adaptive Caching for LSM-based Key-Value Stores" and "A Low-cost Disk Solution Enabling LSM-tree to Achieve High Performance for Mixed Read/Write Workloads" confirm, directly or indirectly, the effectiveness of a KV cache.

MrCroxx commented 2 years ago

@lmatz Looks amazing. I had thought about this before as well. I think ScyllaDB is also a good example of maintaining a row-based cache.

And with a secondary cache, I think we can support a row-based cache in memory and use a block-based cache as the secondary disk cache.

The latency gap between disk and S3 is much larger than that between memory and disk, so reducing reads from S3 has higher priority.
A row-based secondary disk cache would mean more small-sized requests to S3.
Memory is much more precious and much smaller than disk.
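
To make the proposed layering concrete, here is a minimal sketch of the read path, assuming a row cache in memory, a block cache on disk, and S3 as the source of truth. Everything in it is hypothetical and for illustration only: `TieredStore`, `block_of`, and the dict-based stand-ins for disk and S3 are not Hummock APIs, and eviction is omitted for brevity.

```python
class TieredStore:
    """Hypothetical read path: row cache (memory) -> block cache (disk) -> S3."""

    def __init__(self, object_store, block_of):
        self.row_cache = {}               # memory tier: key -> value
        self.disk_blocks = {}             # stand-in for the disk tier: block_id -> {key: value}
        self.object_store = object_store  # stand-in for S3: block_id -> {key: value}
        self.block_of = block_of          # assumed helper mapping a key to its block id
        self.remote_reads = 0             # number of block fetches that went to "S3"

    def get(self, key):
        # 1. Point lookups are served from the in-memory row cache when possible.
        if key in self.row_cache:
            return self.row_cache[key]

        # 2. Otherwise look for the whole block in the disk tier.
        block_id = self.block_of(key)
        block = self.disk_blocks.get(block_id)
        if block is None:
            # 3. Miss on both tiers: fetch the whole block from the object store
            #    once and keep it on disk, so neighbouring keys avoid S3 later.
            block = self.object_store[block_id]
            self.remote_reads += 1
            self.disk_blocks[block_id] = block

        value = block.get(key)
        if value is not None:
            self.row_cache[key] = value   # only the requested row goes to memory
        return value
```

Fetching and caching whole blocks on disk is what keeps the number of S3 requests low, while caching only the requested rows in memory spends the scarcer tier on data that point lookups actually touch.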