metrics: Hummock block cache

risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.

https://go.risingwave.com/slack

Apache License 2.0

7.03k stars 578 forks source link

metrics: Hummock block cache #2668

Closed skyzh closed 2 years ago

skyzh commented 2 years ago

When importing ~1GB data into RisingWave with two tables joining, the object store throughput is ridiculously high. Maybe we will need block cache metrics to analyze what's happening inside.

hzxa21 commented 2 years ago

We fill block cache on uploading SST and with the "unlimited" block cache default config, all SSTs will be fully cached. However, is it expected for executor to issue a lof of reads to storage when joining two tables? I thought executor has its own cache and I expect the executor cache hit ratio to be relatively high if we have enough memory.

skyzh commented 2 years ago

is it expected for executor to issue a lof of reads to storage when joining two tables

For initial data, yes. No one knows whether a key exists in shared storage before actually get it. After that, this info will be cached.

skyzh commented 2 years ago

The TPC-H workload is all inserts, so it will initiate many scans to the storage that returns zero rows.

jon-chuang commented 2 years ago

I believe could be the result of fetching individual keys. Every cache miss will issue a scan for the key. If instead the reads were batched together (as indeed could be done via: https://github.com/singularity-data/risingwave/issues/2428), it may be able to improve the performance...

lmatz commented 2 years ago

We may introduce an executor-level cuckoo(for deletion) filter to filter once before doing check whether a key exists in shared storage by fetching SST's bloom filter. We may use this filter only before the executor's cache gets full or forever.

Since we do not need to put those keys already cached by the executor into the filter, the efficiency of the cuckoo filter could be quite high.

jon-chuang commented 2 years ago

Quite curious, if we rely on SST bloom filter, are we making a domain socket call for IPC? Maybe there should be a bloom filter in the hash join state itself?

jon-chuang commented 2 years ago

If instead the reads were batched together

Also, there doesn't seem to be an interface for this? I guess we will have to scan multiple key ranges anyway...

lmatz commented 2 years ago

Quite curious, if we rely on SST bloom filter, are we making a domain socket call for IPC? Maybe there should be a bloom filter in the hash join state itself?

We may consider the SST's bloom filter is also a type of block that can be cached in Block Cache, so if it is cached, then we just read from the memory. If it is not, we fetch either only the block or(I am not sure which one is in use) the entire SST from S3.

hzxa21 commented 2 years ago

Quite curious, if we rely on SST bloom filter, are we making a domain socket call for IPC? Maybe there should be a bloom filter in the hash join state itself?

We may consider the SST's bloom filter is also a type of block that can be cached in Block Cache, so if it is cached, then we just read from the memory. If it is not, we fetch either only the block or(I am not sure which one is in use) the entire SST from S3.

We already have a meta cache implemented in CN to cache SST meta including the bloom filter so SST/data blocks should not be fetched if the key is not in bloom filter.

See https://github.com/singularity-data/risingwave/blob/4348f9237d83547b1a842a757ea757e2d5aa3edc/src/storage/src/hummock/state_store.rs#L240

skyzh commented 2 years ago

Resolved with recent PRs.