Closed: skyzh closed this 2 years ago
We fill the block cache when uploading SSTs, and with the default "unlimited" block cache config, all SSTs will be fully cached. However, is it expected for the executor to issue a lot of reads to storage when joining two tables? I thought the executor has its own cache, and I would expect its hit ratio to be relatively high if we have enough memory.
is it expected for the executor to issue a lot of reads to storage when joining two tables
For initial data, yes. No one knows whether a key exists in shared storage before actually fetching it. After that, this information will be cached.
The TPC-H workload is all inserts, so it will initiate many scans to storage that return zero rows.
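A minimal sketch of the "after that, this info will be cached" behavior, assuming the executor cache remembers misses as well as hits. `Storage`, `JoinCache`, and `lookup` are hypothetical names for illustration, not RisingWave types:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for shared storage; counts how many reads reach it.
struct Storage {
    data: HashMap<u64, String>,
    reads: usize,
}

impl Storage {
    fn get(&mut self, key: u64) -> Option<String> {
        self.reads += 1;
        self.data.get(&key).cloned()
    }
}

/// Hypothetical executor-side cache that stores `None` for keys known to be absent,
/// so a repeated lookup of a missing key does not go back to storage.
struct JoinCache {
    entries: HashMap<u64, Option<String>>,
}

impl JoinCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    fn lookup(&mut self, storage: &mut Storage, key: u64) -> Option<String> {
        // Cache hit: covers both "present" (Some) and "known absent" (None).
        if let Some(cached) = self.entries.get(&key) {
            return cached.clone();
        }
        // First access: the only way to learn whether the key exists is to read storage.
        let fetched = storage.get(key);
        self.entries.insert(key, fetched.clone());
        fetched
    }
}
```

With an insert-only workload, every new join key pays for one such first read, which matches the many zero-row scans described above.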
I believe this could be the result of fetching individual keys: every cache miss issues a scan for that key. If instead the reads were batched together (as indeed could be done via https://github.com/singularity-data/risingwave/issues/2428), it may improve performance...
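To illustrate the batching idea, here is a rough sketch that contrasts one storage round trip per missed key with a single round trip for a whole batch. The `multi_get` call is an assumption made up for this example; it is not an existing storage interface:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for shared storage; counts round trips.
struct Storage {
    data: BTreeMap<u64, String>,
    round_trips: usize,
}

impl Storage {
    /// One round trip per key: what every individual cache miss costs.
    fn get(&mut self, key: u64) -> Option<String> {
        self.round_trips += 1;
        self.data.get(&key).cloned()
    }

    /// Assumed batched interface: one round trip for all keys in the batch.
    fn multi_get(&mut self, keys: &[u64]) -> Vec<Option<String>> {
        self.round_trips += 1;
        keys.iter().map(|k| self.data.get(k).cloned()).collect()
    }
}

/// Resolves all cache misses of one input chunk with a single batched read.
fn resolve_misses(storage: &mut Storage, misses: &[u64]) -> Vec<(u64, Option<String>)> {
    let mut keys = misses.to_vec();
    keys.sort_unstable(); // sorted keys let the storage side walk SSTs in order
    keys.dedup(); // the same join key may miss several times within a chunk
    let values = storage.multi_get(&keys);
    keys.into_iter().zip(values).collect()
}
```

Whether this helps in practice depends on how the storage layer serves the batch; the sketch only shows the change in the number of calls.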
We may introduce an executor-level cuckoo filter (cuckoo rather than bloom, so that deletion is supported) that is checked once before we check whether a key exists in shared storage by fetching the SST's bloom filter. We may use this filter only until the executor's cache gets full, or forever.
Since we do not need to put those keys already cached by the executor into the filter, the efficiency of the cuckoo filter could be quite high.
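For reference, a minimal cuckoo filter sketch showing why it fits here: unlike a bloom filter, it supports `remove`. All names and parameters (`CuckooFilter`, `SLOTS`, `MAX_KICKS`, 16-bit fingerprints) are assumptions for illustration, and the eviction slot is picked deterministically instead of randomly to keep the example free of dependencies:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const SLOTS: usize = 4; // fingerprints per bucket
const MAX_KICKS: usize = 500; // relocation attempts before giving up

fn hash64<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

struct CuckooFilter {
    buckets: Vec<[u16; SLOTS]>, // fingerprint 0 marks an empty slot
    mask: usize,
}

impl CuckooFilter {
    fn new(num_buckets: usize) -> Self {
        // A power-of-two bucket count makes alt_index its own inverse.
        let n = num_buckets.next_power_of_two();
        Self { buckets: vec![[0; SLOTS]; n], mask: n - 1 }
    }

    fn fingerprint_and_index(&self, key: u64) -> (u16, usize) {
        let h = hash64(&key);
        let fp = ((h >> 48) as u16).max(1); // never 0, which means "empty"
        (fp, (h as usize) & self.mask)
    }

    fn alt_index(&self, index: usize, fp: u16) -> usize {
        (index ^ hash64(&fp) as usize) & self.mask
    }

    fn try_put(&mut self, index: usize, fp: u16) -> bool {
        for slot in self.buckets[index].iter_mut() {
            if *slot == 0 {
                *slot = fp;
                return true;
            }
        }
        false
    }

    /// Returns false when the filter is too full. A fingerprint may have been
    /// dropped in that case, so the caller should stop trusting negative answers.
    fn insert(&mut self, key: u64) -> bool {
        let (mut fp, i1) = self.fingerprint_and_index(key);
        let i2 = self.alt_index(i1, fp);
        if self.try_put(i1, fp) || self.try_put(i2, fp) {
            return true;
        }
        let mut index = i2;
        for kick in 0..MAX_KICKS {
            // Evict one resident fingerprint and move it to its other bucket.
            std::mem::swap(&mut fp, &mut self.buckets[index][kick % SLOTS]);
            index = self.alt_index(index, fp);
            if self.try_put(index, fp) {
                return true;
            }
        }
        false
    }

    /// false means "definitely absent"; true means "possibly present".
    fn contains(&self, key: u64) -> bool {
        let (fp, i1) = self.fingerprint_and_index(key);
        let i2 = self.alt_index(i1, fp);
        self.buckets[i1].contains(&fp) || self.buckets[i2].contains(&fp)
    }

    /// Only call this for keys that were inserted earlier.
    fn remove(&mut self, key: u64) -> bool {
        let (fp, i1) = self.fingerprint_and_index(key);
        let i2 = self.alt_index(i1, fp);
        for index in [i1, i2] {
            if let Some(slot) = self.buckets[index].iter_mut().find(|s| **s == fp) {
                *slot = 0;
                return true;
            }
        }
        false
    }
}
```

In the scheme proposed above, the executor would skip the storage read whenever `contains` returns false, and would only insert keys that are not already held in its own cache.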
Quite curious, if we rely on SST bloom filter, are we making a domain socket call for IPC? Maybe there should be a bloom filter in the hash join state itself?
If instead the reads were batched together
Also, there doesn't seem to be an interface for this? I guess we will have to scan multiple key ranges anyway...
Quite curious, if we rely on SST bloom filter, are we making a domain socket call for IPC? Maybe there should be a bloom filter in the hash join state itself?
We may consider the SST's bloom filter as another type of block that can be cached in the block cache: if it is cached, we just read it from memory. If it is not, we fetch either only that block or the entire SST from S3 (I am not sure which one is in use).
We already have a meta cache implemented in CN to cache SST meta, including the bloom filter, so SST/data blocks should not be fetched if the key is not in the bloom filter.
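A rough sketch of that check, assuming each cached SST meta carries an in-memory bloom filter that is consulted before any data block is read. `BloomFilter`, `SstMeta`, and `ssts_to_read` are illustrative names, not the actual meta cache types:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simple bloom filter: `num_hashes` seeded hashes over a bit vector.
struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        Self { bits: vec![false; num_bits], num_hashes }
    }

    fn position(&self, key: &[u8], seed: u64) -> usize {
        let mut h = DefaultHasher::new();
        seed.hash(&mut h);
        key.hash(&mut h);
        (h.finish() % self.bits.len() as u64) as usize
    }

    fn insert(&mut self, key: &[u8]) {
        for seed in 0..self.num_hashes {
            let pos = self.position(key, seed);
            self.bits[pos] = true;
        }
    }

    /// false means the key is definitely not in the SST; true means it may be.
    fn may_contain(&self, key: &[u8]) -> bool {
        (0..self.num_hashes).all(|seed| self.bits[self.position(key, seed)])
    }
}

/// Hypothetical cached SST metadata: an id plus the SST's bloom filter.
struct SstMeta {
    id: u64,
    bloom: BloomFilter,
}

/// Returns the ids of SSTs whose data blocks would have to be read for `key`.
/// SSTs whose bloom filter rejects the key are skipped without any fetch.
fn ssts_to_read(metas: &[SstMeta], key: &[u8]) -> Vec<u64> {
    metas
        .iter()
        .filter(|meta| meta.bloom.may_contain(key))
        .map(|meta| meta.id)
        .collect()
}
```

For a key that no SST contains, the expected result is an empty list apart from bloom filter false positives, so a lookup of a brand-new join key should usually not touch the object store.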
Resolved with recent PRs.
When importing ~1GB data into RisingWave with two tables joining, the object store throughput is ridiculously high. Maybe we will need block cache metrics to analyze what's happening inside.