risingwavelabs / risingwave


Investigate what is the actual bottleneck in hash agg processing for dirty groups #18748

Open kwannoel opened 1 week ago

kwannoel commented 1 week ago

It doesn't seem to be a heap or CPU bottleneck. So what is the actual bottleneck? Is it IO cost due to state lookups? If so, we need a metric for it.
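To make the "metric for lookups" idea concrete, here is a minimal sketch of what such a metric could record: cache hits, cache misses, and time spent fetching on a miss. Everything in it (`LookupMetrics`, `MeteredCache`, the `fetch` callback) is a hypothetical stand-in for illustration, not the actual hash agg cache or RisingWave's metrics API.

```python
import time
from dataclasses import dataclass


@dataclass
class LookupMetrics:
    """Hypothetical counters for agg group state lookups."""
    cache_hits: int = 0
    cache_misses: int = 0
    miss_seconds: float = 0.0  # total time spent fetching on cache misses

    def miss_ratio(self) -> float:
        total = self.cache_hits + self.cache_misses
        return self.cache_misses / total if total else 0.0


class MeteredCache:
    """Toy stand-in for an agg group cache; `fetch` plays the storage role."""

    def __init__(self, fetch, clock=time.perf_counter):
        self._fetch = fetch
        self._clock = clock
        self._cache = {}
        self.metrics = LookupMetrics()

    def get(self, key):
        if key in self._cache:
            self.metrics.cache_hits += 1
            return self._cache[key]
        # Cache miss: time the fetch so IO cost shows up separately from CPU.
        start = self._clock()
        value = self._fetch(key)
        self.metrics.miss_seconds += self._clock() - start
        self.metrics.cache_misses += 1
        self._cache[key] = value
        return value
```

If `miss_seconds` accounts for most of the wall-clock time per barrier while CPU stays low, that would point at lookup IO rather than compute.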

Or is it skew? In some scenarios CPU usage peaks at 1600%, but we have 32 cores, so only about half of the available capacity is being used.
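A quick way to check the skew hypothesis is to compare per-worker busy time. The sketch below assumes the 1600% figure uses the usual convention where 100% means one fully busy core, and that per-worker busy seconds can be collected somehow; both function names are made up for illustration.

```python
def cpu_utilization(observed_percent: float, cores: int) -> float:
    """Fraction of total CPU capacity in use, where 100% means one full core."""
    return observed_percent / (cores * 100.0)


def skew_ratio(busy_seconds: list[float]) -> float:
    """Max over mean of per-worker busy time; 1.0 means perfectly even load."""
    if not busy_seconds:
        raise ValueError("need at least one worker")
    mean = sum(busy_seconds) / len(busy_seconds)
    return max(busy_seconds) / mean if mean > 0 else 1.0


# 1600% on a 32-core machine is half of the total capacity.
print(cpu_utilization(1600, 32))  # 0.5
```

A skew ratio well above 1.0 would suggest a few workers own most of the dirty groups; a ratio near 1.0 with low utilization would point back at waiting on IO.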

Needs further investigation.

kwannoel commented 5 days ago

Some workloads to test:

  1. What happens when a large number of existing agg groups get updated?
  2. What happens when a large number of new agg groups are created?
  3. Does the behavior change with different cache configurations?
  4. Test the first_value agg.
  5. Make sure to use the MinIO rate limit configuration to simulate latency when fetching from AWS S3.
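For workloads 1 and 2, the input could be generated along these lines. This is only a sketch: the table name, the `(k, v)` column layout, and the idea of feeding a `GROUP BY k` aggregation are assumptions for illustration, not an existing benchmark.

```python
import random


def update_existing_groups(num_groups: int, num_events: int, seed: int = 0):
    """Workload 1: every event hits one of `num_groups` already existing keys."""
    rng = random.Random(seed)
    return [(rng.randrange(num_groups), 1) for _ in range(num_events)]


def create_new_groups(start_key: int, num_events: int):
    """Workload 2: every event carries a key that has not been seen before."""
    return [(start_key + i, 1) for i in range(num_events)]


def to_insert_sql(table: str, rows) -> str:
    """Render rows as one INSERT statement (hypothetical table with columns k, v)."""
    values = ", ".join(f"({k}, {v})" for k, v in rows)
    return f"INSERT INTO {table} (k, v) VALUES {values};"
```

For example, `to_insert_sql("t", create_new_groups(1_000_000, 3))` yields three rows with fresh keys, while `update_existing_groups(1_000_000, 3)` only touches keys below one million. Running both against the same aggregation with different cache sizes would cover workload 3 as well.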