risingwavelabs / risingwave


perf(streaming): non-blocking join matching #13776

Open BugenZhao opened 9 months ago

BugenZhao commented 9 months ago

...to further reduce the memory usage and also the latency, I guess we can even make hash_eq_match return a Stream instead of collecting the rows into an entry before we can yield them.

Originally posted by @BugenZhao in https://github.com/risingwavelabs/risingwave/issues/10979#issuecomment-1639319345
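
To make the idea concrete, here is a minimal sketch, not RisingWave's actual code. It uses a plain `Iterator` as a synchronous stand-in for an async `Stream`, and every name in it (`Row`, `build_index`, `matches_collected`, `matches_lazy`) is hypothetical. The eager variant materializes all matches for a key before the caller sees any of them; the lazy variant hands rows out one at a time, so the caller can forward them downstream without holding the whole match set.

```rust
use std::collections::HashMap;

/// Hypothetical row type for illustration: (join key, payload).
type Row = (u32, String);

/// Builds a toy in-memory hash index from join key to payloads.
fn build_index(rows: &[Row]) -> HashMap<u32, Vec<String>> {
    let mut index: HashMap<u32, Vec<String>> = HashMap::new();
    for (key, payload) in rows {
        index.entry(*key).or_default().push(payload.clone());
    }
    index
}

/// Eager variant: collects every matching row into a Vec before returning,
/// so memory grows with the number of matches for `key`.
fn matches_collected(index: &HashMap<u32, Vec<String>>, key: u32) -> Vec<Row> {
    index
        .get(&key)
        .into_iter()
        .flatten()
        .map(|payload| (key, payload.clone()))
        .collect()
}

/// Lazy variant: returns an iterator that produces matching rows on demand.
/// Nothing is cloned until the caller pulls the next row.
fn matches_lazy<'a>(
    index: &'a HashMap<u32, Vec<String>>,
    key: u32,
) -> impl Iterator<Item = Row> + 'a {
    index
        .get(&key)
        .into_iter()
        .flatten()
        .map(move |payload| (key, payload.clone()))
}
```

Both variants yield the same rows in the same order; the difference is only when the rows are produced. In a real executor the lazy version would presumably be an async `Stream`, which this sketch does not attempt to model.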

lmatz commented 7 months ago

This is supposed to prevent the OOM case where a single row from the dimension table produces a huge number of joined results, e.g. 1M+, with the fact table, isn't it? (It occurred to me when looking at today's user OOM feedback.)

BugenZhao commented 7 months ago

> This is supposed to prevent the OOM case where a single row from the dimension table produces a huge number of joined results, e.g. 1M+, with the fact table, isn't it? (It occurred to me when looking at today's user OOM feedback.)

I believe so. This works together with #10979.
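
For a rough picture of the skewed-key scenario described above, the following sketch joins one dimension row against a large set of matching fact rows and emits the output in bounded chunks. It is an illustration under assumptions, not the approach taken in RisingWave or in #10979: the function `join_in_chunks`, the `u64` row representation, and the `chunk_size` parameter are all hypothetical.

```rust
/// Joins one dimension row against its matching fact rows and passes the
/// joined rows to `sink` in chunks of at most `chunk_size` rows.
/// Returns the largest number of rows buffered at any moment.
fn join_in_chunks<I, F>(
    dim_row: u64,
    fact_matches: I,
    chunk_size: usize,
    mut sink: F,
) -> usize
where
    I: Iterator<Item = u64>,
    F: FnMut(Vec<(u64, u64)>),
{
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut buf = Vec::with_capacity(chunk_size);
    let mut peak: usize = 0;
    for fact_row in fact_matches {
        buf.push((dim_row, fact_row));
        peak = peak.max(buf.len());
        if buf.len() == chunk_size {
            // Hand the full chunk downstream and start a fresh buffer.
            sink(std::mem::replace(&mut buf, Vec::with_capacity(chunk_size)));
        }
    }
    // Flush whatever is left over after the last full chunk.
    if !buf.is_empty() {
        sink(buf);
    }
    peak
}
```

With 1M matches and a chunk size of 1024, the buffer never holds more than 1024 joined rows, whereas collecting all matches first would hold the full 1M at once.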
