risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0

Global memory statistician for small memory fragments #13780

Open wcy-fdu opened 1 year ago

wcy-fdu commented 1 year ago

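RisingWave holds some small amounts of memory while it runs, for example:

  • each actor channel buffers up to 2048 rows (the upper limit may later become size-based, e.g. 1MB)
  • stateful executors' mem tables occupy memory until their size exceeds 4MB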

Although each of these memory footprints is small, they vary dynamically with the workload. If the system has many materialized views or actors, their total grows accordingly, which increases the risk of OOM (we have witnessed such issues during longevity tests).

The current strategy is global memory management: we use jemalloc to monitor the memory usage of the Compute Node and, once it reaches a certain threshold, start evicting the LRU cache. However, we do not currently track these small allocations, because we have assumed they are small and quickly released. Since customers create many MVs in a cluster, we should count these small allocations and add mitigation strategies for when they add up.
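To make the idea of counting these small allocations concrete, here is a minimal sketch of a process-wide counter. It is not RisingWave's implementation: `SmallMemStats`, `TrackedBytes`, `track`, and `exceeds` are hypothetical names, and the sketch assumes each component reports its own byte estimate rather than hooking into jemalloc.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical process-wide counter for small allocations that the
/// LRU-based memory manager does not account for.
#[derive(Debug, Default)]
pub struct SmallMemStats {
    bytes: AtomicUsize,
}

impl SmallMemStats {
    pub fn new() -> Arc<Self> {
        Arc::new(Self::default())
    }

    /// Records `bytes` as in use. The returned guard subtracts them
    /// again when it is dropped, so callers cannot forget to release.
    pub fn track(self: &Arc<Self>, bytes: usize) -> TrackedBytes {
        self.bytes.fetch_add(bytes, Ordering::Relaxed);
        TrackedBytes {
            stats: Arc::clone(self),
            bytes,
        }
    }

    /// Total bytes currently reported by all live guards.
    pub fn total(&self) -> usize {
        self.bytes.load(Ordering::Relaxed)
    }

    /// True when the tracked total is above `threshold_bytes`
    /// (an assumed trigger point for some mitigation).
    pub fn exceeds(&self, threshold_bytes: usize) -> bool {
        self.total() > threshold_bytes
    }
}

/// RAII guard representing one tracked allocation.
pub struct TrackedBytes {
    stats: Arc<SmallMemStats>,
    bytes: usize,
}

impl Drop for TrackedBytes {
    fn drop(&mut self) {
        self.stats.bytes.fetch_sub(self.bytes, Ordering::Relaxed);
    }
}
```

An actor or executor would hold a `TrackedBytes` guard for as long as its buffer lives, and the memory manager could poll `total()` alongside the jemalloc figure. Relaxed atomics are enough here because the counter is only a statistic and does not guard any other data.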

possible methods:

hzxa21 commented 1 year ago

RisingWave holds some small amounts of memory while it runs, for example:

  • each actor channel buffers up to 2048 rows (the upper limit may later become size-based, e.g. 1MB)
  • stateful executors' mem tables occupy memory until their size exceeds 4MB
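
A rough worst-case estimate shows how quickly these two items add up. The sketch below assumes one buffered channel per actor, one mem table per stateful executor, and a caller-supplied average row size; none of these assumptions comes from the thread, and the function name is hypothetical.

```rust
/// Figures quoted in this thread.
const CHANNEL_BUFFER_ROWS: u64 = 2048;
const MEM_TABLE_LIMIT_BYTES: u64 = 4 * 1024 * 1024;

/// Upper bound, in bytes, on memory held by full actor channels and
/// full mem tables. Assumes one channel per actor and one mem table
/// per stateful executor; `avg_row_bytes` is an assumed row size.
pub fn worst_case_small_memory(
    actors: u64,
    stateful_executors: u64,
    avg_row_bytes: u64,
) -> u64 {
    let channels = actors * CHANNEL_BUFFER_ROWS * avg_row_bytes;
    let mem_tables = stateful_executors * MEM_TABLE_LIMIT_BYTES;
    channels + mem_tables
}
```

For example, 1,000 actors, 500 stateful executors, and 100-byte rows give 2,301,952,000 bytes, about 2.3 GB, none of which the LRU eviction path can reclaim.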

Two more examples worth mentioning:

fuyufjh commented 11 months ago

Any specific tasks?

github-actions[bot] commented 5 months ago

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.