metrics: per-thread (runtime) CPU usage

risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.

Apache License 2.0


Nodes are multi-functional in our current architecture. For example, the frontend node handles SQL statements from users and also runs simple batch query stages. The compute node is responsible for long-running streaming jobs and also handles batch queries in distributed mode. In some monolithic deployments, the compute node also compacts storage data.

To analyze these workloads separately, we need to include per-thread CPU usage in the metrics. Fortunately, we've already refactored the code to use a dedicated async runtime for each "functionality" and to distinguish them by thread name (like risingwave-main for RPC, risingwave-streaming-actor for streaming jobs). So recording the thread names and their CPU usage should be enough.
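As a rough illustration of what "recording the thread names and their CPU usage" could look like, here is a minimal sketch that assumes a Linux host with procfs. It is not the approach chosen in RisingWave; the names `ThreadCpu`, `parse_stat_line`, `read_thread_cpu`, and `total_ticks_by_name` are hypothetical. It reads `/proc/self/task/<tid>/stat` for every thread of the current process and sums user and system time per thread name. Note that Linux truncates thread names to 15 bytes, so a name like risingwave-streaming-actor would show up shortened.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;

/// CPU time consumed by one thread, in clock ticks (see `sysconf(_SC_CLK_TCK)`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ThreadCpu {
    pub name: String,
    pub user_ticks: u64,
    pub system_ticks: u64,
}

/// Parses the contents of `/proc/<pid>/task/<tid>/stat` (hypothetical helper).
pub fn parse_stat_line(line: &str) -> Option<ThreadCpu> {
    // The thread name (field 2) is wrapped in parentheses and may itself
    // contain spaces or parentheses, so use the first '(' and the last ')'.
    let open = line.find('(')?;
    let close = line.rfind(')')?;
    if close < open {
        return None;
    }
    let name = line[open + 1..close].to_string();

    // The remaining fields start at field 3 (state), so utime (field 14)
    // is at index 11 and stime (field 15) is at index 12.
    let rest: Vec<&str> = line[close + 1..].split_whitespace().collect();
    let user_ticks = rest.get(11)?.parse().ok()?;
    let system_ticks = rest.get(12)?.parse().ok()?;

    Some(ThreadCpu { name, user_ticks, system_ticks })
}

/// Collects CPU time for every thread of the current process (Linux only).
pub fn read_thread_cpu() -> io::Result<Vec<ThreadCpu>> {
    let mut out = Vec::new();
    for entry in fs::read_dir("/proc/self/task")? {
        let stat_path = entry?.path().join("stat");
        // A thread may exit between listing and reading; skip it if so.
        if let Ok(line) = fs::read_to_string(&stat_path) {
            if let Some(cpu) = parse_stat_line(&line) {
                out.push(cpu);
            }
        }
    }
    Ok(out)
}

/// Sums user + system ticks over all threads that share a name, so that
/// every worker thread of one runtime is reported under a single label.
pub fn total_ticks_by_name(threads: &[ThreadCpu]) -> HashMap<String, u64> {
    let mut totals = HashMap::new();
    for t in threads {
        *totals.entry(t.name.clone()).or_insert(0) += t.user_ticks + t.system_ticks;
    }
    totals
}
```

A metrics collector could call `read_thread_cpu()` periodically, pass the result to `total_ticks_by_name`, and export each entry as a counter labeled with the thread name. Grouping by name is a design choice here: it reports one series per runtime rather than one per worker thread.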


metrics: per-thread (runtime) CPU usage #10966