risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0

Tracking: Monitoring, logging and tooling improvements #8018

Open hzxa21 opened 1 year ago

hzxa21 commented 1 year ago

As we have more and more users, we should start taking monitoring, logging, and tooling seriously to make our lives easier when debugging, especially when debugging live production issues. We can use this issue to track ideas on how to improve our logging and tooling in the kernel (i.e. risectl).

Monitoring:

Logging:

Tooling:

jon-chuang commented 1 year ago

Monitor operation conflict / sanity check failure rates in stateful operator

What is this? Is it write conflict?

hzxa21 commented 1 year ago

Monitor operation conflict / sanity check failure rates in stateful operator

What is this? Is it write conflict?

I was mainly referring to the memtable operation conflicts: https://github.com/risingwavelabs/risingwave/blob/d8198fa138003e1f1431053f4f5f09e4a5fa8fd8/src/storage/src/mem_table.rs#L96.

A write conflict is also a good metric, but it is not necessarily critical if it is caused by DML.
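
To make the idea of a memtable operation conflict rate concrete, here is a minimal sketch. It is not the code in `mem_table.rs`: `CountingMemTable`, `KeyOp`, and every method name are hypothetical, values are plain strings, and the consistency rules are simplified assumptions (inserting over a buffered insert, deleting with a mismatched old value, and deleting an already-deleted key all count as conflicts).

```rust
use std::collections::BTreeMap;

/// Buffered operation for one key (hypothetical, simplified to String values).
#[derive(Debug, Clone, PartialEq)]
pub enum KeyOp {
    Insert(String),
    Delete(String),
}

/// Toy memtable that counts every operation and every sanity-check failure.
#[derive(Default)]
pub struct CountingMemTable {
    buffer: BTreeMap<String, KeyOp>,
    total_ops: u64,
    conflicts: u64,
}

impl CountingMemTable {
    /// Buffers an insert. Returns false (and counts a conflict) if an
    /// insert for the same key is already buffered.
    pub fn insert(&mut self, key: &str, value: &str) -> bool {
        self.total_ops += 1;
        if let Some(KeyOp::Insert(_)) = self.buffer.get(key) {
            self.conflicts += 1;
            return false;
        }
        self.buffer
            .insert(key.to_string(), KeyOp::Insert(value.to_string()));
        true
    }

    /// Buffers a delete. Returns false (and counts a conflict) if the
    /// buffered insert holds a different value or the key is already deleted.
    pub fn delete(&mut self, key: &str, old_value: &str) -> bool {
        self.total_ops += 1;
        let conflict = match self.buffer.get(key) {
            None => false,
            Some(KeyOp::Insert(v)) => v.as_str() != old_value,
            Some(KeyOp::Delete(_)) => true,
        };
        if conflict {
            self.conflicts += 1;
            return false;
        }
        // A matching buffered insert is cancelled out; otherwise record the delete.
        if self.buffer.remove(key).is_none() {
            self.buffer
                .insert(key.to_string(), KeyOp::Delete(old_value.to_string()));
        }
        true
    }

    /// Number of operations that failed the sanity check.
    pub fn conflicts(&self) -> u64 {
        self.conflicts
    }

    /// Fraction of operations that failed the sanity check.
    pub fn conflict_rate(&self) -> f64 {
        if self.total_ops == 0 {
            0.0
        } else {
            self.conflicts as f64 / self.total_ops as f64
        }
    }
}
```

In a real deployment the two counters would presumably be exported as metrics and the rate computed by the monitoring system rather than in process; the in-process `conflict_rate` here only keeps the sketch self-contained.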