redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

Implementing Persistent Topic Disk Usage Metric #18622

Open dhia-gharsallaoui opened 1 month ago

dhia-gharsallaoui commented 1 month ago

Who is this for and what problem do they have today?

Redpanda cluster administrators have expressed a need to monitor disk usage per topic within Redpanda clusters. The metrics currently available (data_bus_bytes_out, redpanda_storage_disk_total_bytes, and redpanda_storage_disk_free_bytes) do not meet this need: they either reset to zero on system restart or are not broken down by topic. Administrators need a reliable, persistent way to track disk usage so they can manage resources efficiently and avoid system overloads or downtime.

What are the success criteria?

The desired outcome is to have a persistent, topic-specific disk usage metric that can be monitored over time and integrated with Prometheus for continuous tracking. This metric should not reset or lose accuracy upon system restarts.
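
To make the request concrete, here is a minimal sketch of what such a metric could look like. It is not Redpanda code. It assumes a data directory laid out as one subdirectory per topic under a namespace directory (for example a path like /var/lib/redpanda/data/kafka, which is an assumption here), and the metric name topic_disk_usage_bytes and both function names are hypothetical. Because the value is recomputed from the files on disk at each scrape, it is a gauge that does not reset when the process restarts.

```python
import os


def topic_disk_usage(namespace_dir):
    """Return {topic: bytes on disk}, assuming one subdirectory per topic."""
    usage = {}
    for topic in sorted(os.listdir(namespace_dir)):
        topic_path = os.path.join(namespace_dir, topic)
        if not os.path.isdir(topic_path):
            continue  # ignore stray files next to the topic directories
        total = 0
        # Walk every partition directory and add up all file sizes.
        for root, _dirs, files in os.walk(topic_path):
            for name in files:
                try:
                    total += os.path.getsize(os.path.join(root, name))
                except OSError:
                    # A file may be deleted (e.g. by retention) mid-scan.
                    pass
        usage[topic] = total
    return usage


def to_prometheus(usage, metric="topic_disk_usage_bytes"):
    """Render the usage map in the Prometheus text exposition format."""
    lines = [
        f"# HELP {metric} Bytes on disk per topic (hypothetical metric).",
        f"# TYPE {metric} gauge",
    ]
    for topic, size in sorted(usage.items()):
        lines.append(f'{metric}{{topic="{topic}"}} {size}')
    return "\n".join(lines) + "\n"
```

Calling to_prometheus(topic_disk_usage(path)) on each scrape would yield one labeled sample per topic, which Prometheus can then store and graph over time. A native implementation inside the broker would avoid the cost of walking the filesystem.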

Why is solving this problem impactful?

Implementing this feature would greatly improve resource management for Redpanda administrators by providing crucial insights into data storage and flow, enabling better capacity planning and operational stability. This is particularly important in large-scale deployments where efficient resource allocation can significantly impact system performance and cost-effectiveness. For Redpanda, enhancing these capabilities can improve user satisfaction, reduce the risk of system failures due to overloads, and solidify its reputation as a robust and scalable data-streaming solution.

JIRA Link: CORE-3064

mattschumpert commented 4 weeks ago

@deniscoady for tracking