linkedin / cruise-control

Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
https://github.com/linkedin/cruise-control/tags
BSD 2-Clause "Simplified" License

Partition load result shows discrepancy with the actual kafka partition #2123

Closed: CCisGG closed this issue 4 months ago

CCisGG commented 4 months ago

In one of our largest clusters, only ~20% of the partitions are listed in the partition load result. The issue was very hard to debug with the current logs and a heap dump, and it disappeared after restarting the Cruise Control instance.

In order to better debug it next time, we need to add more useful logs.
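
As a rough illustration of the kind of log that might help here, the sketch below compares the set of partitions known to the cluster with the set present in a partition load result, and logs a warning when coverage drops below a threshold. Everything in it is an assumption for illustration: `PartitionCoverageCheck`, its method names, the `minCoverage` threshold, and the use of plain `topic-partition` strings are hypothetical and are not part of Cruise Control's actual code or API.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.logging.Logger;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Cruise Control.
final class PartitionCoverageCheck {
    private static final Logger LOG =
        Logger.getLogger(PartitionCoverageCheck.class.getName());

    private PartitionCoverageCheck() { }

    /** Returns the partitions known to the cluster but absent from the load result, sorted. */
    static Set<String> missingPartitions(Set<String> clusterPartitions,
                                         Set<String> loadPartitions) {
        Set<String> missing = new TreeSet<>(clusterPartitions);
        missing.removeAll(loadPartitions);
        return missing;
    }

    /** Returns the fraction of cluster partitions present in the load result (1.0 for an empty cluster). */
    static double coverage(Set<String> clusterPartitions, Set<String> loadPartitions) {
        if (clusterPartitions.isEmpty()) {
            return 1.0;
        }
        int missing = missingPartitions(clusterPartitions, loadPartitions).size();
        return (clusterPartitions.size() - missing) / (double) clusterPartitions.size();
    }

    /** Logs a warning with a small sample of missing partitions; returns false if coverage is below minCoverage. */
    static boolean checkAndLog(Set<String> clusterPartitions,
                               Set<String> loadPartitions,
                               double minCoverage) {
        double ratio = coverage(clusterPartitions, loadPartitions);
        if (ratio >= minCoverage) {
            return true;
        }
        Set<String> missing = missingPartitions(clusterPartitions, loadPartitions);
        LOG.warning(String.format(
            "Partition load covers %.1f%% of %d partitions; %d missing, sample: %s",
            ratio * 100, clusterPartitions.size(), missing.size(),
            missing.stream().limit(5).collect(Collectors.toList())));
        return false;
    }
}
```

Calling something like `PartitionCoverageCheck.checkAndLog(clusterPartitions, loadPartitions, 0.95)` each time a partition load result is produced would at least record when the discrepancy starts and which partitions are affected, which is the information that was missing when this issue occurred.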