apache / helix

Mirror of Apache Helix
Apache License 2.0

Enable logging detailed and specific error message for WAGED Rebalance failures #2828

Closed: himanshukandwal closed 2 weeks ago

himanshukandwal commented 3 weeks ago

Is your feature request related to a problem? Please describe. Currently, when the WAGED rebalancer fails because a hard constraint cannot be satisfied (for example, insufficient capacity), the error log does not say which capacity key fell short or why. Those details are logged only at DEBUG level, so we have to turn on Helix DEBUG logging to gain insight and then turn it off again after the triage.

This is a lot of human toil, and crucial debugging information can be missed. The suggestion is that the error log should carry a more informative message.

Describe the solution you'd like Ideally, for capacity-related failures, each error log line should include the capacity key name, the available capacity, and the required capacity.
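To make the request concrete, here is a rough sketch of the kind of message I have in mind. It is not Helix code: the class `CapacityFailureMessages`, the method `describeShortfalls`, its parameters, and the message wording are all hypothetical, and it assumes capacities can be represented as integer maps keyed by capacity key name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the Helix API. */
final class CapacityFailureMessages {
  private CapacityFailureMessages() {}

  /**
   * Returns one message per capacity key whose required amount exceeds
   * the amount available on the candidate instance.
   * A key that is missing from the available map is treated as capacity 0
   * (an assumption made for this sketch).
   */
  static List<String> describeShortfalls(String replica, String instance,
      Map<String, Integer> available, Map<String, Integer> required) {
    List<String> messages = new ArrayList<>();
    for (Map.Entry<String, Integer> need : required.entrySet()) {
      String key = need.getKey();
      int have = available.getOrDefault(key, 0);
      int want = need.getValue();
      if (want > have) {
        // Include exactly the three fields requested in this issue.
        messages.add("Cannot place " + replica + " on " + instance
            + ": insufficient capacity for key=" + key
            + ", availableCapacity=" + have
            + ", requiredCapacity=" + want);
      }
    }
    return messages;
  }
}
```

For example, with available capacity `{CPU=10, DISK=50}` and required capacity `{CPU=20, DISK=30}`, the sketch returns a single message that names `CPU` with `availableCapacity=10` and `requiredCapacity=20`. Emitting something like this at ERROR level would remove the need to toggle DEBUG logging during triage.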

Additional context N/A