linkedin / cruise-control

Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
https://github.com/linkedin/cruise-control/tags
BSD 2-Clause "Simplified" License

Add request rate and message rate to the linear regression model to predict CPU #3

Open becketqin opened 7 years ago

becketqin commented 7 years ago

Currently we use only bytes in and bytes out as parameters to predict CPU utilization in our linear regression model, which may not work very well. We should also include request rate and message rate in the model.
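As a rough illustration of the proposal, here is a minimal sketch of an ordinary least squares fit that predicts CPU from four features instead of two. It is not Cruise Control's implementation: the function names (`fit_cpu_model`, `predict_cpu`, `solve_linear_system`), the feature units, and every number in the sample data are hypothetical and exist only to show the shape of the model.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_cpu_model(samples: Sequence[Sequence[float]],
                  cpu: Sequence[float]) -> List[float]:
    """Least squares via the normal equations (X^T X) w = X^T y.

    Each sample is (bytes_in, bytes_out, request_rate, message_rate).
    Returns [intercept, w_bytes_in, w_bytes_out, w_request, w_message].
    """
    rows = [[1.0, *s] for s in samples]  # leading 1.0 is the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, cpu)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_cpu(weights: Sequence[float], sample: Sequence[float]) -> float:
    """Apply the fitted linear model to one sample."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], sample))


# Made-up samples: (bytes_in, bytes_out, request_rate, message_rate),
# in arbitrary units. The CPU values are generated from made-up weights,
# so the fit should approximately recover those weights.
assumed_weights = [2.0, 0.1, 0.05, 3.0, 0.01]
samples = [
    (10, 20, 1, 50), (20, 10, 5, 40), (30, 60, 2, 300), (5, 5, 8, 20),
    (40, 30, 3, 100), (15, 45, 6, 200), (25, 15, 4, 10),
]
cpu = [predict_cpu(assumed_weights, s) for s in samples]
fitted = fit_cpu_model(samples, cpu)
print([round(w, 4) for w in fitted])
```

With real broker metrics, the relative size of the fitted weights would indicate whether request rate or message rate explains CPU better than byte rates alone; the synthetic data here cannot answer that question.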

igorcalabria commented 6 years ago

In our experience, request rate (especially produce requests) is a great metric to correlate with CPU usage. This is important because if you have any jobs (MapReduce, Spark, etc.) that write to Kafka, you'll have a massive number of messages and a small number of requests because of batching. This kind of workload barely impacts our brokers' CPU.

On the other hand, web applications that don't have the same ability to batch messages cause a huge impact on performance with fewer messages.
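To make the batching effect concrete, here is a small arithmetic sketch. The helper name `produce_request_rate` and all the numbers are hypothetical; they are chosen only to show how a workload with far more messages can still generate far fewer produce requests.

```python
def produce_request_rate(message_rate: float, messages_per_request: float) -> float:
    """Produce requests per second needed to carry `message_rate` messages
    per second when each request carries `messages_per_request` on average."""
    if messages_per_request <= 0:
        raise ValueError("messages_per_request must be positive")
    return message_rate / messages_per_request


# Hypothetical batch job: many messages, large batches.
batch_job = produce_request_rate(message_rate=100_000, messages_per_request=1_000)

# Hypothetical web application: fewer messages, effectively no batching.
web_app = produce_request_rate(message_rate=5_000, messages_per_request=1)

print(batch_job)  # 100.0 requests/s
print(web_app)    # 5000.0 requests/s
```

In this made-up case the web application sends 20 times fewer messages but 50 times more requests, which is why a model that sees only byte or message rates could misjudge its CPU cost.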

becketqin commented 6 years ago

@igorcalabria Thanks for the information. That is also what we witnessed. I am currently working on the experiment. Actually, PartitionMetricSamples and BrokerMetricSamples already contain the RequestRate. We just need to put it into the cluster model so it can be used during the optimization.