linkedin / cruise-control

Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
https://github.com/linkedin/cruise-control/tags
BSD 2-Clause "Simplified" License
2.74k stars 587 forks source link

Can't retrieve partition load when broker is down #757

Closed henhiskan closed 4 years ago

henhiskan commented 5 years ago

We have a broker that went down 3 days ago, and we are not bringing it up again. CC still complains about the missing broker, and the partition load request doesn't work anymore.

Using CC 2.0.46

kafkacruisecontrol/partition_load?json=true&resource=CPU

{"errorMessage":"Error processing GET request \u0027/partition_load\u0027 due to: \u0027com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.IllegalArgumentException: Broker 7 does not exist.\u0027.","stackTrace":"java.util.concurrent.ExecutionException: com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.IllegalArgumentException: Broker 7 does not exist.\n\tat java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)\n\tat java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)\n\tat com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.getAndMaybeReturnProgress(KafkaCruiseControlServlet.java:520)\n\tat com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.asyncRequest(KafkaCruiseControlServlet.java:474)\n\tat com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.getPartitionLoad(KafkaCruiseControlServlet.java:392)\n\tat com.linkedin.kafka.cruisecontrol.servlet.KafkaCruiseControlServlet.doGet(KafkaCruiseControlServlet.java:159)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:687)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:873)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:542)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1701)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1668)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:502)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.IllegalArgumentException: Broker 7 does not exist.\n\tat com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.clusterModel(KafkaCruiseControl.java:650)\n\tat com.linkedin.kafka.cruisecontrol.async.GetClusterModelInRangeRunnable.getResult(GetClusterModelInRangeRunnable.java:32)\n\tat com.linkedin.kafka.cruisecontrol.async.GetClusterModelInRangeRunnable.getResult(GetClusterModelInRangeRunnable.java:17)\n\tat com.linkedin.kafka.cruisecontrol.async.OperationRunnable.run(OperationRunnable.java:45)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\t... 1 more\nCaused by: java.lang.IllegalArgumentException: Broker 7 does not exist.\n\tat com.linkedin.kafka.cruisecontrol.model.ClusterModel.setBrokerState(ClusterModel.java:253)\n\tat com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.lambda$clusterModel$1(LoadMonitor.java:536)\n\tat java.lang.Iterable.forEach(Iterable.java:75)\n\tat com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.clusterModel(LoadMonitor.java:536)\n\tat com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.clusterModel(LoadMonitor.java:459)\n\tat com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.clusterModel(KafkaCruiseControl.java:644)\n\t... 8 more\n","version":1}
efeg commented 4 years ago

Verified (1) while a broker is dead, (2) during an ongoing rebalance, and (3) in the relevant code path that the issue does not exist in the latest version (i.e 2.0.79).

@henhiskan Please reopen the issue if it reoccurs.

imrohitnk commented 2 years ago

Facing the same issue in Version: 2.4.50

_ERROR: Error processing GET request '/partitionload' due to: 'com.linkedin.kafka.cruisecontrol.exception.KafkaCruiseControlException: java.lang.IllegalArgumentException: Broker 1083 does not exist.'.