Executor gets timeout exception during rebalance

chrisbeard commented 4 years ago

Configs

Kafka: v2.5
Cruise-control tag: 2.5.0
The cluster has thousands of partitions.

Issue

During a rebalance with ~50 inter-broker reassignments in progress, a few reassignments succeed, but after ten seconds the executor runs into the exception below. This has happened every time I've tried a rebalance.

java.util.concurrent.TimeoutException
        at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:108)
        at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:272)
        at com.linkedin.kafka.cruisecontrol.executor.ExecutionUtils.ongoingPartitionReassignments(ExecutionUtils.java:54)
        at com.linkedin.kafka.cruisecontrol.executor.ExecutionUtils.submitReplicaReassignmentTasks(ExecutionUtils.java:78)
        at com.linkedin.kafka.cruisecontrol.executor.Executor$ProposalExecutionRunnable.interBrokerMoveReplicas(Executor.java:1010)
        at com.linkedin.kafka.cruisecontrol.executor.Executor$ProposalExecutionRunnable.execute(Executor.java:829)
        at com.linkedin.kafka.cruisecontrol.executor.Executor$ProposalExecutionRunnable.run(Executor.java:771)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Workaround

Raising this timeout from 10s to 60s let my test rebalances complete. I'm not sure why 10s was insufficient, though, given that the number of in-flight reassignments was fairly small.

Keydrain commented 4 years ago

Encountered similar timeouts.

Kafka: v2.4
Cruise Control: 2.4.9
Tens of thousands of partitions

Bumping that timeout up to 60s seems to work most of the time for my cluster. Is it reasonable to assume that the number of in-flight reassignments directly affects how long it takes to pull them from the controller?

efeg commented 4 years ago

@chrisbeard thanks for submitting the issue and @Keydrain thanks for sharing a use-case. There is a pending todo item to make this timeout configurable.

In addition to that, its default value should be bumped up to a more reasonable value, and the call should include retry logic in case of a timeout.
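For illustration only, here is a minimal sketch of what a configurable timeout with retry could look like. It is not Cruise Control's actual code: the class name `RetryingFutureGet`, the method `getWithRetry`, and the `timeoutMs` and `maxAttempts` parameters are all hypothetical. It uses only the JDK's `Future` interface, which `KafkaFuture` also implements, so the same pattern should apply to the blocking `get` call seen in the stack trace above.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class RetryingFutureGet {

    /**
     * Hypothetical helper: issues a request, waits up to timeoutMs for its result,
     * and re-issues the request if the wait times out.
     *
     * @param request     creates a fresh Future on every call (one per attempt)
     * @param timeoutMs   per-attempt timeout; assumed to come from a config value
     * @param maxAttempts total number of attempts, must be at least 1
     */
    public static <T> T getWithRetry(Supplier<? extends Future<T>> request,
                                     long timeoutMs,
                                     int maxAttempts)
            throws TimeoutException, ExecutionException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TimeoutException lastTimeout = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Future<T> future = request.get();
            try {
                return future.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Give up on this attempt and try again with a fresh request.
                future.cancel(true);
                lastTimeout = e;
            }
        }
        // Every attempt timed out: surface the last timeout to the caller.
        throw lastTimeout;
    }
}
```

Only timeouts are retried here; an `ExecutionException` or `InterruptedException` propagates immediately, on the assumption that those signal something other than a slow response. A real fix might also want a backoff between attempts.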
