Closed chowell5 closed 6 years ago
I assume you have auto.create.topics.enable
set to false
? In this case, it is the right thing to create the topic manually. The command also looks correct.
However, the exception seems to be thrown during the topic creation from Kafka (the exact line is here).
Do Kafka logs show Exception in thread "main" java.lang.IllegalStateException: Cruise Control cannot find sampling topic matches __CruiseControlMetrics in the target cluster.
? Is this a transient error that happened just once?
Hi,
Thanks for your reply, we do indeed set auto.create.topics.enable=false
in our broker config.
I cannot see anything in our broker logs related to this. The error Exception in thread "main" org.apache.kafka.common.errors.InvalidReplicationFactorException: replication factor must be larger than 0
was generated by Cruise Control on startup on stderr.
It's a persistent error and happens every time I try to start Cruise Control, on all of our clusters. I repeated the steps several times; I tried deleting the topic and the error went back to Exception in thread "main" java.lang.IllegalStateException: Cruise Control cannot find sampling topic matches __CruiseControlMetrics in the target cluster.
After I re-created it, it went back to Exception in thread "main" org.apache.kafka.common.errors.InvalidReplicationFactorException: replication factor must be larger than 0
I do agree from your link that it looks like it it should only happen during topic creation, but as you can see from the output of kafka-topics.sh
, Kafka had already created the topic by this point. Complete set of logs:
chris@xxx:~$ sudo docker logs cf71348ee9c4 2>&1 | tail -30
[2018-03-08 13:28:19,828] WARN The configuration 'num.metric.fetchers' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig)
[2018-03-08 13:28:19,828] WARN The configuration 'sample.store.class' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig)
[2018-03-08 13:28:19,828] WARN The configuration 'default.goals' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig)
[2018-03-08 13:28:19,828] WARN The configuration 'num.load.snapshots' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig)
[2018-03-08 13:28:19,828] WARN The configuration 'min.samples.per.load.snapshot' was supplied but isn't a known config. (org.apache.kafka.clients.producer.ProducerConfig)
[2018-03-08 13:28:19,828] INFO Kafka version : 0.10.2.1 (org.apache.kafka.common.utils.AppInfoParser)
[2018-03-08 13:28:19,828] INFO Kafka commitId : e89bffd6b2eff799 (org.apache.kafka.common.utils.AppInfoParser)
[2018-03-08 13:28:20,043] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2018-03-08 13:28:20,049] INFO Client environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,049] INFO Client environment:host.name=xxx (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,049] INFO Client environment:java.version=1.8.0_151 (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:java.home=/usr/lib/jvm/oracle-java8-jdk-amd64/jre (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:java.class.path=/cruise-control/cruise-control/build/dependant-libs/commons-math3-3.6.1.jar:/cruise-control/cruise-control/build/dependant-libs/cruise-control-core.jar:/cruise-control/cruise-control/build/dependant-libs/cruise-control-metrics-reporter.jar:/cruise-control/cruise-control/build/dependant-libs/gson-2.7.jar:/cruise-control/cruise-control/build/dependant-libs/hamcrest-core-1.3.jar:/cruise-control/cruise-control/build/dependant-libs/javax.servlet-api-3.1.0.jar:/cruise-control/cruise-control/build/dependant-libs/jetty-http-9.4.6.v20170531.jar:/cruise-control/cruise-control/build/dependant-libs/jetty-io-9.4.6.v20170531.jar:/cruise-control/cruise-control/build/dependant-libs/jetty-security-9.4.6.v20170531.jar:/cruise-control/cruise-control/build/dependant-libs/jetty-server-9.4.6.v20170531.jar:/cruise-control/cruise-control/build/dependant-libs/jetty-servlet-9.4.6.v20170531.jar:/cruise-control/cruise-control/build/dependant-libs/jetty-util-9.4.6.v20170531.jar:/cruise-control/cruise-control/build/dependant-libs/jline-0.9.94.jar:/cruise-control/cruise-control/build/dependant-libs/jopt-simple-5.0.3.jar:/cruise-control/cruise-control/build/dependant-libs/junit-4.12.jar:/cruise-control/cruise-control/build/dependant-libs/kafka-clients-0.10.2.1.jar:/cruise-control/cruise-control/build/dependant-libs/kafka_2.11-0.10.2.1.jar:/cruise-control/cruise-control/build/dependant-libs/log4j-1.2.17.jar:/cruise-control/cruise-control/build/dependant-libs/lz4-1.3.0.jar:/cruise-control/cruise-control/build/dependant-libs/metrics-core-2.2.0.jar:/cruise-control/cruise-control/build/dependant-libs/metrics-core-3.2.2.jar:/cruise-control/cruise-control/build/dependant-libs/netty-3.10.5.Final.jar:/cruise-control/cruise-control/build/dependant-libs/scala-library-2.11.11.jar:/cruise-control/cruise-control/build/dependant-libs/scala-parser-combinators_2.11-1.0.4.jar:/cruise-control/cruise-control/build/dependant-libs/slf4j-api-1.7.25.jar:/cruise-control/cruise-control/build/dependant-libs/slf4j-log4j12-1.7.21.jar:/cruise-control/cruise-control/build/dependant-libs/snappy-java-1.1.2.6.jar:/cruise-control/cruise-control/build/dependant-libs/zkclient-0.10.jar:/cruise-control/cruise-control/build/dependant-libs/zookeeper-3.4.9.jar:/cruise-control/cruise-control/build/libs/cruise-control.jar:/cruise-control/cruise-control-metrics-reporter/build/libs/cruise-control-metrics-reporter.jar (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:os.version=4.9.xxx (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:user.name=nobody (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:user.home=/nonexistent (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,050] INFO Client environment:user.dir=/cruise-control (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,051] INFO Initiating client connection, connectString=<xxx> sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@609db43b (org.apache.zookeeper.ZooKeeper)
[2018-03-08 13:28:20,070] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-03-08 13:28:20,084] INFO Opening socket connection to server x.x.x.x/x.x.x.x:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-03-08 13:28:20,085] INFO Socket connection established to x.x.x.x/x.x.x.x:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-03-08 13:28:20,093] INFO Session establishment complete on server x.x.x.x/x.x.x.x:2181, sessionid = 0x46201fe7b840f3e, negotiated timeout = 30000 (org.apache.zookeeper.ClientCnxn)
[2018-03-08 13:28:20,095] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
Exception in thread "main" org.apache.kafka.common.errors.InvalidReplicationFactorException: replication factor must be larger than 0
Cheers
Thanks for the clarification! -- so this error is generated by Cruise Control (CC) on startup, after the manual creation of __CruiseControlMetrics
topic.
This indicates that the error indeed happens when CC attempts to create Sample Store topics (default: __KafkaCruiseControlPartitionMetricSamples
and __KafkaCruiseControlModelTrainingSamples
defined in cruisecontrol.properties
). Note that in this case having auto.create.topics.enable=false
does not really have any adverse effects on explicit topic creation.
The issue here seems to be the non-positive replication factor interpretation by CC (the relevant line) due to getting a potentially empty sequence of brokers from the Kafka cluster (the relevant line).
/brokers/ids
) exists?Hi, thanks for your reply - that's correct. Thanks for the links to the code, that'll help me, I was struggling to figure out where the error was coming from.
We're running the binary distribution of kafka version 1.0.0 scala 2.12 from https://dist.apache.org/repos/dist/release etc.
I'm not certain exactly what you want to me to check, but it's very much a production cluster and we don't experience issues elsewhere.
I think zk data is OK, but it's at a different path to the one you mentioned
/usr/share/zookeeper/bin/zkCli.sh -server xxx:2181 ls /kafka/logs/brokers/ids
Connecting to xxx:2181
WATCHER::
WatchedEvent state:SyncConnected type:None path:null [10129, 10227, 10128, 10130, 10100, 10114, 10113, 10102, 10115]
Thank you
After speaking to my colleagues (thanks!) we traced this down to a misconfiguration; we're using the feature of kafka that allows to specify a "choot" in the ZK tree: https://kafka.apache.org/08/documentation.html
- zookeeper.connect:
Zookeeper also allows you to add a "chroot" path which will make all
kafka data for this cluster appear under a particular path. This is a
way to setup multiple Kafka clusters or other applications on the same
zookeeper cluster. To do this give a connection string in the form
hostname1:port1,hostname2:port2,hostname3:port3/chroot/path which
would put all this cluster's data under the path /chroot/path. Note
that you must create this path yourself prior to starting the broker
and consumers must use the same connection string.
After appending /kafka/logs
to the zookeeper.connect
string CC correctly starts up. I believe this should remain a bug though as the error message is very misleading, it should report that the broker data couldn't be fetched from Zookeepr, rather than something about replication factors.
Thanks for your help in debugging this.
The error message is more clear after PR https://github.com/linkedin/cruise-control/pull/156 and https://github.com/linkedin/cruise-control/pull/169.
Hi,
I'm trying to get Cruise Control up and running, first of all I got
Well that's totally expected, I didn't create the topic. So I created it.
Now I receive
This is the only message I receive, no backtrace.
Config looks fine to me, definitely not 0.