yahoo / CMAK

CMAK is a tool for managing Apache Kafka clusters
Apache License 2.0

Issue with JMX Polling and Preferred Replica Election #633

Open Ahuri3 opened 5 years ago

Ahuri3 commented 5 years ago

Hello!

I am using :

When I try to run a Preferred Replica Election, I get this error: "Yikes! Preferred replica election data is empty. Try again."

When I check the logs, I see:

info kafka_manager[29511]: [info] k.m.a.c.KafkaCommandActor - Running replica leader election : Set()
info kafka_manager[29511]: kafka.manager.utils.UtilException: Preferred replica election data is empty
info kafka_manager[29511]:     at kafka.manager.utils.package$.checkCondition(package.scala:52) ~[kafka-manager.kafka-manager-1.3.3.23-sans-externalized.jar:na]
info kafka_manager[29511]:     at kafka.manager.utils.zero81.PreferredReplicaLeaderElectionCommand$.writePreferredReplicaElectionData(PreferredReplicaLeaderElectionCommand.scala:52) ~[kafka-manager.kafka-manager-1.3.3.23-sans-externalized.jar:na]
info kafka_manager[29511]:     at kafka.manager.actor.cluster.KafkaCommandActor$$anonfun$processCommandRequest$7$$anonfun$apply$12$$anonfun$apply$6.apply$mcV$sp(KafkaCommandActor.scala:123) ~[kafka-manager.kafka-manager-1.3.3.23-sans-externalized.jar:na]
info kafka_manager[29511]:     at kafka.manager.actor.cluster.KafkaCommandActor$$anonfun$processCommandRequest$7$$anonfun$apply$12$$anonfun$apply$6.apply(KafkaCommandActor.scala:123) ~[kafka-manager.kafka-manager-1.3.3.23-sans-externalized.jar:na]
info kafka_manager[29511]:     at kafka.manager.actor.cluster.KafkaCommandActor$$anonfun$processCommandRequest$7$$anonfun$apply$12$$anonfun$apply$6.apply(KafkaCommandActor.scala:123) ~[kafka-manager.kafka-manager-1.3.3.23-sans-externalized.jar:na]
info kafka_manager[29511]:     at scala.util.Try$.apply(Try.scala:192) ~[org.scala-lang.scala-library-2.11.12.jar:na]
info kafka_manager[29511]:     at kafka.manager.actor.cluster.KafkaCommandActor$$anonfun$processCommandRequest$7$$anonfun$apply$12.apply(KafkaCommandActor.scala:122) ~[kafka-manager.kafka-manager-1.3.3.23-sans-externalized.jar:na]
info kafka_manager[29511]:     at kafka.manager.actor.cluster.KafkaCommandActor$$anonfun$processCommandRequest$7$$anonfun$apply$12.apply(KafkaCommandActor.scala:121) ~[kafka-manager.kafka-manager-1.3.3.23-sans-externalized.jar:na]
info kafka_manager[29511]:     at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) ~[org.scala-lang.scala-library-2.11.12.jar:na]
info kafka_manager[29511]:     at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) ~[org.scala-lang.scala-library-2.11.12.jar:na]

I am also seeing JMX errors in the logs:

info kafka_manager[29403]: [error] k.m.j.KafkaJMX$ - Failed to connect to service:jmx:rmi:///jndi/rmi://pp-li-kafk00-0002.node.staging-euw1.consul:7199/jmxrmi
info kafka_manager[29403]: java.rmi.ConnectException: Connection refused to host: pp-li-kafk00-0002.node.staging-euw1.consul; nested exception is:
info kafka_manager[29403]:     java.net.ConnectException: Connection timed out (Connection timed out)
info kafka_manager[29403]:     at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) ~[na:1.8.0_212]
info kafka_manager[29403]:     at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[na:1.8.0_212]
info kafka_manager[29403]:     at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[na:1.8.0_212]
info kafka_manager[29403]:     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:129) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(RemoteObjectInvocationHandler.java:227) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.rmi.server.RemoteObjectInvocationHandler.invoke(RemoteObjectInvocationHandler.java:179) ~[na:1.8.0_212]
info kafka_manager[29403]:     at com.sun.proxy.$Proxy5.newClient(Unknown Source) ~[na:na]
info kafka_manager[29403]:     at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2430) ~[na:1.8.0_212]
info kafka_manager[29403]:     at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:308) ~[na:1.8.0_212]
info kafka_manager[29403]:     at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270) ~[na:1.8.0_212]
info kafka_manager[29403]: Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
info kafka_manager[29403]:     at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.Socket.connect(Socket.java:538) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.Socket.<init>(Socket.java:434) ~[na:1.8.0_212]
info kafka_manager[29403]:     at java.net.Socket.<init>(Socket.java:211) ~[na:1.8.0_212]
info kafka_manager[29403]:     at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) ~[na:1.8.0_212]

This is the Kafka systemd unit file:

# /etc/systemd/system/kafka.service

[Unit]
Description=Kafka Daemon

[Service]
Type=simple
User=kafka
Group=kafka
LimitNOFILE=32768
Restart=on-failure
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
Environment="JMX_PORT=7199"
Environment="LOG_DIR=/var/log/kafka"
Environment="KAFKA_JMX_OPTS=-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=pp-li-kafk00-0002.node.staging-euw1.consul -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.

ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties

[Install]
WantedBy=multi-user.target
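
One thing worth checking in KAFKA_JMX_OPTS: JMX over RMI uses two ports. JMX_PORT (7199 here) is the RMI registry port, while the RMI server port that carries the actual connection is picked at random by the JVM unless com.sun.management.jmxremote.rmi.port is set. The options line above is cut off, so it is not clear whether that property is present. Below is a minimal Python sketch that parses a KAFKA_JMX_OPTS string and reports whether the RMI port is pinned. The helper names are hypothetical, and EXAMPLE_OPTS contains only the flags that are fully visible above.

```python
import shlex
from typing import Dict

# Only the flags that are fully visible in the unit file; the truncated
# trailing flag is deliberately left out rather than guessed.
EXAMPLE_OPTS = (
    "-Dcom.sun.management.jmxremote "
    "-Dcom.sun.management.jmxremote.authenticate=false "
    "-Dcom.sun.management.jmxremote.ssl=false "
    "-Djava.rmi.server.hostname=pp-li-kafk00-0002.node.staging-euw1.consul "
    "-Djava.net.preferIPv4Stack=true"
)

RMI_PORT_PROPERTY = "com.sun.management.jmxremote.rmi.port"


def parse_jvm_props(opts: str) -> Dict[str, str]:
    """Parse -Dkey=value JVM flags into a dict; a bare -Dkey maps to ''."""
    props: Dict[str, str] = {}
    for token in shlex.split(opts):
        if not token.startswith("-D"):
            continue  # ignore non-property flags such as -Xmx1G
        key, _, value = token[2:].partition("=")
        props[key] = value
    return props


def rmi_port_pinned(opts: str) -> bool:
    """Return True if the RMI server port is fixed by a system property."""
    return RMI_PORT_PROPERTY in parse_jvm_props(opts)
```

With EXAMPLE_OPTS, rmi_port_pinned returns False. If the full line in the real unit file also lacks the property, the broker will listen on a random second port that a firewall may block even though 7199 is open.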

FYI, this is in server.properties:

advertised.host.name=pp-li-kafk00-0002.node.staging-euw1.consul

I tried to follow the instructions I found here: https://github.com/yahoo/kafka-manager/issues/187#issuecomment-261467734

I have checked and double-checked: the JMX ports are open between the kafka_manager server and the Kafka broker servers:

# nc -vz 10.32.2.131 7199
pp-li-kafk00-0002.c.mailjet-staging.internal [10.32.2.131] 7199 (?) open
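
A note on this check: it only shows that port 7199 is reachable at 10.32.2.131. The stack trace above fails in newClient, which runs after the registry lookup on 7199, so the timeout may come from the second connection, to the RMI server port, made against the host name set in java.rmi.server.hostname. Below is a small Python sketch for probing several ports by host name from the kafka_manager server. The function names are hypothetical, and the RMI server port has to be looked up on the broker (or pinned in the JVM options), since it is not shown in this issue.

```python
import socket
from typing import Dict, Iterable


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Probe each port on host and map it to its reachability."""
    return {port: port_open(host, port, timeout) for port in ports}
```

For example, check_ports("pp-li-kafk00-0002.node.staging-euw1.consul", [7199, rmi_port]) would test both the registry port and the RMI server port, where rmi_port is whatever second port the broker JVM is actually listening on. Using the host name rather than the IP also verifies that the name resolves to a reachable address from the kafka_manager side.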

Any idea what we might be doing wrong?

Regards,

Leo

linehrr commented 5 years ago

I think you have to create a preferred assignment first by going into the desired topic and clicking on Generate partition assignment.

Ahuri3 commented 5 years ago

I had to enable KMTopicManagerFeature to be able to assign partitions:

application.features=["KMClusterManagerFeature","KMTopicManagerFeature","KMPreferredReplicaElectionFeature","KMReassignPartitionsFeature"]

I thought that

application.features=["KMClusterManagerFeature","KMPreferredReplicaElectionFeature","KMReassignPartitionsFeature"]

would have been enough.

Once the partitions have been assigned, I can run the Preferred Replica Election. Thank you!

I am still having problems with JMX for the moment.