pinterest / DoctorK

DoctorK is a service for Kafka cluster auto healing and workload balancing
Apache License 2.0

Errors in starting KafkaStatsMain 0.2.4.3 #90

Closed: venkatwilliams closed this issue 5 years ago

venkatwilliams commented 5 years ago

I am getting the following errors while running KafkaStats and would appreciate someone's guidance in fixing them.

Errors:

1. ERROR com.pinterest.doctorkafka.stats.BrokerStatsRetriever - Failed to get ec2 metadata
2. ERROR com.pinterest.doctorkafka.util.OperatorUtil - Failed to connect to MBeanServer bd-venkat:9999
3. WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
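Errors 2 and 3 are both plain TCP connection refusals, so a first step is to confirm which of the ports from the startup command actually have a listener. This is a small standalone sketch (not part of DoctorK) that probes the JMX, ZooKeeper, and OpenTSDB ports used in the command below:

```python
import socket

def check_port(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Ports taken from the KafkaStatsMain command line: broker JMX,
    # ZooKeeper, and the OpenTSDB endpoint.
    for host, port in [("localhost", 9999),    # -jmxport
                       ("localhost", 2181),    # -zookeeper
                       ("localhost", 18126)]:  # -tsdhostport
        state = "open" if check_port(host, port) else "refused/unreachable"
        print(f"{host}:{port} {state}")
```

Given the log, ZooKeeper should report open while 9999 and 18126 are refused, which would narrow the problem to the broker's JMX agent and the OpenTSDB daemon rather than DoctorK itself.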

Starting command:

```shell
java -server \
  -Dlog4j.configurationFile=file:./log4j2.xml \
  -cp libs/*:kafkastats-0.2.4.3-jar-with-dependencies.jar \
  com.pinterest.doctorkafka.stats.KafkaStatsMain \
  -broker localhost \
  -jmxport 9999 \
  -topic brokerstats \
  -zookeeper localhost:2181 \
  -uptimeinseconds 3600 \
  -pollingintervalinseconds 60 \
  -ostrichport 2051 \
  -tsdhostport localhost:18126 \
  -kafka_config /Users/venkat/kafka_2.11-1.1.0/config/server1.properties \
  -producer_config /Users/venkat/kafka_2.11-1.1.0/config/producer.properties \
  -primary_network_ifacename eth0
```
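The MBeanServer error suggests no JMX agent is listening on port 9999 on the broker; the broker registration read back from ZooKeeper in the log also shows "jmx_port":-1, which is consistent with JMX being disabled. Kafka's startup scripts only enable remote JMX when the JMX_PORT environment variable is set, so restarting the broker like this may resolve error 2 (a hedged suggestion; the config path is taken from the command above):

```shell
# Enable remote JMX before starting the broker: kafka-server-start.sh
# (via kafka-run-class.sh) translates JMX_PORT into the
# com.sun.management.jmxremote* system properties.
export JMX_PORT=9999
bin/kafka-server-start.sh /Users/venkat/kafka_2.11-1.1.0/config/server1.properties
```

Similarly, error 3 just means nothing is accepting connections on localhost:18126; either start an OpenTSDB instance there or point -tsdhostport at a running one.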

Detailed Log: 20:13:17.386 [ZkClient-EventThread-12-localhost:2181] INFO org.I0Itec.zkclient.ZkEventThread - Starting ZkClient event thread. 20:13:17.473 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT 20:13:17.473 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:host.name=192.168.0.103 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.version=1.8.0_191 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.vendor=Oracle Corporation 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_191.jdk/Contents/Home/jre 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.class.path=libs/kafkastats-0.2.4.3-jar-with-dependencies.jar:libs/doctorkafka-0.2.4.3-jar-with-dependencies.jar:libs/doctorkafka-0.2.4.3.jar:libs/kafkastats-0.2.4.3.jar:kafkastats-0.2.4.3-jar-with-dependencies.jar 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.library.path=/Users/venkat/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:. 
20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.io.tmpdir=/var/folders/28/qrgydfnx1jv4cl703nzp3lhc92z1zs/T/ 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:java.compiler= 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.name=Mac OS X 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.arch=x86_64 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:os.version=10.13.6 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.name=venkat 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.home=/Users/venkat 20:13:17.474 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment:user.dir=/Users/venkat/softwares/doctorkafka 20:13:17.475 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@4bdeaabb 20:13:17.483 [main] DEBUG org.apache.zookeeper.ClientCnxn - zookeeper.disableAutoWatchReset is false 20:13:17.498 [main] DEBUG org.I0Itec.zkclient.ZkClient - Awaiting connection to Zookeeper server 20:13:17.498 [main] INFO org.I0Itec.zkclient.ZkClient - Waiting for keeper state SyncConnected 20:13:17.500 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. 
Will not attempt to authenticate using SASL (unknown error) 20:13:17.522 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to localhost/127.0.0.1:2181, initiating session 20:13:17.524 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Session establishment request sent on localhost/127.0.0.1:2181 20:13:17.532 [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1689a86df6e0073, negotiated timeout = 30000 20:13:17.534 [main-EventThread] DEBUG org.I0Itec.zkclient.ZkClient - Received event: WatchedEvent state:SyncConnected type:None path:null 20:13:17.535 [main-EventThread] INFO org.I0Itec.zkclient.ZkClient - zookeeper state changed (SyncConnected) 20:13:17.535 [main-EventThread] DEBUG org.I0Itec.zkclient.ZkClient - Leaving process event 20:13:17.535 [main] DEBUG org.I0Itec.zkclient.ZkClient - State is SyncConnected 20:13:17.548 [main] INFO kafka.utils.Log4jControllerRegistration$ - Registered kafka:type=kafka.Log4jController MBean 20:13:17.612 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 1,8 replyHeader:: 1,1251,0 request:: '/brokers/ids,F response:: v{'0,'1,'2} 20:13:17.625 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 2,4 replyHeader:: 2,1251,0 request:: '/brokers/ids/0,F response:: 
#7b226c697374656e65725f73656375726974795f70726f746f636f6c5f6d6170223a7b22504c41494e54455854223a22504c41494e54455854227d2c22656e64706f696e7473223a5b22504c41494e544558543a2f2f6c6f63616c686f73743a39303932225d2c226a6d785f706f7274223a2d312c22686f7374223a226c6f63616c686f7374222c2274696d657374616d70223a2231353438383535383736373337222c22706f7274223a393039322c2276657273696f6e223a347d,s{1209,1209,1548855876737,1548855876737,0,0,0,101500895680659568,188,0,1209} 20:13:17.873 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 3,4 replyHeader:: 3,1251,0 request:: '/brokers/ids/1,F response:: #7b226c697374656e65725f73656375726974795f70726f746f636f6c5f6d6170223a7b22504c41494e54455854223a22504c41494e54455854227d2c22656e64706f696e7473223a5b22504c41494e544558543a2f2f6c6f63616c686f73743a39303933225d2c226a6d785f706f7274223a2d312c22686f7374223a226c6f63616c686f7374222c2274696d657374616d70223a2231353438383535383738313735222c22706f7274223a393039332c2276657273696f6e223a347d,s{1214,1214,1548855878178,1548855878178,0,0,0,101500895680659569,188,0,1214} 20:13:17.874 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1689a86df6e0073, packet:: clientPath:null serverPath:null finished:false header:: 4,4 replyHeader:: 4,1251,0 request:: '/brokers/ids/2,F response:: #7b226c697374656e65725f73656375726974795f70726f746f636f6c5f6d6170223a7b22504c41494e54455854223a22504c41494e54455854227d2c22656e64706f696e7473223a5b22504c41494e544558543a2f2f6c6f63616c686f73743a39303934225d2c226a6d785f706f7274223a2d312c22686f7374223a226c6f63616c686f7374222c2274696d657374616d70223a2231353438383535383738323132222c22706f7274223a393039342c2276657273696f6e223a347d,s{1233,1233,1548855878214,1548855878214,0,0,0,101500895680659570,188,0,1233} 20:13:17.906 [main] INFO org.apache.kafka.clients.producer.ProducerConfig - ProducerConfig values: acks = 1 
batch.size = 1638400 bootstrap.servers = [localhost:9092] buffer.memory = 3554432 client.id = compression.type = none connections.max.idle.ms = 540000 enable.idempotence = false interceptor.classes = [] key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer linger.ms = 0 max.block.ms = 60000 max.in.flight.requests.per.connection = 5 max.request.size = 1048576 metadata.max.age.ms = 300000 metric.reporters = [] metrics.num.samples = 2 metrics.recording.level = INFO metrics.sample.window.ms = 30000 partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner receive.buffer.bytes = 32768 reconnect.backoff.max.ms = 1000 reconnect.backoff.ms = 50 request.timeout.ms = 30000 retries = 0 retry.backoff.ms = 100 sasl.jaas.config = null sasl.kerberos.kinit.cmd = /usr/bin/kinit sasl.kerberos.min.time.before.relogin = 60000 sasl.kerberos.service.name = null sasl.kerberos.ticket.renew.jitter = 0.05 sasl.kerberos.ticket.renew.window.factor = 0.8 sasl.mechanism = GSSAPI security.protocol = PLAINTEXT send.buffer.bytes = 131072 ssl.cipher.suites = null ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1] ssl.endpoint.identification.algorithm = null ssl.key.password = null ssl.keymanager.algorithm = SunX509 ssl.keystore.location = null ssl.keystore.password = null ssl.keystore.type = JKS ssl.protocol = TLS ssl.provider = null ssl.secure.random.implementation = null ssl.trustmanager.algorithm = PKIX ssl.truststore.location = null ssl.truststore.password = null ssl.truststore.type = JKS transaction.timeout.ms = 60000 transactional.id = null value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer

20:13:17.927 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bufferpool-wait-time 20:13:17.932 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name buffer-exhausted-records 20:13:17.936 [main] DEBUG org.apache.kafka.clients.Metadata - Updated cluster metadata version 1 to Cluster(id = null, nodes = [localhost:9092 (id: -1 rack: null)], partitions = []) 20:13:17.942 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name produce-throttle-time 20:13:17.952 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name connections-closed: 20:13:17.952 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name connections-created: 20:13:17.953 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name successful-authentication: 20:13:17.953 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name failed-authentication: 20:13:17.954 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-sent-received: 20:13:17.954 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-sent: 20:13:17.955 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name bytes-received: 20:13:17.955 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name select-time: 20:13:17.956 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name io-time: 20:13:17.959 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name batch-size 20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name compression-rate 20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name queue-time 20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name request-time 20:13:17.960 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added 
sensor with name records-per-request 20:13:17.961 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name record-retries 20:13:17.961 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name errors 20:13:17.961 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name record-size 20:13:17.962 [main] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name batch-split-rate 20:13:17.963 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Starting Kafka producer I/O thread. 20:13:17.964 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 1.1.0 20:13:17.964 [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : fdcf75ea326b8e07 20:13:17.965 [main] DEBUG org.apache.kafka.clients.producer.KafkaProducer - [Producer clientId=producer-1] Kafka producer started 20:13:17.970 [main] INFO com.pinterest.doctorkafka.stats.BrokerStatsReporter - Starting broker stats reporter..... 20:13:17.988 [StatsReporter] ERROR com.pinterest.doctorkafka.stats.BrokerStatsRetriever - Failed to get ec2 metadata java.io.IOException: Cannot run program "ec2metadata": error=2, No such file or directory at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048) ~[?:1.8.0_191] at java.lang.Runtime.exec(Runtime.java:620) ~[?:1.8.0_191] at java.lang.Runtime.exec(Runtime.java:450) ~[?:1.8.0_191] at java.lang.Runtime.exec(Runtime.java:347) ~[?:1.8.0_191] at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.setBrokerInstanceInfo(BrokerStatsRetriever.java:409) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?] at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.retrieveBrokerStats(BrokerStatsRetriever.java:579) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?] at com.pinterest.doctorkafka.stats.BrokerStatsReporter.run(BrokerStatsReporter.java:57) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?] 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_191] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_191] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191] Caused by: java.io.IOException: error=2, No such file or directory at java.lang.UNIXProcess.forkAndExec(Native Method) ~[?:1.8.0_191] at java.lang.UNIXProcess.(UNIXProcess.java:247) ~[?:1.8.0_191] at java.lang.ProcessImpl.start(ProcessImpl.java:134) ~[?:1.8.0_191] at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ~[?:1.8.0_191] ... 13 more 20:13:17.997 [StatsReporter] INFO com.pinterest.doctorkafka.stats.BrokerStatsRetriever - set hostname to bd-venkat Jan 30, 2019 8:13:18 PM com.twitter.ostrich.admin.BackgroundProcess start INFO: Starting LatchedStatsListener 20:13:18.140 [StatsReporter] ERROR com.pinterest.doctorkafka.util.OperatorUtil - Failed to connect to MBeanServer bd-venkat:9999 java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: java.net.ConnectException: Connection refused (Connection refused)] at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369) ~[?:1.8.0_191] at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270) ~[?:1.8.0_191] at com.pinterest.doctorkafka.util.OperatorUtil.getMBeanServerConnection(OperatorUtil.java:90) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?] 
at com.pinterest.doctorkafka.stats.BrokerStatsRetriever.retrieveBrokerStats(BrokerStatsRetriever.java:588) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?] at com.pinterest.doctorkafka.stats.BrokerStatsReporter.run(BrokerStatsReporter.java:57) [kafkastats-0.2.4.3-jar-with-dependencies.jar:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_191] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_191] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_191] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_191] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191] Caused by: javax.naming.ServiceUnavailableException at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:136) ~[?:1.8.0_191] at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[?:1.8.0_191] at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287) ~[?:1.8.0_191] ... 
11 more Caused by: java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: java.net.ConnectException: Connection refused (Connection refused) at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619) ~[?:1.8.0_191] at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[?:1.8.0_191] at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[?:1.8.0_191] at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338) ~[?:1.8.0_191] at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112) ~[?:1.8.0_191] at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132) ~[?:1.8.0_191] at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[?:1.8.0_191] at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287) ~[?:1.8.0_191] ... 
11 more Caused by: java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_191] at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_191] at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_191] at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_191] at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_191] at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_191] at java.net.Socket.connect(Socket.java:538) ~[?:1.8.0_191] at java.net.Socket.(Socket.java:434) ~[?:1.8.0_191] at java.net.Socket.(Socket.java:211) ~[?:1.8.0_191] at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40) ~[?:1.8.0_191] at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148) ~[?:1.8.0_191] at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613) ~[?:1.8.0_191] at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216) ~[?:1.8.0_191] at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202) ~[?:1.8.0_191] at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338) ~[?:1.8.0_191] at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112) ~[?:1.8.0_191] at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132) ~[?:1.8.0_191] at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205) ~[?:1.8.0_191] at javax.naming.InitialContext.lookup(InitialContext.java:417) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922) ~[?:1.8.0_191] at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287) ~[?:1.8.0_191] ... 
11 more 20:13:18.159 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initialize connection to node localhost:9092 (id: -1 rack: null) for sending metadata request 20:13:18.159 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating connection to node localhost:9092 (id: -1 rack: null) 20:13:18.162 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.bytes-sent 20:13:18.163 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.bytes-received 20:13:18.164 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node--1.latency 20:13:18.166 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.network.Selector - [Producer clientId=producer-1] Created socket with SO_RCVBUF = 326640, SO_SNDBUF = 146988, SO_TIMEOUT = 0 to node -1 20:13:18.254 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Completed connection to node -1. Fetching API versions. 20:13:18.254 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating API versions fetch from node -1. 
20:13:18.262 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Recorded API versions for node -1: (Produce(0): 0 to 5 [usable: 5], Fetch(1): 0 to 7 [usable: 7], ListOffsets(2): 0 to 2 [usable: 2], Metadata(3): 0 to 5 [usable: 5], LeaderAndIsr(4): 0 to 1 [usable: 1], StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 4 [usable: 4], ControlledShutdown(7): 0 to 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable: 3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1 [usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1 [usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1 [usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1 [usable: 1], SaslHandshake(17): 0 to 1 [usable: 1], ApiVersions(18): 0 to 1 [usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1 [usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0 [usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0], AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0], EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0], TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0], CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0], DescribeConfigs(32): 0 to 1 [usable: 1], AlterConfigs(33): 0 [usable: 0], AlterReplicaLogDirs(34): 0 [usable: 0], DescribeLogDirs(35): 0 [usable: 0], SaslAuthenticate(36): 0 [usable: 0], CreatePartitions(37): 0 [usable: 0], CreateDelegationToken(38): 0 [usable: 0], RenewDelegationToken(39): 0 [usable: 0], ExpireDelegationToken(40): 0 [usable: 0], DescribeDelegationToken(41): 0 [usable: 0], DeleteGroups(42): 0 [usable: 0]) 20:13:18.263 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Sending metadata request (type=MetadataRequest, topics=brokerstats) to node localhost:9092 (id: -1 rack: null) 
20:13:18.265 [kafka-producer-network-thread | producer-1] INFO org.apache.kafka.clients.Metadata - Cluster ID: 6xm2CYjVSye8SZnX4_By5Q 20:13:18.266 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.Metadata - Updated cluster metadata version 2 to Cluster(id = 6xm2CYjVSye8SZnX4_By5Q, nodes = [localhost:9092 (id: 0 rack: null), localhost:9093 (id: 1 rack: null), localhost:9094 (id: 2 rack: null)], partitions = [Partition(topic = brokerstats, partition = 0, leader = 1, replicas = [1], isr = [1], offlineReplicas = [])]) 20:13:18.280 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating connection to node localhost:9093 (id: 1 rack: null) 20:13:18.280 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-1.bytes-sent 20:13:18.281 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-1.bytes-received 20:13:18.281 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name node-1.latency 20:13:18.282 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.network.Selector - [Producer clientId=producer-1] Created socket with SO_RCVBUF = 326640, SO_SNDBUF = 146988, SO_TIMEOUT = 0 to node 1 20:13:18.282 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Completed connection to node 1. Fetching API versions. 20:13:18.282 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Initiating API versions fetch from node 1. 
20:13:18.283 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Recorded API versions for node 1: (Produce(0): 0 to 5 [usable: 5], Fetch(1): 0 to 7 [usable: 7], ListOffsets(2): 0 to 2 [usable: 2], Metadata(3): 0 to 5 [usable: 5], LeaderAndIsr(4): 0 to 1 [usable: 1], StopReplica(5): 0 [usable: 0], UpdateMetadata(6): 0 to 4 [usable: 4], ControlledShutdown(7): 0 to 1 [usable: 1], OffsetCommit(8): 0 to 3 [usable: 3], OffsetFetch(9): 0 to 3 [usable: 3], FindCoordinator(10): 0 to 1 [usable: 1], JoinGroup(11): 0 to 2 [usable: 2], Heartbeat(12): 0 to 1 [usable: 1], LeaveGroup(13): 0 to 1 [usable: 1], SyncGroup(14): 0 to 1 [usable: 1], DescribeGroups(15): 0 to 1 [usable: 1], ListGroups(16): 0 to 1 [usable: 1], SaslHandshake(17): 0 to 1 [usable: 1], ApiVersions(18): 0 to 1 [usable: 1], CreateTopics(19): 0 to 2 [usable: 2], DeleteTopics(20): 0 to 1 [usable: 1], DeleteRecords(21): 0 [usable: 0], InitProducerId(22): 0 [usable: 0], OffsetForLeaderEpoch(23): 0 [usable: 0], AddPartitionsToTxn(24): 0 [usable: 0], AddOffsetsToTxn(25): 0 [usable: 0], EndTxn(26): 0 [usable: 0], WriteTxnMarkers(27): 0 [usable: 0], TxnOffsetCommit(28): 0 [usable: 0], DescribeAcls(29): 0 [usable: 0], CreateAcls(30): 0 [usable: 0], DeleteAcls(31): 0 [usable: 0], DescribeConfigs(32): 0 to 1 [usable: 1], AlterConfigs(33): 0 [usable: 0], AlterReplicaLogDirs(34): 0 [usable: 0], DescribeLogDirs(35): 0 [usable: 0], SaslAuthenticate(36): 0 [usable: 0], CreatePartitions(37): 0 [usable: 0], CreateDelegationToken(38): 0 [usable: 0], RenewDelegationToken(39): 0 [usable: 0], ExpireDelegationToken(40): 0 [usable: 0], DescribeDelegationToken(41): 0 [usable: 0], DeleteGroups(42): 0 [usable: 0]) 20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.records-per-batch 20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG 
org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.bytes 20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.compression-rate 20:13:18.287 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.record-retries 20:13:18.288 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Added sensor with name topic.brokerstats.record-errors 20:13:18.301 [StatsReporter] INFO com.pinterest.doctorkafka.stats.BrokerStatsReporter - published to kafka : {"timestamp": 1548859397978, "id": 1, "name": "bd-venkat", "zkUrl": "localhost:2181", "kafkaVersion": "1.1.0", "statsVersion": "0.1.15", "hasFailure": true, "failureReason": "JmxConnectionFailure", "availabilityZone": null, "instanceType": null, "amiId": null, "rackId": null, "logFilesPath": "/tmp/kafka-logs1", "cpuUsage": 0.0, "freeDiskSpaceInBytes": 196621987840, "totalDiskSpaceInBytes": 250790436864, "leadersBytesIn1MinRate": 0, "leadersBytesOut1MinRate": 0, "leadersBytesIn5MinRate": 0, "leadersBytesOut5MinRate": 0, "leadersBytesIn15MinRate": 0, "leadersBytesOut15MinRate": 0, "sysBytesIn1MinRate": 0, "sysBytesOut1MinRate": 0, "numReplicas": 0, "numLeaders": 0, "topicsBytesIn1MinRate": null, "topicsBytesOut1MinRate": null, "topicsBytesIn5MinRate": null, "topicsBytesOut5MinRate": null, "topicsBytesIn15MinRate": null, "topicsBytesOut15MinRate": null, "leaderReplicaStats": null, "leaderReplicas": null, "followerReplicas": null, "inReassignmentReplicas": null} Jan 30, 2019 8:13:18 PM com.twitter.ostrich.admin.AdminHttpService start INFO: Admin HTTP interface started on port 2051. 20:13:18.345 [main] INFO com.pinterest.doctorkafka.util.OperatorUtil - Starting the OpenTsdb metrics pusher 20:13:18.352 [main] INFO com.pinterest.doctorkafka.util.OperatorUtil - OpenTsdb metrics pusher started! 
20:13:27.875 [main-SendThread(localhost:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 0x1689a86df6e0073 after 0ms
20:13:28.374 [Thread-5] DEBUG com.pinterest.doctorkafka.util.MetricsPusher - Ostrich Metrics 1548859408:
put KafkaOperator.jvm_gc_PS_MarkSweep_msec 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_PS_Scavenge_cycles 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_cycles 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_msec 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_PS_Scavenge_msec 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_gc_PS_MarkSweep_cycles 1548859408 0.0 host=bd-venkat
put KafkaOperator.kafka.stats.collector.success 1548859408 0.0 host=bd-venkat host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Survivor_Space_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_thread_daemon_count 1548859408 10.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_used 1548859408 9.5334352E7 host=bd-venkat
put KafkaOperator.jvm_buffer_direct_used 1548859408 826.0 host=bd-venkat
put KafkaOperator.jvm_thread_peak_count 1548859408 15.0 host=bd-venkat
put KafkaOperator.jvm_fd_limit 1548859408 10240.0 host=bd-venkat
put KafkaOperator.jvm_fd_count 1548859408 36.0 host=bd-venkat
put KafkaOperator.jvm_buffer_mapped_count 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Old_Gen_used 1548859408 1.2706672E7 host=bd-venkat
put KafkaOperator.jvm_current_mem_Metaspace_max 1548859408 -1.0 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Old_Gen_used 1548859408 1.2706672E7 host=bd-venkat
put KafkaOperator.jvm_num_cpus 1548859408 8.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Code_Cache_used 1548859408 5230912.0 host=bd-venkat
put KafkaOperator.jvm_start_time 1548859408 1.54885934E12 host=bd-venkat
put KafkaOperator.jvm_nonheap_used 1548859408 4.1529464E7 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Eden_Space_max 1548859408 1.40928614E9 host=bd-venkat
put KafkaOperator.jvm_buffer_direct_count 1548859408 4.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Compressed_Class_Space_max 1548859408 1.07374182E9 host=bd-venkat
put KafkaOperator.jvm_heap_max 1548859408 3.81786522E9 host=bd-venkat
put KafkaOperator.jvm_compilation_time_msec 1548859408 1793.0 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Survivor_Space_max 1548859408 1.1010048E7 host=bd-venkat
put KafkaOperator.jvm_buffer_mapped_max 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_buffer_mapped_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_nonheap_max 1548859408 -1.0 host=bd-venkat
put KafkaOperator.jvm_nonheap_committed 1548859408 4.2631168E7 host=bd-venkat
put KafkaOperator.jvm_buffer_direct_max 1548859408 826.0 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Eden_Space_max 1548859408 1.40928614E9 host=bd-venkat
put KafkaOperator.jvm_post_gc_used 1548859408 1.2706672E7 host=bd-venkat
put KafkaOperator.jvm_classes_total_loaded 1548859408 5137.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Metaspace_used 1548859408 3.1875904E7 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Eden_Space_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Old_Gen_max 1548859408 2.86366106E9 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Eden_Space_used 1548859408 4.1095296E7 host=bd-venkat
put KafkaOperator.jvm_classes_current_loaded 1548859408 5137.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Compressed_Class_Space_used 1548859408 4425568.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Survivor_Space_used 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_uptime 1548859408 12302.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_Code_Cache_max 1548859408 2.5165824E8 host=bd-venkat
put KafkaOperator.jvm_classes_total_unloaded 1548859408 0.0 host=bd-venkat
put KafkaOperator.jvm_current_mem_PS_Survivor_Space_max 1548859408 1.1010048E7 host=bd-venkat
put KafkaOperator.jvm_heap_used 1548859408 5.3801968E7 host=bd-venkat
put KafkaOperator.jvm_heap_committed 1548859408 2.00802304E8 host=bd-venkat
put KafkaOperator.jvm_post_gc_PS_Old_Gen_max 1548859408 2.86366106E9 host=bd-venkat
put KafkaOperator.jvm_thread_count 1548859408 15.0 host=bd-venkat

20:13:28.377 [Thread-5] WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval
com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
	at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:109)
	at com.pinterest.doctorkafka.util.MetricsPusher.sendMetrics(MetricsPusher.java:101)
	at com.pinterest.doctorkafka.util.MetricsPusher.run(MetricsPusher.java:129)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:107)
	... 2 common frames omitted
20:13:28.483 [Thread-5] WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval
com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
	at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:109)
	at com.pinterest.doctorkafka.util.MetricsPusher.sendMetrics(MetricsPusher.java:101)
	at com.pinterest.doctorkafka.util.MetricsPusher.run(MetricsPusher.java:129)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:107)
	... 2 common frames omitted
^C20:13:30.857 [Thread-5] WARN com.pinterest.doctorkafka.util.MetricsPusher - Failed to send stats to OpenTSDB, will retry up to next interval
com.pinterest.doctorkafka.util.OpenTsdbClient$ConnectionFailedException: java.net.ConnectException: Connection refused (Connection refused)
	at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:109)
	at com.pinterest.doctorkafka.util.MetricsPusher.sendMetrics(MetricsPusher.java:101)
	at com.pinterest.doctorkafka.util.MetricsPusher.run(MetricsPusher.java:129)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at com.pinterest.doctorkafka.util.OpenTsdbClient.sendMetrics(OpenTsdbClient.java:107)
	... 2 common frames omitted
20:13:30.859 [Thread-1] INFO org.apache.kafka.clients.producer.KafkaProducer - [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
20:13:30.859 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Beginning shutdown of Kafka producer I/O thread, sending remaining records.
20:13:30.863 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name connections-closed:
20:13:30.863 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name connections-created:
20:13:30.863 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name successful-authentication:
20:13:30.864 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name failed-authentication:
20:13:30.864 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-sent-received:
20:13:30.864 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-sent:
20:13:30.865 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name bytes-received:
20:13:30.865 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name select-time:
20:13:30.865 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name io-time:
20:13:30.866 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.bytes-sent
20:13:30.866 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.bytes-received
20:13:30.866 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node--1.latency
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-1.bytes-sent
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-1.bytes-received
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.common.metrics.Metrics - Removed sensor with name node-1.latency
20:13:30.867 [kafka-producer-network-thread | producer-1] DEBUG org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Shutdown of Kafka producer I/O thread has completed.
20:13:30.868 [Thread-1] DEBUG org.apache.kafka.clients.producer.KafkaProducer - [Producer clientId=producer-1] Kafka producer has been closed
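Note on the repeated `Connection refused` warnings above: they mean nothing is accepting TCP connections on the OpenTSDB address passed via `-tsdhostport` (`localhost:18126` in the command shown), and the `Failed to connect to MBeanServer bd-venkat:9999` error is the same symptom on the broker's JMX port (the broker must be started with `JMX_PORT=9999` exported for that listener to exist). A minimal sketch for checking the three ports from the command line before starting `KafkaStatsMain` (hypothetical helper name `check_port`; uses bash's `/dev/tcp`, so no `nc` required):

```shell
#!/usr/bin/env bash
# check_port: report whether a TCP port on a host is accepting connections.
# Relies on bash's built-in /dev/tcp pseudo-device (bash-only, not POSIX sh).
check_port() {
  local host=$1 port=$2
  if (echo > "/dev/tcp/${host}/${port}") 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} closed"
  fi
}

# Ports taken from the KafkaStatsMain command line above:
check_port localhost 9999    # -jmxport: broker JMX (needs JMX_PORT=9999 at broker start)
check_port localhost 2181    # -zookeeper
check_port localhost 18126   # -tsdhostport: OpenTSDB telnet-style "put" port
```

Any line reporting `closed` identifies which dependency (JMX listener, ZooKeeper, or OpenTSDB) is missing; if OpenTSDB is intentionally not running, the `MetricsPusher` warnings are expected noise rather than a fatal error.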

yuyang08 commented 5 years ago

@venkatwilliams we have been running doctorkafka 0.2.4.3 without issue. Resolving this issue as non-repro.