linkedin / kafka-monitor

Xinfra Monitor monitors the availability of Kafka clusters by producing synthetic workloads using end-to-end pipelines to obtain derived vital statistics - E2E latency, service produce/consume availability, offsets commit availability & latency, message loss rate and more.
https://engineering.linkedin.com/blog/2016/05/open-sourcing-kafka-monitor
Apache License 2.0

"cluster-topic-manipulation-service" issue #348

Open nickstatka777 opened 3 years ago

nickstatka777 commented 3 years ago

Hello team, I have run into an issue with cluster-topic-manipulation-service: it doesn't work for me at all. When I configure it, it doesn't create the topic for its iterations and doesn't send any metrics to "reporter-service" or "statsd-service". The rest of the configuration works as it should. Could you please check my config file? Maybe I made a mistake:

{
    "single-cluster-monitor": {
        "class.name": "com.linkedin.xinfra.monitor.apps.SingleClusterMonitor",
        "topic": "xinfra-monitor-topic",
        "zookeeper.connect": "localhost:2181",
        "bootstrap.servers": "localhost:9093",
        "request.timeout.ms": 3000,
        "produce.record.delay.ms": 1000,
        "topic-management.topicManagementEnabled": true,
        "topic-management.topicCreationEnabled": true,
        "topic-management.replicationFactor" : 3,
        "topic-management.partitionsToBrokersRatio" : 2.0,
        "topic-management.rebalance.interval.ms" : 600000,
        "topic-management.preferred.leader.election.check.interval.ms" : 300000,
        "topic-management.topicAddPartitionEnabled": "true",
        "topic-management.topicReassignPartitionAndElectLeaderEnabled": true,
        "client.id": "xinfra-monitor-adminclient",
        "security.protocol": "SSL",
        "ssl.key.password": "${PASSWORD}",
        "ssl.keystore.location": "keystore.p12",
        "ssl.keystore.password": "${PASSWORD}",
        "ssl.keystore.type": "PKCS12",
        "ssl.truststore.location": "truststore.p12",
        "ssl.truststore.password": "${PASSWORD}",
        "ssl.truststore.type": "PKCS12",
        "produce.producer.props": {
            "class.name": "com.linkedin.xinfra.monitor.producer.NewProducer",
            "client.id": "xinfra-monitor-producer",
            "security.protocol": "SSL",
            "ssl.keystore.location": "keystore.p12",
            "ssl.keystore.password": "${PASSWORD}",
            "ssl.keystore.type": "PKCS12",
            "ssl.truststore.location": "truststore.p12",
            "ssl.truststore.password": "${PASSWORD}",
            "ssl.truststore.type": "PKCS12"
        },
        "consume.latency.sla.ms": "20000",
        "consume.consumer.props": {
            "group.id": "a-group-id-mb-888",
            "class.name": "com.linkedin.kmf.consumer.NewConsumer",
            "client.id": "xinfra-monitor-consumer",
            "security.protocol": "SSL",
            "ssl.key.password": "${PASSWORD}",
            "ssl.keystore.location": "keystore.p12",
            "ssl.keystore.password": "${PASSWORD}",
            "ssl.keystore.type": "PKCS12",
            "ssl.truststore.location": "truststore.p12",
            "ssl.truststore.password": "${PASSWORD}",
            "ssl.truststore.type": "PKCS12"
        }
    },
    "jolokia-service": {
        "class.name": "com.linkedin.xinfra.monitor.services.JolokiaService"
    },
    "cluster-topic-manipulation-service": {
        "class.name": "com.linkedin.xinfra.monitor.services.ClusterTopicManipulationService",
        "zookeeper.connect": "localhost:2181",
        "bootstrap.servers": "localhost:9093",
        "topic": "xinfra-monitor-manipulation-topic"
    },
    "offset-commit-service": {
        "class.name": "com.linkedin.xinfra.monitor.services.OffsetCommitService",
        "zookeeper.connect": "localhost:2181",
        "bootstrap.servers": "localhost:9093",
        "consumer.props": {
            "group.id": "target-consumer-group"
        }
    },
    "reporter-service": {
        "class.name": "com.linkedin.xinfra.monitor.services.DefaultMetricsReporterService",
        "report.interval.sec": 1,
        "report.metrics.list": [
            ...
        ]
    },
    "statsd-service": {
        "class.name": "com.linkedin.xinfra.monitor.services.StatsdMetricsReporterService",
        "report.statsd.host": "127.0.0.1",
        "report.statsd.port": "9125",
        "report.statsd.prefix": "xinfra-monitor",
        "report.interval.sec": 1,
        "report.metrics.list": [
            ...
        ]
    }
}
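
As a quick sanity check on a config like the one above, a small script can list the top-level entries and every "class.name" value, and flag any class that does not share the package prefix used by the others. This is only a sketch: `EXPECTED_PREFIX`, `find_class_names` and `check_config` are illustrative names, not part of Xinfra Monitor, and the assumption that all classes should share one prefix is mine. In the config as posted, "consume.consumer.props" uses com.linkedin.kmf.consumer.NewConsumer while every other entry uses com.linkedin.xinfra.monitor; whether that is related to the problem is not established here. The "..." placeholders in the metric lists would also have to be filled in before the file parses as JSON.

```python
import json

# Assumption: every class in the config is expected to share this prefix.
EXPECTED_PREFIX = "com.linkedin.xinfra.monitor."


def find_class_names(node, path=""):
    """Yield (path, class_name) for every "class.name" key in nested dicts."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}" if path else key
            if key == "class.name" and isinstance(value, str):
                yield path, value
            else:
                yield from find_class_names(value, child)


def check_config(text):
    """Return the top-level entry names and the class names off-prefix."""
    config = json.loads(text)
    services = list(config)
    suspicious = [
        (path, name)
        for path, name in find_class_names(config)
        if not name.startswith(EXPECTED_PREFIX)
    ]
    return services, suspicious


if __name__ == "__main__":
    # Trimmed-down sample with the same shape as the posted config.
    sample = """
    {
      "single-cluster-monitor": {
        "class.name": "com.linkedin.xinfra.monitor.apps.SingleClusterMonitor",
        "consume.consumer.props": {
          "class.name": "com.linkedin.kmf.consumer.NewConsumer"
        }
      },
      "cluster-topic-manipulation-service": {
        "class.name": "com.linkedin.xinfra.monitor.services.ClusterTopicManipulationService"
      }
    }
    """
    services, suspicious = check_config(sample)
    print("top-level entries:", services)
    print("class names outside the expected prefix:", suspicious)
```

Listing the top-level entries also confirms which blocks are actually siblings of "single-cluster-monitor", which is easy to misread when the indentation does not match the braces.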

In addition, I have a question: is it possible for "cluster-topic-manipulation-service" to work with Kafka brokers only, without ZooKeeper? This is very important for our infrastructure. I also noticed that the service configuration cannot be reloaded without restarting the application. We need this because our PKCS12 keystore has to be reloaded every 24 hours. Could you please add this capability in a future release?
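
Until something like that exists, one possible workaround outside Xinfra Monitor is an external watcher that notices when the keystore file changes and restarts the process. The sketch below only illustrates the detection part: `KeystoreWatcher`, `file_digest`, `run` and the `restart` callback are hypothetical names, and how the monitor is actually restarted (systemd, a container orchestrator, a wrapper script) is left to the caller.

```python
import hashlib
import time
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of the file's current contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


class KeystoreWatcher:
    """Remember the last seen digest of a file and report content changes."""

    def __init__(self, path):
        self.path = Path(path)
        self.last = file_digest(self.path)

    def changed(self):
        """Return True once for each change in the file's contents."""
        current = file_digest(self.path)
        if current != self.last:
            self.last = current
            return True
        return False


def run(watcher, restart, poll_seconds=60.0):
    """Poll forever and call the caller-supplied restart() on every change."""
    while True:
        time.sleep(poll_seconds)
        if watcher.changed():
            restart()
```

Comparing content digests rather than modification times avoids a restart when the file is rewritten with identical contents.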

github-actions[bot] commented 3 years ago

This is your first issue in the repository. Thank you for raising this issue.