linkedin / cruise-control

Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
https://github.com/linkedin/cruise-control/tags
BSD 2-Clause "Simplified" License

Error setting up Cruise-Control-UI for Cruise Control #2182

Closed: Sotatek-GiangPham2 closed this issue 1 month ago

Sotatek-GiangPham2 commented 1 month ago

Hi,

I recently deployed Kafka with Strimzi, using Cruise Control and the Cruise Control UI, following this blog post: https://strimzi.io/blog/2023/01/11/hacking-for-cruise-control-ui/. Unfortunately, my cruise-control pod fails at startup with errors I can't resolve. Can anyone suggest a solution? Thanks.

Best regards! Giang

2024-08-14 11:10:00 WARN  KafkaAdminClient:610 - [AdminClient clientId=adminclient-3] Overriding the default value for default.api.timeout.ms (0) with the explicitly configured request timeout 180000
2024-08-14 11:10:00 WARN  AdminClientConfig:384 - The configuration 'sasl.login.callback.handler.class' was supplied but isn't a known config.
2024-08-14 11:10:00 WARN  AdminClientConfig:384 - The configuration 'sasl.jaas.config' was supplied but isn't a known config.
2024-08-14 11:10:00 WARN  AdminClientConfig:384 - The configuration 'sasl.client.callback.handler.class' was supplied but isn't a known config.
2024-08-14 11:10:00 INFO  AppInfoParser:119 - Kafka version: 3.1.0
2024-08-14 11:10:00 INFO  AppInfoParser:120 - Kafka commitId: 37edeed0777bacb3
2024-08-14 11:10:00 INFO  AppInfoParser:121 - Kafka startTimeMs: 1723633800326
2024-08-14 11:10:00 ERROR KafkaCruiseControlMain:33 - Uncaught exception on thread Thread[main,5,main]
java.lang.NullPointerException: null
        at com.linkedin.kafka.cruisecontrol.config.BrokerCapacityConfigFileResolver.loadCapacities(BrokerCapacityConfigFileResolver.java:298) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.config.BrokerCapacityConfigFileResolver.configure(BrokerCapacityConfigFileResolver.java:161) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.config.KafkaCruiseControlConfigUtils.getConfiguredInstance(KafkaCruiseControlConfigUtils.java:49) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.config.KafkaCruiseControlConfig.getConfiguredInstance(KafkaCruiseControlConfig.java:98) ~[cruise-control-2.5.103.jar:?]
        at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstance(AbstractConfig.java:419) ~[kafka-clients-3.1.0.jar:?]
        at com.linkedin.kafka.cruisecontrol.config.KafkaCruiseControlConfig.getConfiguredInstance(KafkaCruiseControlConfig.java:80) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.<init>(LoadMonitor.java:151) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.monitor.LoadMonitor.<init>(LoadMonitor.java:124) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.KafkaCruiseControl.<init>(KafkaCruiseControl.java:126) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.async.AsyncKafkaCruiseControl.<init>(AsyncKafkaCruiseControl.java:34) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlApp.<init>(KafkaCruiseControlApp.java:50) ~[cruise-control-2.5.103.jar:?]
        at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlMain.main(KafkaCruiseControlMain.java:38) ~[cruise-control-2.5.103.jar:?]

File kafka.yaml

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka-cluster
  namespace: test
spec:
  cruiseControl:
    image: trourest186/cruise-crontrol-ui:v2.0
    config:
      webserver.security.enable: false
  entityOperator:
    topicOperator: {}
    userOperator: {}
  kafka:
    config:
      default.replication.factor: 2
      inter.broker.protocol.version: "3.7"
      min.insync.replicas: 2
      offsets.topic.replication.factor: 3
      transaction.state.log.min.isr: 2
      transaction.state.log.replication.factor: 3
    listeners:
    - name: plain
      port: 9092
      tls: false
      type: internal
    - name: tls
      port: 9093
      tls: true
      type: internal
    - name: external
      port: 9094
      tls: false
      type: nodeport
    replicas: 3
    storage:
      type: jbod
      volumes:
      - deleteClaim: false
        id: 0
        size: 30Gi
        type: persistent-claim
    template:
      pod:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: nodepool
                  operator: In
                  values:
                  - sic-storage
        tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/control-plane
          operator: Exists
    version: 3.7.0
  zookeeper:
    replicas: 1
    storage:
      deleteClaim: false
      size: 10Gi
      type: persistent-claim
    template:
      pod:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: nodepool
                  operator: In
                  values:
                  - sic-storage

cpaika commented 1 month ago

Looks like you're missing broker capacity in your cruise-control configuration: https://github.com/linkedin/cruise-control/blob/main/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/config/BrokerCapacityConfigFileResolver.java#L298

See: https://github.com/linkedin/cruise-control/wiki/Configurations#populating-the-capacity-config-file

This issue probably belongs in the Strimzi project.
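
To check whether a capacity file is incomplete, one option is to inspect it before Cruise Control loads it. The following is a minimal Python sketch, assuming the JSON layout described on the wiki page linked above (a top-level `brokerCapacities` list whose entries carry a `brokerId` and a `capacity` object with `DISK`, `CPU`, `NW_IN`, and `NW_OUT`). The function name `find_capacity_problems` and the sample input are hypothetical, and the thread does not confirm that a missing key is what triggers the `NullPointerException` at line 298.

```python
import json

# Resource keys the Cruise Control wiki lists for each capacity entry.
REQUIRED_RESOURCES = ("DISK", "CPU", "NW_IN", "NW_OUT")


def find_capacity_problems(config_text):
    """Return a list of human-readable problems found in a capacity config.

    Hypothetical helper for illustration; it only checks that the expected
    keys are present, not that their values are valid.
    """
    config = json.loads(config_text)
    entries = config.get("brokerCapacities")
    if not isinstance(entries, list) or not entries:
        return ["missing or empty 'brokerCapacities' list"]

    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not a JSON object")
            continue
        broker = entry.get("brokerId", f"<entry {index}>")
        if "brokerId" not in entry:
            problems.append(f"entry {index}: missing 'brokerId'")
        capacity = entry.get("capacity")
        if not isinstance(capacity, dict):
            problems.append(f"broker {broker}: missing 'capacity' object")
            continue
        # A resource that is absent (or null) is reported by name.
        for resource in REQUIRED_RESOURCES:
            if capacity.get(resource) is None:
                problems.append(f"broker {broker}: missing '{resource}'")
    return problems


# Made-up sample: a default entry (brokerId -1) that lacks NW_OUT.
SAMPLE = """
{
  "brokerCapacities": [
    {
      "brokerId": "-1",
      "capacity": {"DISK": "100000", "CPU": "100", "NW_IN": "10000"}
    }
  ]
}
"""

if __name__ == "__main__":
    for problem in find_capacity_problems(SAMPLE):
        print(problem)
```

With Strimzi, the operator normally generates this file, so it may be simplest to copy the file out of the running pod and pass its contents to the function. A clean result would suggest looking elsewhere, for example at a version mismatch between the custom image and what the operator expects.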

mhratson commented 1 month ago

Thanks @cpaika for the help.