strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

[Bug]: Kafka Exporter pod restarting frequently #10748

Open Nitaksh opened 3 hours ago

Nitaksh commented 3 hours ago

Bug Description

The Kafka Exporter pod restarts very frequently while the Kafka cluster is in use (i.e., while consumer groups are being created and deleted often). CPU and memory usage are both well within limits, so resource exhaustion is not the cause. The pod does not restart when the cluster is idle.

Steps to reproduce

No response

Expected behavior

The Kafka Exporter pod should not restart while the cluster is in use.

Strimzi version

0.42.0

Kubernetes version

1.25.9

Installation method

Helm Chart

Infrastructure

Bare-Metal

Configuration files and logs

Deployment (only the relevant parts are included here):

kind: Kafka
metadata:
  name: strimzi-cluster
  namespace: strimzi
  annotations:
    strimzi.io/node-pools: enabled
    strimzi.io/kraft: enabled
spec:
  kafkaExporter:
    groupRegex: ".*"
    topicRegex: ".*"
    groupExcludeRegex: "^excluded-.*"
    topicExcludeRegex: "^excluded-.*"
    showAllOffsets: false
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
      limits:
        cpu: 500m
        memory: 512Mi
    logging: debug
    template:
      pod:
        metadata:
          annotations:
            prometheus.io/path: /metrics
            prometheus.io/port: "9404"
            prometheus.io/scrape: "true"
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                  - key: ****
                    operator: In
                    values:
                    - ****

Container logs:


I1022 09:20:23.442208      11 kafka_exporter.go:800] Starting kafka_exporter (version=1.7.0, branch=HEAD, revision=7e840e81a0170375214e2c1e1dc7ce94aeff8712)
I1022 09:20:23.442272      11 kafka_exporter.go:801] Build context (go=go1.20.4, platform=linux/amd64, user=root@e0e313c68f71, date=20230524-04:23:04, tags=netgo)
I1022 09:20:23.458395      11 kafka_exporter.go:971] Listening on HTTP :9404
I1022 09:20:51.222314      11 kafka_exporter.go:383] Refreshing client metadata
I1022 09:20:51.540687      11 kafka_exporter.go:658] Fetching consumer group metrics
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x9eeffa]
goroutine 30 [running]:
main.(*Exporter).collect.func4(0xc000203880)
 /app/kafka_exporter.go:586 +0xb1a
created by main.(*Exporter).collect
 /app/kafka_exporter.go:662 +0x9b0

scholzj commented 3 hours ago

This is likely either related to some bug in the Kafka Exporter itself or to something happening in your Kafka cluster. In any case, it does not seem like a Strimzi bug.

Nitaksh commented 3 hours ago

OK, I'll open an issue in the Kafka Exporter repository then: https://github.com/danielqsj/kafka_exporter