Excessive (10GB+) memory usage for Kafka exporter

Our Kafka cluster has around 10 brokers and only a few topics and partitions, but thousands of consumers (groups).

We have deployed Kafka using the Strimzi Kafka Operator Helm chart (0.34.0).

This comes equipped with a kafka-exporter instance.

When trying to scrape the exposed endpoint, the following events are logged in the pod description:

The pods try to consume absolutely bogus amounts of memory, causing them all to get evicted.

The following is our Kafka CRD (again, using Strimzi operator):

We haven't really set any custom configuration, but I do not see how this amount of memory consumption should even be possible.

Any hints as to where the potential memory leak could be / where we might have set it up wrong?

FYI, we did add generous amounts of resources in the Kafka CRD.

EDIT:

I have identified an extremely large number of consumer group offsets being stored in the cluster and therefore processed by the exporter. We have spun consumers up and down with unique group IDs many times, creating millions of groups over time.
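To check for this kind of buildup, it can help to count consumer groups by state. The following is a minimal sketch, not part of the original setup: `count_groups_by_state` is a hypothetical helper, and it assumes that `kafka-consumer-groups.sh --list --state` prints one group per row with the state in the last column, as it does in Kafka 3.x.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the output of
#   kafka-consumer-groups.sh --list --state
# on stdin and prints "<state> <count>" for each state, sorted by state name.
# Assumptions: one group per row, state in the last column, and a header row
# whose first field is "GROUP" (a group literally named "GROUP" would be skipped).
count_groups_by_state() {
  awk 'NF >= 2 && $1 != "GROUP" { n[$NF]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage inside a broker pod (commented out, needs a running cluster):
# /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list --state | count_groups_by_state
```

A very large count for the `Empty` state would point to leftover groups like the ones described above.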

I deleted all inactive consumer groups, as they congested the exporter, by executing the following script inside a broker:

# Delete every consumer group whose state is "Empty" (no active members).
# "--describe --state" prints one row per group; in Kafka 3.x the state is the second-to-last column.
/opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list | while read -r GROUP; do \
  STATUS=$(/opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group "$GROUP" --state 2>/dev/null | awk 'NF {s=$(NF-1)} END {print s}'); \
  if [ "$STATUS" = "Empty" ]; then \
    echo "Deleting consumer group: $GROUP"; \
    /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --delete --group "$GROUP"; \
  else \
    echo "Skipping active consumer group: $GROUP"; \
  fi; \
done
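The loop above starts a separate JVM for every group, which gets slow with millions of groups. A possible faster variant, as a sketch only: list all groups with their state in one call and delete the empty ones in batches. `delete_args` is a hypothetical helper. The sketch assumes the `--list --state` output format of Kafka 3.x (state in the last column), group names without whitespace or quotes, and that `--delete` accepts several `--group` options in one call.

```shell
#!/usr/bin/env bash
KCG=/opt/kafka/bin/kafka-consumer-groups.sh
BOOTSTRAP=localhost:9092

# Hypothetical helper: reads `--list --state` output on stdin and, for each
# group whose state (last column) is "Empty", prints "--group" and the group
# name on separate lines so that xargs passes them as two arguments.
delete_args() {
  awk 'NF >= 2 && $NF == "Empty" { print "--group"; print $1 }'
}

# Usage inside a broker pod (commented out, needs a running cluster).
# -n 200 passes 200 arguments, i.e. 100 "--group <name>" pairs, per call:
# "$KCG" --bootstrap-server "$BOOTSTRAP" --list --state \
#   | delete_args \
#   | xargs -n 200 "$KCG" --bootstrap-server "$BOOTSTRAP" --delete
```

The batch size must stay even so that each `--group` flag stays together with its group name.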

This issue will be closed, as it was in fact just an absurd amount of data. We could of course also limit the exporter to not scrape the offsets at all.
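For the Strimzi deployment, one way to limit the exporter might be the `groupRegex` field under `spec.kafkaExporter` in the Kafka resource. This restricts which consumer groups are exported; it does not disable offset scraping altogether. The sketch below only builds a JSON merge patch. `exporter_group_patch`, the cluster name `my-cluster`, the namespace `kafka`, and the example regex are placeholders, not values from our setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a JSON merge patch that sets
# spec.kafkaExporter.groupRegex to the regex given as $1.
# Assumption: the regex contains no double quotes or backslashes,
# since no JSON escaping is done here.
exporter_group_patch() {
  printf '{"spec":{"kafkaExporter":{"groupRegex":"%s"}}}\n' "$1"
}

# Usage (commented out; cluster name and namespace are placeholders):
# kubectl -n kafka patch kafka my-cluster --type merge \
#   -p "$(exporter_group_patch '^my-app-.*')"
```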

danielqsj / kafka_exporter

Excessive (10GB+) memory usage for Kafka exporter #385