linkedin / Burrow

Kafka Consumer Lag Checking

burrow 1.0 reports only one consumer while two consumers are active #319

Open psychonaut opened 6 years ago

psychonaut commented 6 years ago

I am migrating our configuration to Burrow 1.0 and I've noticed that only one consumer is available:

 [kkrzyzaniak@kafka01:~]$ curl --silent http://localhost:8000/v3/kafka/central/consumer
 {
   "error": false,
   "message": "consumer list returned",
   "consumers": [
     "logstash-turbo"
   ],
   "request": {
     "url": "/v3/kafka/central/consumer",
     "host": "kafka01"
   }
 }

While there are two active consumers:

 [kkrzyzaniak@kafka01:~]$ /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
 Note: This will only show information about consumers that use the Java consumer API (non-ZooKeeper-based consumers).

 logstash
 logstash-turbo

My config:

[general]
pidfile="/run/burrow/burrow.pid"
stdout-logfile="/var/log/burrow/burrow.out"
access-control-allow-origin=""

[logging]
filename="/var/log/burrow/burrow.log"
level="info"
maxsize=100
maxbackups=30
maxage=10
use-localtime=false
use-compression=true

[zookeeper]
servers=[ "zk01:2181", "zk02:2181", "zk03:2181" ]
timeout=6
root-path="/burrow"

[client-profile.test]
client-id="burrow-test"
kafka-version="0.10.0"

[cluster.central]
class-name="kafka"
servers=[ "kafka01:9092", "kafka02:9092", "kafka03:9092" ]
client-profile="test"
topic-refresh=120
offset-refresh=30

[consumer.central]
class-name="kafka"
cluster="central"
servers=[ "kafka01:9092", "kafka02:9092", "kafka03:9092" ]
client-profile="test"
group-blacklist=""
group-whitelist=""

[httpserver.default]
address=":8000"

[storage.default]
class-name="inmemory"
workers=20
intervals=15
expire-group=604800
min-distance=1

charkost commented 6 years ago

This happens to me as well when I start a new consumer group but never produce any new messages to the target topic. The new consumer group appears in Burrow's consumer list for the first 2 minutes and then disappears. However, in kafka-consumer-groups.sh's list the new consumer group remains present.

psychonaut commented 6 years ago

In my case both consumers are very active.

toddpalino commented 6 years ago

If you switch the logging to DEBUG (either by changing the config and bouncing, or by a POST to "/v3/admin/loglevel" with the payload {"level": "debug"}) you're going to see every single offset and whether it was accepted or rejected, and why. This should help to shed some light on what's happening to that consumer group's offsets.
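For example, with the httpserver address from the config above (":8000"), the switch should be something along these lines:

 curl -s -X POST http://localhost:8000/v3/admin/loglevel -d '{"level": "debug"}'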

toddpalino commented 6 years ago

Any update after turning on debug logging?

psychonaut commented 6 years ago

There's a lot of info after switching debug on, but I don't know where to look (I don't see any obvious errors). Grepping for logstash gives me:

{"level":"info","ts":1514905302.7366343,"msg":"configuring","type":"module","coordinator":"consumer","class":"kafka","name":"logstash"} {"level":"info","ts":1514905302.8015866,"msg":"starting","type":"module","coordinator":"consumer","class":"kafka","name":"logstash"} {"level":"info","ts":1514905302.8081093,"msg":"starting consumers","type":"module","coordinator":"consumer","class":"kafka","name":"logstash","topic":"__consumer_offsets","count":50} {"level":"warn","ts":1514905476.1259022,"msg":"unknown consumer","type":"module","coordinator":"storage","class":"inmemory","name":"default","worker":2,"cluster":"central","consumer":"logstash","topic":"","partition":0,"topic_partition_count":0,"offset":0,"timestamp":0,"owner":"","request":"StorageFetchConsumer"} {"level":"info","ts":1514905558.3200693,"msg":"stopping","type":"module","coordinator":"consumer","class":"kafka","name":"logstash"} {"level":"warn","ts":1514905610.0265548,"msg":"unknown consumer","type":"module","coordinator":"storage","class":"inmemory","name":"default","worker":2,"cluster":"central","consumer":"logstash","topic":"","partition":0,"topic_partition_count":0,"offset":0,"timestamp":0,"owner":"","request":"StorageFetchConsumer"} {"level":"warn","ts":1514905612.3299513,"msg":"unknown consumer","type":"module","coordinator":"storage","class":"inmemory","name":"default","worker":2,"cluster":"central","consumer":"logstash","topic":"","partition":0,"topic_partition_count":0,"offset":0,"timestamp":0,"owner":"","request":"StorageFetchConsumer"} {"level":"warn","ts":1514991160.2250674,"msg":"unknown consumer","type":"module","coordinator":"storage","class":"inmemory","name":"default","worker":2,"cluster":"central","consumer":"logstash","topic":"","partition":0,"topic_partition_count":0,"offset":0,"timestamp":0,"owner":"","request":"StorageFetchConsumer"} {"level":"info","ts":1514991160.2251618,"msg":"cluster or consumer not found","type":"module","coordinator":"evaluator","class":"caching","name":"default","cluster":"central","consumer":"logstash","showall":false} {"level":"info","ts":1517307116.4562917,"msg":"cluster or consumer not found","type":"module","coordinator":"evaluator","class":"caching","name":"default","cluster":"central","consumer":"logstash","showall":false} {"level":"warn","ts":1517415250.064136,"msg":"unknown consumer","type":"module","coordinator":"storage","class":"inmemory","name":"default","worker":2,"cluster":"central","consumer":"logstash","topic":"","partition":0,"topic_partition_count":0,"offset":0,"timestamp":0,"owner":"","request":"StorageFetchConsumer"} {"level":"debug","ts":1517415250.0641832,"msg":"evaluation result","type":"module","coordinator":"evaluator","class":"caching","name":"default","cluster":"central","consumer":"logstash","status":"NOTFOUND"} {"level":"info","ts":1517415250.0642018,"msg":"cluster or consumer not found","type":"module","coordinator":"evaluator","class":"caching","name":"default","cluster":"central","consumer":"logstash","showall":false}

toddpalino commented 6 years ago

This log doesn't seem to match up with your config above. I note that the first line is starting a consumer module named "logstash", which isn't in your config.

However, at the end of the day, this log segment doesn't show any offsets captured for a group named "logstash". Even though the Kafka CLI tools say that this group exists, Burrow will only know about it if it is committing offsets.
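For example, something along these lines (reusing the broker address and CLI from earlier in the thread, which may differ in your environment) would show whether the "logstash" group is actually committing; if the committed offsets never advance between runs, Burrow will never see an offset for it:

 /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group logstash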

psychonaut commented 6 years ago

The config is an exact copy & paste from the node where I ran this debug.

maverickagm commented 6 years ago

I have the same issue. Only 2 of my 3 consumers show up. I can run the following and see the missing consumer committing offsets:

watch -n1 -t  "kafka-run-class kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9093 --command-config /etc/kafka/consumer.properties --new-consumer --group the_missing_consumer_group --describe 2>/dev/null"

Grepping for the missing consumer in the debug logs doesn't yield anything.
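(The grep was along the lines of the following, assuming a log path similar to the config posted above; it returns nothing for that group:)

 grep '"consumer":"the_missing_consumer_group"' /var/log/burrow/burrow.log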

psychonaut commented 6 years ago

Looks like upgrading to Burrow 1.1.0 fixed my issue:

[kkrzyzaniak@kafka01.heka-elastic-kibana:~]$ http GET :8000/v3/kafka/central/consumer
{
    "consumers": [
        "logstash-turbo", 
        "logstash"
    ], 
    "error": false, 
    "message": "consumer list returned", 
    "request": {
        "host": "kafka01", 
        "url": "/v3/kafka/central/consumer"
    }
}