linkedin / Burrow

Kafka Consumer Lag Checking
Apache License 2.0

SIGSEGV on startup with Confluent Kafka #353

Open pint2oo opened 6 years ago

pint2oo commented 6 years ago

Hi there,

Burrow panics at startup with Confluent Kafka 3.2.2 (which implements Apache Kafka 0.10.2.1). I'm running the latest version of Burrow, which includes the fix from #345.

burrow.out :

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x859ffa]

goroutine 127 [running]:
github.com/linkedin/Burrow/core/internal/consumer.(*KafkaClient).processConsumerOffsetsMessage(0xc42023c090, 0x0)
        $GOPATH/src/github.com/linkedin/Burrow/core/internal/consumer/kafka_client.go:233 +0x3a
github.com/linkedin/Burrow/core/internal/consumer.(*KafkaClient).partitionConsumer(0xc42023c090, 0xcbd2c0, 0xc420aa8380)
        $GOPATH/src/github.com/linkedin/Burrow/core/internal/consumer/kafka_client.go:170 +0x562
created by github.com/linkedin/Burrow/core/internal/consumer.(*KafkaClient).startKafkaConsumer
        $GOPATH/src/github.com/linkedin/Burrow/core/internal/consumer/kafka_client.go:225 +0x5af

burrow.log :

{"level":"info","ts":1518192593.3013716,"msg":"Started Burrow"}
{"level":"info","ts":1518192593.3016908,"msg":"configuring","type":"coordinator","name":"zookeeper"}
{"level":"info","ts":1518192593.3028903,"msg":"configuring","type":"coordinator","name":"storage"}
{"level":"info","ts":1518192593.3029835,"msg":"configuring","type":"module","coordinator":"storage","class":"inmemory","name":"default"}
{"level":"info","ts":1518192593.3031235,"msg":"configuring","type":"coordinator","name":"evaluator"}
{"level":"info","ts":1518192593.3031826,"msg":"configuring","type":"module","coordinator":"evaluator","class":"caching","name":"default"}
{"level":"info","ts":1518192593.3032408,"msg":"configuring","type":"coordinator","name":"httpserver"}
{"level":"info","ts":1518192593.3033948,"msg":"configuring","type":"coordinator","name":"notifier"}
{"level":"info","ts":1518192593.3034384,"msg":"configuring","type":"coordinator","name":"cluster"}
{"level":"info","ts":1518192593.3034916,"msg":"configuring","type":"module","coordinator":"cluster","class":"kafka","name":"local"}
{"level":"info","ts":1518192593.3040285,"msg":"configuring","type":"coordinator","name":"consumer"}
{"level":"info","ts":1518192593.3041415,"msg":"configuring","type":"module","coordinator":"consumer","class":"kafka_zk","name":"local_zk"}
{"level":"info","ts":1518192593.304852,"msg":"configuring","type":"module","coordinator":"consumer","class":"kafka","name":"local"}
{"level":"info","ts":1518192593.3053997,"msg":"starting","type":"coordinator","name":"zookeeper"}
{"level":"info","ts":1518192593.3076718,"msg":"Connected to 127.0.0.1:2181","type":"coordinator","name":"zookeeper"}
{"level":"info","ts":1518192593.3122435,"msg":"Authenticated: id=99496255347884036, timeout=6000","type":"coordinator","name":"zookeeper"}
{"level":"info","ts":1518192593.312441,"msg":"Re-submitting `0` credentials after reconnect","type":"coordinator","name":"zookeeper"}
{"level":"info","ts":1518192593.3162775,"msg":"starting","type":"coordinator","name":"storage"}
{"level":"info","ts":1518192593.3163555,"msg":"starting","type":"module","coordinator":"storage","class":"inmemory","name":"default"}
{"level":"info","ts":1518192593.316457,"msg":"starting","type":"coordinator","name":"evaluator"}
{"level":"info","ts":1518192593.3164802,"msg":"starting","type":"module","coordinator":"evaluator","class":"caching","name":"default"}
{"level":"info","ts":1518192593.3164997,"msg":"starting","type":"coordinator","name":"httpserver"}
{"level":"info","ts":1518192593.316799,"msg":"started listener","type":"coordinator","name":"httpserver","listener":"[::]:8000"}
{"level":"info","ts":1518192593.3168387,"msg":"starting","type":"coordinator","name":"notifier"}
{"level":"info","ts":1518192593.3168676,"msg":"starting","type":"coordinator","name":"cluster"}
{"level":"info","ts":1518192593.3168843,"msg":"starting","type":"module","coordinator":"cluster","class":"kafka","name":"local"}
{"level":"info","ts":1518192593.3624427,"msg":"starting","type":"coordinator","name":"consumer"}
{"level":"info","ts":1518192593.3625522,"msg":"starting","type":"module","coordinator":"consumer","class":"kafka_zk","name":"local_zk"}
{"level":"info","ts":1518192593.364876,"msg":"Failed to connect to [::1]:2181: dial tcp [::1]:2181: connect: network is unreachable","type":"module","coordinator":"consumer","class":"kafka_zk","name":"local_zk"}
{"level":"info","ts":1518192593.3651643,"msg":"Connected to 127.0.0.1:2181","type":"module","coordinator":"consumer","class":"kafka_zk","name":"local_zk"}
{"level":"info","ts":1518192593.3692389,"msg":"Authenticated: id=99496255347884037, timeout=30000","type":"module","coordinator":"consumer","class":"kafka_zk","name":"local_zk"}
{"level":"info","ts":1518192593.369343,"msg":"Re-submitting `0` credentials after reconnect","type":"module","coordinator":"consumer","class":"kafka_zk","name":"local_zk"}
{"level":"info","ts":1518192593.374514,"msg":"starting","type":"module","coordinator":"consumer","class":"kafka","name":"local"}
{"level":"info","ts":1518192593.3837266,"msg":"starting consumers","type":"module","coordinator":"consumer","class":"kafka","name":"local","topic":"__consumer_offsets","count":50}
{"level":"info","ts":1518192593.4269505,"msg":"starting evaluations","type":"coordinator","name":"notifier"}

burrow.toml :

[general]
pidfile="burrow.pid"
stdout-logfile="burrow.out"
access-control-allow-origin="mysite.example.com"

[logging]
filename="logs/burrow.log"
level="info"
maxsize=100
maxbackups=30
maxage=10
use-localtime=false
use-compression=true

[zookeeper]
servers=[ "localhost:2181" ]
timeout=6
root-path="/burrow"

[client-profile.test]
client-id="burrow-test"
kafka-version="0.10.2.1"

[cluster.local]
class-name="kafka"
servers=[ "localhost:9092" ]
client-profile="test"
topic-refresh=60
offset-refresh=15

[consumer.local]
class-name="kafka"
cluster="local"
servers=[ "localhost:9092" ]
client-profile="test"
group-whitelist=""
start-latest=false

[consumer.local_zk]
class-name="kafka_zk"
cluster="local"
servers=["localhost:2181" ]
zookeeper-path=""
zookeeper-timeout=30
group-whitelist=""

[httpserver.default]
address=":8000"

[storage.default]
class-name="inmemory"
workers=5
intervals=10
expire-group=604800
min-distance=0

I ran some tests with the same configuration against the latest Apache Kafka 1.0.0 on the same machine, and Burrow ran smoothly.

toddpalino commented 6 years ago

The nil pointer dereference happens because Burrow is unable to start the consumers for the __consumer_offsets topic. This could be due to ACL issues, or because the topic doesn't exist yet (it's only created after the first consumer group starts up).