mailgun / kafka-pixy

gRPC/REST proxy for Kafka
Apache License 2.0

Inexplicable offset manager timeouts #123

Closed horkhe closed 6 years ago

horkhe commented 6 years ago

From time to time we see in the logs that the offset manager's request timeout elapses. A timeout means that we did not commit offsets in time, so more messages than usual may be delivered more than once if Kafka-Pixy crashes. The root cause of this issue is not clear; it should be investigated, understood, and fixed before it gets out of hand.

request timeout 1.500776182s
github.com/mailgun/kafka-pixy/offsetmgr.(*offsetMgr).run
    /go/src/github.com/mailgun/kafka-pixy/offsetmgr/offsetmgr.go:320
github.com/mailgun/kafka-pixy/offsetmgr.(*offsetMgr).(github.com/mailgun/kafka-pixy/offsetmgr.run)-fm
    /go/src/github.com/mailgun/kafka-pixy/offsetmgr/offsetmgr.go:138
github.com/mailgun/kafka-pixy/actor.Spawn.func1
    /go/src/github.com/mailgun/kafka-pixy/actor/actor.go:98
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:2337

The issue becomes more pronounced when several heavily consuming Kafka-Pixy instances are restarted at once (e.g. during a deployment).