jet / kafunk

Kafunk: F# Kafka client
https://jet.github.io/kafunk/

Consumer randomly stops consuming, but continues to commit the stagnant offsets. #155

Closed: schoi80 closed this issue 6 years ago

schoi80 commented 7 years ago

We are currently using the Kafunk v0.1.2 consumer.

We've had a number of occurrences where the consumer randomly stops consuming messages from one or more topic partitions. There are no obvious errors or warnings when this happens, so unfortunately we don't have a deterministic way to reproduce the issue. (FYI, we are spotting this by monitoring the partition log size against the consumer group offset.)
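A minimal sketch of the kind of check described above, assuming samples of each partition's log end offset and the group's committed offset are already being collected periodically. The `Sample` type, the `stalled_partitions` function, and the 180-second threshold are hypothetical names and values chosen for illustration; none of them are part of Kafunk. A partition is flagged only when its committed offset has stayed the same while unconsumed messages were present for the whole window, so an idle partition with nothing to consume is not reported.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sample:
    """One monitoring observation for a single partition (hypothetical shape)."""
    timestamp: float       # seconds since epoch
    log_end_offset: int    # latest offset present in the partition log
    committed_offset: int  # offset last committed by the consumer group


def stalled_partitions(
    history: Dict[int, List[Sample]],
    min_stall_seconds: float = 180.0,  # assumed threshold, tune per workload
) -> List[int]:
    """Return partitions whose committed offset has not moved for at least
    min_stall_seconds while the log still held unconsumed messages."""
    stalled = []
    for partition, samples in sorted(history.items()):
        if len(samples) < 2:
            continue  # not enough data to judge
        latest = samples[-1]
        stall_start = latest.timestamp
        # Walk back through older samples for as long as the committed
        # offset is unchanged and the partition still shows lag.
        for s in reversed(samples):
            same_commit = s.committed_offset == latest.committed_offset
            has_lag = s.log_end_offset > s.committed_offset
            if not (same_commit and has_lag):
                break
            stall_start = s.timestamp
        if (latest.log_end_offset > latest.committed_offset
                and latest.timestamp - stall_start >= min_stall_seconds):
            stalled.append(partition)
    return stalled
```

Feeding it one list of samples per partition (oldest first) returns the partition numbers that look stuck, which can then be cross-checked against the consumer logs.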

Typically, within the first 24 hours of runtime, we spot this behavior in at least one partition or consumer.

Here is example log output from when this behavior was observed. In the log you can see that the consumer is handling batches of messages up until 10:12:20.538 PM. Shortly after that, the same offsets are committed each minute. (FYI, I also have evidence that partitions 18 and 19 had plenty of messages at the time.)

Happy to provide additional detail as needed.

-- Added 7/6/17 -- Here is another example log output.

schoi80 commented 7 years ago

Shortly before the consumer hang was detected, I found the following Kafunk errors/warnings in the log:

eulerfx commented 6 years ago

This should be addressed as of 0.1.5