Closed: varun06 closed this issue 5 years ago
@bai have you ever seen this error?
Possibly related? https://github.com/bsm/sarama-cluster/issues/29
Thanks Bai, that's helpful.
Hey @dim, I think I need your help here. I have tried a bunch of things but am still getting the error below in the sarama logs.
ERRO[0098] kafka: error while consuming ewr.kessel-run.mt-raw.1/8: kafka server: The provided member is not known in the current generation. offset=-3 partition=-1 topic=unknown type=kafka
ERRO[0098] kafka: error while consuming ewr.kessel-run.mt-raw.1/1: kafka server: The provided member is not known in the current generation. offset=-3 partition=-1 topic=unknown type=kafka
ERRO[0098] kafka: error while consuming ewr.kessel-run.mt-raw.1/4: kafka server: The provided member is not known in the current generation. offset=-3 partition=-1 topic=unknown type=kafka
Here is my consumer-side config:
cfg.Consumer.Group.Session.Timeout = 20 * time.Second
cfg.Consumer.Group.Heartbeat.Interval = 6 * time.Second
cfg.Consumer.MaxProcessingTime = 500 * time.Millisecond
Redacted logs from the Kafka side:
[2019-03-19 20:22:40,044] INFO [GroupCoordinator 8]: Preparing to rebalance group test-krm-local with old generation 53 (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:22:43,045] INFO [GroupCoordinator 8]: Stabilized group test-krm-local generation 54 (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:22:43,224] INFO [GroupCoordinator 8]: Assignment received from leader for group test-krm-local for generation 54 (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:05,266] INFO [GroupCoordinator 8]: Member kessel-run-mirror-06ae81c4-a739-4ec2-8b94-3656a5ea831e in group test-krm-local has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:05,266] INFO [GroupCoordinator 8]: Preparing to rebalance group test-krm-local with old generation 54 (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:05,266] INFO [GroupCoordinator 8]: Group test-krm-local with generation 55 is now empty (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:18,314] INFO [GroupCoordinator 8]: Preparing to rebalance group test-krm-local with old generation 55 (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:21,316] INFO [GroupCoordinator 8]: Stabilized group test-krm-local generation 56 (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:21,562] INFO [GroupCoordinator 8]: Assignment received from leader for group test-krm-local for generation 56 (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:31,600] INFO [GroupCoordinator 8]: Member kessel-run-mirror-6983da00-2fb0-46c4-a3ae-a9852b2741b6 in group test-krm-local has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:31,600] INFO [GroupCoordinator 8]: Preparing to rebalance group test-krm-local with old generation 56 (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
[2019-03-19 20:23:31,600] INFO [GroupCoordinator 8]: Group test-krm-local with generation 57 is now empty (__consumer_offsets-29) (kafka.coordinator.group.GroupCoordinator)
@varun06 sorry, I have been slightly distracted recently :) My apologies, but I am not 100% sure how to help. Generation errors only happen when a member tries to commit offsets after the session has been closed server side. I assume you are not stopping your consume loop quickly enough after a rebalance is triggered. The broker is then starting a new session with a new generation and giving up on the previous one. I would try to increase the Session.Timeout and see if that makes a difference.
Thanks @dim. I have been playing with the timeouts and that has helped, so the errors are very sporadic now. I am sure it is the way we are committing the offsets: we commit them in batches, and that code has some oddities, as you mentioned.
@varun06 can you please describe and share your timeout values with us? We have the same problem with Sarama.
@1995parham
cfg.Consumer.Group.Session.Timeout = 20 * time.Second
cfg.Consumer.Group.Heartbeat.Interval = 6 * time.Second
cfg.Consumer.MaxProcessingTime = 500 * time.Millisecond
Versions
Sarama Version: 1.21.0
Kafka Version: 1.1.0
Go Version: 1.12
Configuration
Logs
Problem Description
We have a library that provides some abstraction over the sarama consumer group. While using the library I see lots of the above ^^ errors. I have already looked at the session timeout values, but that did not help. @dim can you please help me understand this error and point me towards some next steps?