Open dasaripravin-developer opened 11 months ago
I've seen this happen in new clusters when the offsets.topic.replication.factor
(default: 3) is greater than the actual number of brokers.
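A quick way to test this theory is to compare the broker count with the configured factor. The sketch below assumes a connected KafkaJS admin client and that you already know the value of offsets.topic.replication.factor from your broker configuration; the function names are made up for illustration.

```javascript
// Pure check: the internal offsets topic can only be created when there are
// at least as many brokers as its replication factor.
function offsetsTopicCanBeCreated(brokerCount, replicationFactor) {
  return brokerCount >= replicationFactor;
}

// Hypothetical helper: `admin` is a connected KafkaJS admin client
// (kafka.admin() followed by admin.connect()).
// `replicationFactor` must be taken from the broker config; 3 is Kafka's default.
async function checkOffsetsReplication(admin, replicationFactor = 3) {
  const { brokers } = await admin.describeCluster();
  return {
    brokerCount: brokers.length,
    replicationFactor,
    ok: offsetsTopicCanBeCreated(brokers.length, replicationFactor),
  };
}
```

If `ok` is false, either add brokers or lower offsets.topic.replication.factor on the brokers before the first consumer group is created.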
I am getting this error when starting my consumer. The producer is working fine and publishing messages. How can I consume messages? Can someone please help me resolve it?
FYI, my topic has a replication factor of 3 and 2 partitions.
Error Msg 1
{"level":"ERROR","timestamp":"2023-12-20T19:16:12.330Z","logger":"kafkajs","message":"[Connection] Response GroupCoordinator(key: 10, version: 2)","broker":"ec2-54-90-120-75.compute-1.amazonaws.com:9096","clientId":"my-app","error":"Not authorized to access group: Group authorization failed","correlationId":0,"size":49}
Error Msg 2
{"level":"ERROR","timestamp":"2023-12-20T19:16:12.330Z","logger":"kafkajs","message":"[Consumer] Crash: KafkaJSGroupCoordinatorNotFound: Failed to find group coordinator","groupId":"my-app","stack":"KafkaJSGroupCoordinatorNotFound: Failed to find group coordinator
at Cluster.findGroupCoordinatorMetadata (/app/node_modules/kafkajs/src/cluster/index.js:420:11)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async /app/node_modules/kafkajs/src/cluster/index.js:346:33
at async [private:ConsumerGroup:join] (/app/node_modules/kafkajs/src/consumer/consumerGroup.js:167:24)
at async /app/node_modules/kafkajs/src/consumer/consumerGroup.js:335:9
at async Runner.start (/app/node_modules/kafkajs/src/consumer/runner.js:84:7)
at async start (/app/node_modules/kafkajs/src/consumer/index.js:243:7)
at async Object.run (/app/node_modules/kafkajs/src/consumer/index.js:304:5)
at async Object.startConsumer (/app/server/kafka-server.js:47:9)
at async /app/server/app.js:477:5"}
@arupsarkar-sfdc did you find a solution?
This issue also happened to me; worst part was that the consumer did not recover. Normally KafkaJS will just retry whatever failed and eventually get back to healthy again, but in this case I had to restart the entire app.
I can add some more details, since the issue reappeared. Here are our logs:
Jul 3, 2024 @ 14:16:37.355
NS: Connection, label: ERROR, message: Connection error: getaddrinfo ENOTFOUND kafka-controller-1.kafka-controller-headless.kafka.svc.cluster.local
This is a correct error message; we indeed had a network wobble in our Kafka cluster. KafkaJS detected this correctly, and followed up with:
Jul 3, 2024 @ 14:16:37.367
NS: Consumer, label: ERROR, message: Crash: KafkaJSNumberOfRetriesExceeded: Connection error: getaddrinfo ENOTFOUND kafka-controller-1.kafka-controller-headless.kafka.svc.cluster.local
So far so good, KafkaJS tries to recover:
Jul 3, 2024 @ 14:16:37.374
NS: Consumer, label: ERROR, message: Restarting the consumer in 10942ms
Jul 3, 2024 @ 14:16:48.322
NS: Consumer, label: INFO, message: Starting
So we're seeing a fresh connection attempt, this is as expected. Unfortunately the cluster was still down, so we got:
Jul 3, 2024 @ 14:17:11.120
NS: Consumer, label: ERROR, message: Crash: KafkaJSGroupCoordinatorNotFound: Failed to find group coordinator
Expected outcome:
The consumer stops once more, tries again after a few seconds
Actual outcome:
The consumer stops after KafkaJSGroupCoordinatorNotFound and does not recover.
From then on the app ran without a consumer (while still reporting healthy) and had to be restarted manually. After the restart, KafkaJS was working correctly once more.
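One possible workaround for the "running without a consumer but reporting healthy" state is to listen for the consumer's crash event and fail loudly when KafkaJS is not going to restart on its own. This is only a sketch: it assumes the `consumer.crash` instrumentation event with a payload of `error`, `groupId`, and `restart`, and `attachCrashHandler` and `onFatal` are hypothetical names.

```javascript
// Hypothetical helper: calls `onFatal` only when the consumer crashed and
// KafkaJS reports that it will NOT restart by itself.
// `consumer` is a KafkaJS consumer (kafka.consumer({ groupId })).
function attachCrashHandler(consumer, onFatal) {
  // consumer.on returns a function that removes the listener again.
  return consumer.on(consumer.events.CRASH, (event) => {
    const { error, groupId, restart } = event.payload;
    if (restart) {
      return; // KafkaJS has scheduled its own restart; nothing to do.
    }
    onFatal(error, groupId);
  });
}

// Example wiring (assumption: exiting lets a supervisor restart the process):
// attachCrashHandler(consumer, (error, groupId) => {
//   console.error(`Consumer for group ${groupId} will not restart: ${error.name}`);
//   process.exit(1);
// });
```

Exiting the process is a blunt choice, but it turns a silent failure into one that a process manager or a health check can see. Flipping a health flag inside `onFatal` would be a gentler alternative.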
I received this error in a local environment and fixed it by setting the Docker IP address in the docker-compose file:
environment:
  KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:9092,PLAINTEXT_HOST://DOCKER_IP:29092
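The reason this matters: brokers hand their advertised addresses back to clients in metadata and coordinator lookups, so a client outside the compose network has to be able to reach the PLAINTEXT_HOST address. A minimal sketch of the matching client settings, assuming the app runs on the Docker host; `hostClientConfig` is a made-up helper, 'my-app' is just the client ID from the logs above, and DOCKER_IP stays a placeholder:

```javascript
// Hypothetical helper: builds KafkaJS client options that point at the
// PLAINTEXT_HOST listener advertised in the compose file (port 29092).
function hostClientConfig(dockerIp, port = 29092) {
  return {
    clientId: 'my-app',
    brokers: [`${dockerIp}:${port}`],
  };
}

// Wiring (requires the kafkajs package; replace DOCKER_IP with the real address):
// const { Kafka } = require('kafkajs');
// const kafka = new Kafka(hostClientConfig('DOCKER_IP'));
```

Containers on the same compose network would use broker:9092 instead.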
I have a 3-node consumer group and am facing the "Failed to find group coordinator" error, after which the consumer stops. Can anyone help me figure out this issue? The details are below.
Error message from the console:
{"level":"ERROR","timestamp":"2023-10-06T12:00:50.029Z","logger":"kafkajs","message":"[Consumer] Crash: KafkaJSGroupCoordinatorNotFound: Failed to find group coordinator","groupId":"NODEKAFKA","stack":"KafkaJSGroupCoordinatorNotFound: Failed to find group coordinator\n at Cluster.findGroupCoordinatorMetadata"}
Expected behavior: The group member should get the group coordinator.
Observed behavior: The consumer fails to find the group coordinator and stops.
Environment:
Additional context: The application is running in a Kubernetes pod.
Please let me know if you need more details.