confluentinc / confluent-kafka-dotnet

Confluent's Apache Kafka .NET client
https://github.com/confluentinc/confluent-kafka-dotnet/wiki
Apache License 2.0

Consumer keeps writing "[[thrd:GroupCoordinator]: 17/17 brokers are down]" #2051

Closed ksdvishnukumar closed 1 year ago

ksdvishnukumar commented 1 year ago

Description

@mhowlett and all, I am using version 1.9.3 of the Confluent Kafka package to connect to a Kafka-enabled Azure Event Hub. When I run the consumer and it sits idle for a while, I keep getting this error at a 5-minute interval.

I use the following main properties in the consumer, in addition to the GroupId and broker endpoint configuration:

```
ConnectionsMaxIdleMs = 180000
MaxPartitionFetchBytes = 20971520
SocketNagleDisable = true    // https://github.com/Azure/azure-event-hubs-for-kafka/issues/139
SocketKeepaliveEnable = true
MetadataMaxAgeMs = 180000    // https://learn.microsoft.com/en-us/azure/event-hubs/apache-kafka-configurations
SessionTimeoutMs = 30000
MaxPollIntervalMs = 300000
```
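For reference, a minimal sketch of how these properties map onto a `ConsumerConfig`; the bootstrap server, group id, and SASL settings below are placeholders, not the reporter's actual values:

```csharp
using Confluent.Kafka;

var config = new ConsumerConfig
{
    // Placeholder connection settings; real values come from the Event Hubs namespace.
    BootstrapServers = "my-namespace.servicebus.windows.net:9093",
    GroupId = "my-consumer-group",
    SecurityProtocol = SecurityProtocol.SaslSsl,
    SaslMechanism = SaslMechanism.Plain,

    // Properties reported in this issue.
    ConnectionsMaxIdleMs = 180000,
    MaxPartitionFetchBytes = 20971520,
    SocketNagleDisable = true,      // https://github.com/Azure/azure-event-hubs-for-kafka/issues/139
    SocketKeepaliveEnable = true,
    MetadataMaxAgeMs = 180000,      // per the Event Hubs Kafka configuration guidance
    SessionTimeoutMs = 30000,
    MaxPollIntervalMs = 300000,
};
```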

```
warn: Logger.Logger[0] Producer Log Handler : WARNING|09-05-2023T15:14:16.6643173Z|CONFWARN|rdkafka#producer-1|[[thrd:app]: Configuration property allow.auto.create.topics is a consumer property and will be ignored by this producer instance]
info: Logger.Logger[0] Starting Service
info: Logger.Logger[0] Preparing the Consumer Configuration
info: Microsoft.Hosting.Lifetime[0] Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0] Hosting environment: Production
info: Microsoft.Hosting.Lifetime[0] Content root path: /app
info: Logger.Logger[0] notifications partitions incrementally assigned: [7,8,9], all: [7,8,9] at 09-05-2023T15:14:22.8092982Z
info: Logger.Logger[0] Component is running
info: Logger.Logger[0] topic1 partitions incrementally assigned: [0], all: [0] at 09-05-2023T15:14:34.2252078Z
info: Logger.Logger[0] topic2 partitions incrementally assigned: [1], all: [1] at 09-05-2023T15:14:34.2253958Z
info: Logger.Logger[0] Processing 100 avro messages from topic1
  ... (this "Processing 100 avro messages from topic1" line repeats several more times) ...
info: Logger.Logger[0] topic1 partitions incrementally revoked: [0], remaining: [] at 09-05-2023T15:19:37.2809989Z
warn: Logger.Logger[0] Consumer Log Handler : WARNING|09-05-2023T15:21:18.7592137Z|MAXPOLL|rdkafka#consumer-3|[[thrd:main]: Application maximum poll interval (300000ms) exceeded by 365ms (adjust max.poll.interval.ms for long-running message processing): leaving group]
info: Logger.Logger[0] topic1 partitions incrementally assigned: [0], all: [0] at 09-05-2023T15:21:18.7716243Z
info: Logger.Logger[0] topic1 partitions incrementally assigned: [0], all: [0] at 09-05-2023T15:21:18.7777817Z
info: Logger.Logger[0] Processing 100 avro messages from topic1
warn: Logger.Logger[0] Consumer Log Handler : WARNING|09-05-2023T15:26:22.7849415Z|MAXPOLL|rdkafka#consumer-4|[[thrd:main]: Application maximum poll interval (300000ms) exceeded by 21ms (adjust max.poll.interval.ms for long-running message processing): leaving group]
info: Logger.Logger[0] Component is running
info: Logger.Logger[0] topic1 partitions incrementally revoked: [0], remaining: [] at 09-05-2023T15:26:24.7971394Z
info: Logger.Logger[0] topic1 partitions incrementally assigned: [0], all: [0] at 09-05-2023T15:28:49.3316718Z
info: Logger.Logger[0] topic1 partitions incrementally revoked: [0], remaining: [] at 09-05-2023T15:33:52.3673982Z
info: Logger.Logger[0] topic1 partitions incrementally assigned: [0], all: [0] at 09-05-2023T15:33:52.7262563Z
info: Logger.Logger[0] Consumer Log Handler : ERROR|09-05-2023T17:22:29.0036943Z|ERROR|rdkafka#consumer-3|[[thrd:GroupCoordinator]: 17/17 brokers are down]
info: Logger.Logger[0] Consumer Log Handler : ERROR|09-05-2023T17:28:29.3781275Z|ERROR|rdkafka#consumer-3|[[thrd:GroupCoordinator]: 17/17 brokers are down]
info: Logger.Logger[0] Consumer Log Handler : ERROR|09-05-2023T17:32:30.4782541Z|ERROR|rdkafka#consumer-4|[[thrd:GroupCoordinator]: 17/17 brokers are down]
  ... (the same "[thrd:GroupCoordinator]: 17/17 brokers are down" error repeats at roughly 5-minute intervals, alternating between consumer-3 and consumer-4) ...
info: Logger.Logger[0] Consumer Log Handler : ERROR|09-05-2023T20:02:41.7090156Z|ERROR|rdkafka#consumer-4|[[thrd:GroupCoordinator]: 17/17 brokers are down]
info: Logger.Logger[0] Consumer Log Handler : ERROR|09-05-2023T20:07:40.0311642Z|ERROR|rdkafka#consumer-3|[[thrd:GroupCoordinator]: 17/17 brokers are down]
```

How to reproduce

Start the consumer and let it process messages. After the consumer has been idle for a longer time (because there are no messages to read), it reports `[[thrd:GroupCoordinator]: 17/17 brokers are down]`.


Could you assist in resolving this error?

ksdvishnukumar commented 1 year ago

@edenhill, @mhowlett, @ijuma, @codyaray and @skpabba, it would be very helpful if you have come across this issue or know of a solution for it.

Thanks in advance!!!

sdicont commented 1 year ago

AFAIK it's an old issue. Even though it logs those messages, you can probably ignore them. See #310 for more details.

ksdvishnukumar commented 1 year ago

Sure, I will check it out, @sdicont. Thanks for the input and the direction to proceed.

ksdvishnukumar commented 1 year ago

@sdicont, yeah, you are right. The client is able to reconnect, so this looks like an informational message. But the package still emits it at the Error log level.

Maybe the log level can be changed to Info, and this issue can be closed. @edenhill @mhowlett

ksdvishnukumar commented 1 year ago

I'm closing it; we can probably take up the log level change as a separate task.

@sdicont @edenhill @mhowlett

ksdvishnukumar commented 1 year ago

The producer also gets the same error.

Comment from @edenhill "@naveen-ilink log.connection.close: false will only suppress the individual broker disconnect logs, not the ALL_BROKERS_DOWN event/log.

I suggest registering an error delegate (error_cb in C) to handle and ignore the ALL_BROKERS_DOWN event, the error callback will also suppress the log message.

Side note: socket keepalives operate on the TCP layer, not application (Kafka) layer, so they will not help with the Kafka broker's idle connection reaper."

https://github.com/confluentinc/confluent-kafka-dotnet/issues/310#issuecomment-325094257
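In the .NET client, edenhill's suggestion (an `error_cb` in C) corresponds to installing an error handler on the builder. A minimal sketch, assuming the `config` from above; the filtering logic and log message are illustrative, not prescribed by the library:

```csharp
using System;
using Confluent.Kafka;

var consumer = new ConsumerBuilder<string, string>(config)
    .SetErrorHandler((_, error) =>
    {
        // ALL_BROKERS_DOWN fires once every connection has been closed, e.g. by
        // the broker's idle-connection reaper; the client reconnects on demand,
        // so for Event Hubs this event can usually be ignored. Registering the
        // handler also suppresses the corresponding log message.
        if (error.Code == ErrorCode.Local_AllBrokersDown)
            return;

        Console.WriteLine($"Kafka error: {error.Reason}");
    })
    .Build();
```

The same pattern applies to `ProducerBuilder`, which is relevant since the producer logs the identical error.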

Keeping the identical error log message for both the producer and the consumer in one place for ease of access.