confluentinc / confluent-kubernetes-examples

Example scenario workflows for Confluent for Kubernetes
Apache License 2.0

Schemaregistry ERROR Timed out waiting for join group to complete #306

Closed knut-bw closed 3 months ago

knut-bw commented 3 months ago

During deployment, all component pods (connect, kafka, kafkarestproxy, ksqldb, zookeeper) are running normally, except for Control Center and Schema Registry, which both have issues. Schema Registry log:

[INFO] 2024-06-13 03:32:22,077 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator markCoordinatorUnknown - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] Group coordinator kafka-0.kafka.fcts.svc.cluster.local:9092 (id: 2147483647 rack: null) is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false. Rediscovery will be attempted.
[INFO] 2024-06-13 03:32:22,077 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator markCoordinatorUnknown - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] Requesting disconnect from last known coordinator kafka-0.kafka.fcts.svc.cluster.local:9092 (id: 2147483647 rack: null)
[INFO] 2024-06-13 03:32:22,178 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator onSuccess - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] Discovered group coordinator kafka-0.kafka.fcts.svc.cluster.local:9092 (id: 2147483647 rack: null)
[INFO] 2024-06-13 03:32:22,178 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator sendJoinGroupRequest - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] (Re-)joining group
[INFO] 2024-06-13 03:32:22,178 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator metadata - Updating metadata
[INFO] 2024-06-13 03:32:22,182 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator handle - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] Successfully joined group with generation Generation{generationId=1482, memberId='sr-1-84ae3146-035e-47bb-8ce9-855ee338df45', protocol='v0'}
[INFO] 2024-06-13 03:32:22,182 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator onLeaderElected - Performing assignment
[INFO] 2024-06-13 03:32:22,182 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator onLeaderElected - Member information: {sr-1-84ae3146-035e-47bb-8ce9-855ee338df45=version=1,host=schemaregistry-0.schemaregistry.fcts.svc.cluster.local,port=8081,scheme=http,leaderEligibility=true,isLeader=false}
[INFO] 2024-06-13 03:32:22,182 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator onLeaderElected - Assignment: Assignment{version=1, error=0, leader='sr-1-84ae3146-035e-47bb-8ce9-855ee338df45', leaderIdentity=version=1,host=schemaregistry-0.schemaregistry.fcts.svc.cluster.local,port=8081,scheme=http,leaderEligibility=true,isLeader=false}
[INFO] 2024-06-13 03:32:22,183 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator handle - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] SyncGroup failed: The coordinator is not available. Marking coordinator unknown. Sent generation was Generation{generationId=1482, memberId='sr-1-84ae3146-035e-47bb-8ce9-855ee338df45', protocol='v0'}
[INFO] 2024-06-13 03:32:22,183 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator markCoordinatorUnknown - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] Group coordinator kafka-0.kafka.fcts.svc.cluster.local:9092 (id: 2147483647 rack: null) is unavailable or invalid due to cause: error response COORDINATOR_NOT_AVAILABLE. isDisconnected: false. Rediscovery will be attempted.
[INFO] 2024-06-13 03:32:22,183 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator markCoordinatorUnknown - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] Requesting disconnect from last known coordinator kafka-0.kafka.fcts.svc.cluster.local:9092 (id: 2147483647 rack: null)
[INFO] 2024-06-13 03:32:22,183 [pool-3-thread-1] io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator requestRejoin - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts] Request joining group due to: rebalance failed due to 'The coordinator is not available.' (CoordinatorNotAvailableException)
[ERROR] 2024-06-13 03:32:22,185 [main] io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication initSchemaRegistry - Error starting the schema registry
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.electLeader(KafkaSchemaRegistry.java:434)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:414)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:77)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:103)
    at io.confluent.rest.Application.configureHandler(Application.java:324)
    at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:212)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:44)
Caused by: io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryTimeoutException: Timed out waiting for join group to complete
    at io.confluent.kafka.schemaregistry.leaderelector.kafka.KafkaGroupLeaderElector.init(KafkaGroupLeaderElector.java:214)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.electLeader(KafkaSchemaRegistry.java:429)
    ... 7 more

From the logs, I can see that it keeps trying to connect to the group coordinator, but I don't know why it can't. I have no idea what's wrong.
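The retry loop in a log like this is easier to see once the entries are tallied by method and by the reported coordinator failure cause. The following is a minimal sketch, not part of the original report: `ENTRY`, `summarize`, and `SAMPLE` are illustrative names, and the regex assumes the exact log layout shown above (`[LEVEL] date time [thread] logger method - message`). `SAMPLE` contains two entries copied from the log.

```python
import re
from collections import Counter

# Header of one log entry, assuming the layout seen in the log above:
# "[LEVEL] yyyy-mm-dd hh:mm:ss,SSS [thread] logger method - "
ENTRY = re.compile(
    r"\[(?P<level>INFO|WARN|ERROR)\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<logger>\S+) (?P<method>\S+) - "
)

def summarize(log_text):
    """Count entries per (level, method) and tally coordinator failure causes."""
    # Collapse whitespace so entries wrapped across lines are rejoined.
    text = " ".join(log_text.split())
    matches = list(ENTRY.finditer(text))
    methods = Counter()
    causes = Counter()
    for i, m in enumerate(matches):
        # The message runs until the next entry header (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = text[m.end():end]
        methods[(m["level"], m["method"])] += 1
        cause = re.search(r"due to cause: (.+?)\. isDisconnected", message)
        if cause:
            causes[cause.group(1)] += 1
    return methods, causes

# Two entries copied from the log above, wrapped only at existing spaces.
SAMPLE = """
[INFO] 2024-06-13 03:32:22,077 [pool-3-thread-1]
io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator
markCoordinatorUnknown - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts]
Group coordinator kafka-0.kafka.fcts.svc.cluster.local:9092 (id: 2147483647 rack: null)
is unavailable or invalid due to cause: coordinator unavailable. isDisconnected: false.
Rediscovery will be attempted.
[INFO] 2024-06-13 03:32:22,183 [pool-3-thread-1]
io.confluent.kafka.schemaregistry.leaderelector.kafka.SchemaRegistryCoordinator
markCoordinatorUnknown - [Schema registry clientId=sr-1, groupId=id_schemaregistry_fcts]
Group coordinator kafka-0.kafka.fcts.svc.cluster.local:9092 (id: 2147483647 rack: null)
is unavailable or invalid due to cause: error response COORDINATOR_NOT_AVAILABLE.
isDisconnected: false. Rediscovery will be attempted.
"""

if __name__ == "__main__":
    methods, causes = summarize(SAMPLE)
    for cause, count in causes.most_common():
        print(count, cause)
```

Running it over the full Schema Registry log, rather than the two-entry sample, would show how often each stage of the discover, join, and sync cycle repeats before the startup timeout.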