Open: aragnon opened this issue 8 years ago
@aragnon I'd suggest using the kafka-topics command to determine the current state of the replicas for the _schemas topic (or whatever topic you are using, if you changed the kafkastore.topic setting). Here's how I used it on a local set of test services:
$ ./bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic _schemas
Topic:_schemas PartitionCount:1 ReplicationFactor:1 Configs:cleanup.policy=compact
Topic: _schemas Partition: 0 Leader: 0 Replicas: 0 Isr: 0
The root cause exception (NotEnoughReplicasException) indicates that the number of in-sync replicas for the topic partition has fallen below min.insync.replicas, so the output from this command should show you the difference between the full list of replicas and the in-sync replicas (labeled Isr in the output). From there you can determine why one or more of the brokers are falling behind.
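To see which min.insync.replicas value is actually in effect, you can also inspect the topic-level override and the broker's dynamic config (the paths and broker id here are illustrative; a static default set in server.properties is not shown by these commands):
$ ./bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name _schemas --describe
$ ./bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type brokers --entity-name 0 --describe
A topic-level min.insync.replicas overrides the broker default, so check both before concluding which threshold the producer is failing against.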
In my case I can see that the topic has the required ISR, but schema-registry still fails to start:
root@kafka-0:/# kafka-topics --zookeeper zookeeper-0.zookeeper --describe --topic _schemas
Topic:_schemas PartitionCount:1 ReplicationFactor:1 Configs:cleanup.policy=compact
Topic: _schemas Partition: 0 Leader: 5 Replicas: 5 Isr: 5
[2019-08-14 11:37:07,398] WARN The configuration 'topic.replication.factor' was supplied but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig)
[2019-08-14 11:37:07,398] WARN The configuration 'init.timeout.ms' was supplied but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig)
[2019-08-14 11:37:07,398] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2019-08-14 11:37:07,398] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2019-08-14 11:37:07,414] INFO Cluster ID: Q1Bp2MYfRQCTuMfOhctLXg (org.apache.kafka.clients.Metadata)
[2019-08-14 11:37:07,416] INFO Initialized last consumed offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,418] INFO [kafka-store-reader-thread-_schemas]: Starting (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,483] INFO Cluster ID: Q1Bp2MYfRQCTuMfOhctLXg (org.apache.kafka.clients.Metadata)
[2019-08-14 11:37:07,488] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=schema-registry-schema-registry-7454c4f585-rvhzv-8081] Resetting offset for partition _schemas-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2019-08-14 11:37:07,513] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:203)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:63)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:41)
at io.confluent.rest.Application.createServer(Application.java:169)
at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:139)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:201)
... 4 more
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:424)
at io.confluent.kafka.schemaregistry.storage.KafkaStore.waitUntilKafkaReaderReachesLastOffset(KafkaStore.java:277)
at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:137)
... 5 more
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:94)
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:77)
at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:29)
at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:419)
... 7 more
Caused by: org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
[2019-08-14 11:37:07,515] INFO Shutting down schema registry (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2019-08-14 11:37:07,516] INFO [kafka-store-reader-thread-_schemas]: Shutting down (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,516] INFO [kafka-store-reader-thread-_schemas]: Shutdown completed (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,516] INFO [kafka-store-reader-thread-_schemas]: Stopped (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,520] INFO KafkaStoreReaderThread shutdown complete. (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,520] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
Removing the topic and restarting schema-registry solved the issue in my case.
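For anyone else landing here: deleting the topic can be done with kafka-topics as well (this assumes delete.topic.enable=true on the brokers; note that all schemas stored in the topic are lost, so this is only safe on a fresh deployment):
$ kafka-topics --zookeeper zookeeper-0.zookeeper --delete --topic _schemas
Schema Registry recreates the topic on its next start.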
Excellent @vaisov! It also worked for me. I'll add that my brokers run on top of docker-compose in swarm mode, so if anybody has the same problem, this approach can fix the issue. Thank you!
Schema Registry should not use the number of live brokers to set the topic's replication factor; that is what leads to this issue. If only one broker happens to be up when the topic is first created, the topic ends up with replication factor 1, which can later violate min.insync.replicas once the brokers require more:
// io.confluent.kafka.schemaregistry.storage.KafkaStore#createSchemaTopic
// Count the brokers that are currently reachable.
int numLiveBrokers = admin.describeCluster().nodes()
    .get(initTimeout, TimeUnit.MILLISECONDS).size();
if (numLiveBrokers <= 0) {
  throw new StoreInitializationException("No live Kafka brokers");
}
// The replication factor is capped at the number of live brokers, so a
// partially started cluster creates the topic with fewer replicas than configured.
int schemaTopicReplicationFactor = Math.min(numLiveBrokers, desiredReplicationFactor);
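One way to avoid that code path entirely (a sketch, assuming the default _schemas topic name and a three-broker cluster) is to pre-create the topic with the desired replication factor before Schema Registry first starts, so it never has to pick a factor itself:
$ kafka-topics --zookeeper zookeeper-0.zookeeper --create --topic _schemas \
    --partitions 1 --replication-factor 3 --config cleanup.policy=compact
If the topic already exists with too few replicas, kafka-reassign-partitions can raise the replication factor in place instead of deleting the topic.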
I get the below exception (after which schema-registry exits), but the exception prints neither the number of required replicas nor the number of in-sync replicas. This would essentially force me to build schema-registry from source in order to find those values, which is not something you should require your users to go through.
In the meantime, I would appreciate a hint as to what's wrong.
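For what it's worth, the closest I can get to those values from the outside is comparing the Replicas and Isr columns of kafka-topics --describe against the effective min.insync.replicas from kafka-configs; on Kafka 2.3+ the offending partitions can even be listed directly (broker address illustrative):
$ kafka-topics --bootstrap-server localhost:9092 --describe --under-min-isr-partitions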