confluentinc / schema-registry

Confluent Schema Registry for Kafka
https://docs.confluent.io/current/schema-registry/docs/index.html

NotEnoughReplicasException #313

Open aragnon opened 8 years ago

aragnon commented 8 years ago

I get the exception below (after which the schema-registry exits), but the exception prints neither the number of required replicas nor the number of in-sync replicas. That would essentially force me to build the schema-registry from source just to find those values, which is not something you should require your users to go through.

In the meanwhile, I would appreciate a hint as to what's wrong.

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/java/confluent-common/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/java/schema-registry/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[2016-04-08 14:57:19,115] INFO SchemaRegistryConfig values:
        metric.reporters = []
        kafkastore.connection.url = 127.0.0.1:2181
        avro.compatibility.level = backward
        debug = true
        shutdown.graceful.ms = 1000
        response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json]
        kafkastore.commit.interval.ms = -1
        response.mediatype.default = application/vnd.schemaregistry.v1+json
        kafkastore.topic = _schemas
        metrics.jmx.prefix = kafka.schema.registry
        access.control.allow.origin =
        port = 8081
        request.logger.name = io.confluent.rest-utils.requests
        metrics.sample.window.ms = 30000
        kafkastore.zk.session.timeout.ms = 30000
        master.eligibility = true
        kafkastore.topic.replication.factor = 1
        kafkastore.timeout.ms = 500
        host.name = kafka-dev
        schema.registry.zk.namespace = schema_registry
        kafkastore.init.timeout.ms = 60000
        metrics.num.samples = 2
 (io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig:135)
[2016-04-08 14:57:19,599] INFO Initialized the consumer offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread:86)
[2016-04-08 14:57:20,034] INFO [kafka-store-reader-thread-_schemas], Starting  (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread:68)
[2016-04-08 14:57:20,156] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication:57)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:166)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:55)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:37)
        at io.confluent.rest.Application.createServer(Application.java:109)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:155)
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:164)
        ... 4 more
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:367)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.waitUntilKafkaReaderReachesLastOffset(KafkaStore.java:224)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:153)
        ... 5 more
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:56)
        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:51)
        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:25)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:363)
        ... 7 more
Caused by: org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
ewencp commented 8 years ago

@aragnon I'd suggest using the kafka-topics command to determine the current state of replicas for the _schemas topic (or whatever topic you are using if you changed the kafkastore.topic setting). Here's how I used it on a local set of test services:

$ ./bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic _schemas
Topic:_schemas  PartitionCount:1    ReplicationFactor:1 Configs:cleanup.policy=compact
    Topic: _schemas Partition: 0    Leader: 0   Replicas: 0 Isr: 0

The root-cause exception (NotEnoughReplicasException) indicates that the number of in-sync replicas for the topic partition has fallen below min.insync.replicas, so the output of this command should show you the difference between the full list of replicas and the in-sync replicas (labeled Isr in the output). From there you can determine why one or more brokers are falling behind.
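Since min.insync.replicas can be set as a topic-level override or as a broker-wide default, it's worth checking both when comparing against the Isr list. A quick sketch, assuming the stock CLI tools, a local ZooKeeper, and an illustrative broker config path:

$ ./bin/kafka-configs.sh --zookeeper localhost:2181 \
    --entity-type topics --entity-name _schemas --describe
# If no topic override is shown, the broker default applies; check the broker config file:
$ grep min.insync.replicas /path/to/server.properties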

vaisov commented 5 years ago

In my case I can see that the topic has the required ISR, but schema-registry still fails to start:

root@kafka-0:/# kafka-topics --zookeeper zookeeper-0.zookeeper --describe --topic _schemas
Topic:_schemas  PartitionCount:1        ReplicationFactor:1     Configs:cleanup.policy=compact
        Topic: _schemas Partition: 0    Leader: 5       Replicas: 5     Isr: 5
[2019-08-14 11:37:07,398] WARN The configuration 'topic.replication.factor' was supplied but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig)
[2019-08-14 11:37:07,398] WARN The configuration 'init.timeout.ms' was supplied but isn't a known config. (org.apache.kafka.clients.consumer.ConsumerConfig)
[2019-08-14 11:37:07,398] INFO Kafka version : 2.0.0-cp1 (org.apache.kafka.common.utils.AppInfoParser)
[2019-08-14 11:37:07,398] INFO Kafka commitId : 4b1dd33f255ddd2f (org.apache.kafka.common.utils.AppInfoParser)
[2019-08-14 11:37:07,414] INFO Cluster ID: Q1Bp2MYfRQCTuMfOhctLXg (org.apache.kafka.clients.Metadata)
[2019-08-14 11:37:07,416] INFO Initialized last consumed offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,418] INFO [kafka-store-reader-thread-_schemas]: Starting (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,483] INFO Cluster ID: Q1Bp2MYfRQCTuMfOhctLXg (org.apache.kafka.clients.Metadata)
[2019-08-14 11:37:07,488] INFO [Consumer clientId=KafkaStore-reader-_schemas, groupId=schema-registry-schema-registry-7454c4f585-rvhzv-8081] Resetting offset for partition _schemas-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher)
[2019-08-14 11:37:07,513] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:203)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:63)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:41)
        at io.confluent.rest.Application.createServer(Application.java:169)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:139)
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:201)
        ... 4 more
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:424)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.waitUntilKafkaReaderReachesLastOffset(KafkaStore.java:277)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:137)
        ... 5 more
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:94)
        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:77)
        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:29)
        at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:419)
        ... 7 more
Caused by: org.apache.kafka.common.errors.NotEnoughReplicasException: Messages are rejected since there are fewer in-sync replicas than required.
[2019-08-14 11:37:07,515] INFO Shutting down schema registry (io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry)
[2019-08-14 11:37:07,516] INFO [kafka-store-reader-thread-_schemas]: Shutting down (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,516] INFO [kafka-store-reader-thread-_schemas]: Shutdown completed (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,516] INFO [kafka-store-reader-thread-_schemas]: Stopped (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,520] INFO KafkaStoreReaderThread shutdown complete. (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread)
[2019-08-14 11:37:07,520] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. (org.apache.kafka.clients.producer.KafkaProducer)
vaisov commented 5 years ago

Removing the topic and restarting schema-registry solved the issue in my case.
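For reference, assuming delete.topic.enable=true on the brokers and the default kafkastore.topic, the steps were roughly as follows. Note that deleting _schemas discards every registered schema, so only do this if you can re-register them:

$ kafka-topics --zookeeper zookeeper-0.zookeeper --delete --topic _schemas
# then restart schema-registry; on startup it recreates the topic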

sebasPignataro commented 3 years ago

Excellent @vaisov! It worked for me too. I'll add that my brokers run on top of docker-compose in swarm mode, so if anybody has the same problem there, this approach can fix the issue. Thank you!

alexwxy commented 1 year ago

Schema Registry should not use the number of live brokers to cap the topic's replication factor; that behavior is what leads to this issue:

// io.confluent.kafka.schemaregistry.storage.KafkaStore#createSchemaTopic
    // Count the brokers currently registered in the cluster.
    int numLiveBrokers = admin.describeCluster().nodes()
        .get(initTimeout, TimeUnit.MILLISECONDS).size();
    if (numLiveBrokers <= 0) {
      throw new StoreInitializationException("No live Kafka brokers");
    }

    // The configured replication factor is silently capped at the number of
    // live brokers, so a topic created while brokers are down can end up
    // permanently under-replicated relative to min.insync.replicas.
    int schemaTopicReplicationFactor = Math.min(numLiveBrokers, desiredReplicationFactor);
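One way to sidestep that heuristic is to pre-create the store topic yourself before the first start, so Schema Registry never has to guess. A sketch with illustrative replica counts for a 3-broker cluster (the ZooKeeper address matches the earlier example; adjust to your deployment):

$ kafka-topics --zookeeper zookeeper-0.zookeeper --create --topic _schemas \
    --partitions 1 --replication-factor 3 \
    --config cleanup.policy=compact --config min.insync.replicas=2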