confluentinc / schema-registry

Confluent Schema Registry for Kafka
https://docs.confluent.io/current/schema-registry/docs/index.html

Schema registry fails to start or is unable to register schema when Kafka restarts #1198

Open · swathimocharla19 opened 5 years ago

swathimocharla19 commented 5 years ago

Hi, we are intermittently seeing that when Kafka is restarted, Schema Registry either fails to start or is unable to register schemas with the Kafka store.

16T08:54:00.931Z", "timezone":"UTC", "log":"kafka-store-reader-thread-_schemas - io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread - [kafka-store-reader-thread-_schemas]: Starting"}
{"type":"log", "host":"pranayschema-ckaf-schema-registry-0", "level":"ERROR", "neid":"schema-registry-a6aab87097d14ff5b192d56f1d73ff1a", "system":"schema-registry", "time":"2019-07-16T08:55:01.130Z", "timezone":"UTC", "log":"main - io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication - Error starting the schema registry"}
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:210)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:61)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:72)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:39)
    at io.confluent.rest.Application.createServer(Application.java:201)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:41)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:137)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:208)
    ... 5 more
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreException: Failed to write Noop record to kafka store.
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:422)
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.waitUntilKafkaReaderReachesLastOffset(KafkaStore.java:275)
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:135)
    ... 6 more
Caused by: java.util.concurrent.TimeoutException: Timeout after waiting for 60000 ms.
    at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:78)
    at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.getLatestOffset(KafkaStore.java:417)
    ... 8 more
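
The 60000 ms in the TimeoutException appears to match the default of kafkastore.init.timeout.ms, which bounds how long KafkaStore.init() waits for the noop write and for the reader to catch up. As a mitigation while the root cause is investigated (this does not fix a stuck reader), the init timeout can be raised in schema-registry.properties; the values below are illustrative only:

    # Illustrative values; the defaults are 60000 ms and 500 ms respectively.
    kafkastore.init.timeout.ms=120000
    kafkastore.timeout.ms=1000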

From the Schema Registry code, it seems the background thread that runs doWork() and acts as the consumer is not refreshing its state, so SR appears to be trying to connect to Kafka through that stale state.
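
For context, here is a minimal sketch of the pattern that is timing out, assuming simplified signatures (this is not the actual KafkaStore code, and the topic name and empty payload are placeholders): init() produces a noop marker to _schemas, blocks on the producer future, and then waits for the reader thread to reach that offset, so a reader stuck on stale state stalls the whole sequence.

    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.clients.producer.RecordMetadata;

    // Sketch only: write a noop record and block until the broker acks it.
    // The get() below is the call that throws the TimeoutException in the trace above.
    public final class NoopOffsetProbe {
        static long probeLatestOffset(KafkaProducer<byte[], byte[]> producer,
                                      long timeoutMs) throws Exception {
            ProducerRecord<byte[], byte[]> noop =
                new ProducerRecord<>("_schemas", new byte[0], new byte[0]); // placeholder payload
            RecordMetadata md = producer.send(noop).get(timeoutMs, TimeUnit.MILLISECONDS);
            return md.offset(); // init() then waits for the reader thread to reach this offset
        }
    }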

In another instance, SR gets stuck in this state:

{"type":"log", "host":"pranayschema-ckaf-schema-registry-0", "level":"INFO", "neid":"schema-registry-9c93b10e257040268ff5d018544de95f", "system":"schema-registry", "time":"2019-07-17T13:21:02.991Z", "timezone":"UTC", "log":"kafka-store-reader-thread-_schemas - io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread - [kafka-store-reader-thread-_schemas]: Starting"}

Here, it seems the isRunning flag is still set, yet the reader thread never enters doWork().

There needs to be some check that interrupts the KafkaStoreReaderThread in this situation so the connection can be re-established; a sketch of one possible shape follows.
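
A hypothetical watchdog along those lines (KafkaStoreReaderThread exposes no such hook today; madeProgress and restart below are illustrative stand-ins for a real progress signal and re-initialization path):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.function.BooleanSupplier;

    // Hypothetical: periodically verify the reader thread is making progress;
    // if it is not, interrupt it so a fresh thread and connection can be set up.
    public final class ReaderWatchdog {
        private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

        public void watch(Thread readerThread, BooleanSupplier madeProgress, Runnable restart) {
            scheduler.scheduleAtFixedRate(() -> {
                if (!madeProgress.getAsBoolean()) {
                    readerThread.interrupt(); // break out of a blocked poll on stale state
                    restart.run();            // re-create the reader with fresh cluster metadata
                }
            }, 60, 60, TimeUnit.SECONDS);
        }
    }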

kreuzman commented 4 years ago

We are experiencing exactly the same issue.