Open ShowMeYourNoobs opened 4 years ago
Have you set the registry containers to depend on the startup of zookeeper & kafka containers?
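For context, a dependency like that is declared with `depends_on` in the compose file. Keep in mind that `depends_on` only controls start *order* (the dependency's container is started first, not necessarily ready), and that `docker stack deploy` in swarm mode ignores `depends_on` entirely, which is why in-container wait scripts are still needed. A sketch, with illustrative service names:

```yaml
# Sketch of a compose-file dependency chain (service names are illustrative).
# Note: `docker stack deploy` (swarm mode) ignores `depends_on`, so an
# in-container wait script is still required there.
services:
  zookeeper1:
    image: confluentinc/cp-zookeeper:5.1.3
  kafka1:
    image: confluentinc/cp-kafka:5.1.3
    depends_on:
      - zookeeper1
  schema-registry1:
    image: confluentinc/cp-schema-registry:5.1.3
    depends_on:
      - kafka1
```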
Yes. Kafka containers depend on the ZooKeeper containers, and the schema registry containers depend on the Kafka containers. Here is the start script that we call from the Dockerfile (in the CMD instruction) of the schema registry:
```shell
#!/bin/bash
wait-for-it -h kafka1 -p 9092 -t 0
wait-for-it -h kafka2 -p 9092 -t 0
wait-for-it -h kafka3 -p 9092 -t 0
cub kafka-ready -b kafka1:9092 1 120
cub kafka-ready -b kafka2:9092 1 120
cub kafka-ready -b kafka3:9092 1 120
echo "OK. All Kafkas are ready"
./etc/confluent/docker/run
```
I use the wait-for-it script to check whether the Kafka container is up, and the cub command to check whether the Kafka broker itself is up and running.
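Worth noting: wait-for-it only verifies that the TCP port accepts connections, not that the broker behind it is ready, which is why the separate `cub kafka-ready` check matters. A minimal Python sketch of the same port-probing idea (the function name and defaults are my own, not taken from wait-for-it):

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll host:port until a TCP connection succeeds or the timeout expires.

    This mirrors what wait-for-it does: it proves only that something is
    listening on the socket, not that the service behind it is healthy.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True  # the port accepted a TCP connection
        except OSError:
            time.sleep(interval)  # nothing listening yet; retry
    return False
```

A readiness check such as `cub kafka-ready` goes further than this: it opens an AdminClient and confirms the expected number of brokers have registered, which is the part that actually failed in the log below.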
Any updates?
It was some time ago, but as far as I remember we had a problem in one or more of our wait-for-it scripts. I don't know exactly what the problem was, perhaps a misconfiguration or something else. After fixing it, everything worked fine every time we deployed.
Hello,
I have configured a Confluent cluster in Docker containers consisting of 3 ZooKeepers, 3 Kafka brokers, and 3 schema registries. Sometimes, on deployment, all 3 schema registries fail to start. The problem appears very rarely, and only when deploying the cluster (docker stack, using a yml file). When it does appear, the fix is to redeploy the cluster, after which everything works again. I should note that we use version 5.1.3 of the confluentinc images. In the log below you can see the several errors that appear. Here is the log I get from one of the schema registries (all 3 of them have the same log):
```
wait-for-it: waiting for kafka1:9092 without a timeout
wait-for-it: kafka1:9092 is available after 0 seconds
wait-for-it: waiting for kafka2:9092 without a timeout
wait-for-it: kafka2:9092 is available after 0 seconds
wait-for-it: waiting for kafka3:9092 without a timeout
wait-for-it: kafka3:9092 is available after 0 seconds
[main] INFO org.apache.kafka.clients.admin.AdminClientConfig - AdminClientConfig values:
    bootstrap.servers = [kafka1:9092], client.dns.lookup = default, client.id = ,
    connections.max.idle.ms = 300000, metadata.max.age.ms = 300000, metric.reporters = [],
    metrics.num.samples = 2, metrics.recording.level = INFO, metrics.sample.window.ms = 30000,
    receive.buffer.bytes = 65536, reconnect.backoff.max.ms = 1000, reconnect.backoff.ms = 50,
    request.timeout.ms = 120000, retries = 5, retry.backoff.ms = 100,
    sasl.client.callback.handler.class = null, sasl.jaas.config = null,
    sasl.kerberos.kinit.cmd = /usr/bin/kinit, sasl.kerberos.min.time.before.relogin = 60000,
    sasl.kerberos.service.name = null, sasl.kerberos.ticket.renew.jitter = 0.05,
    sasl.kerberos.ticket.renew.window.factor = 0.8, sasl.login.callback.handler.class = null,
    sasl.login.class = null, sasl.login.refresh.buffer.seconds = 300,
    sasl.login.refresh.min.period.seconds = 60, sasl.login.refresh.window.factor = 0.8,
    sasl.login.refresh.window.jitter = 0.05, sasl.mechanism = GSSAPI,
    security.protocol = PLAINTEXT, send.buffer.bytes = 131072, ssl.cipher.suites = null,
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1],
    ssl.endpoint.identification.algorithm = https, ssl.key.password = null,
    ssl.keymanager.algorithm = SunX509, ssl.keystore.location = null,
    ssl.keystore.password = null, ssl.keystore.type = JKS, ssl.protocol = TLS,
    ssl.provider = null, ssl.secure.random.implementation = null,
    ssl.trustmanager.algorithm = PKIX, ssl.truststore.location = null,
    ssl.truststore.password = null, ssl.truststore.type = JKS
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 2.1.1-cp3
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : c119bfeabe023775
[... identical AdminClientConfig dumps and version lines repeated for
     bootstrap.servers = [kafka2:9092] and bootstrap.servers = [kafka3:9092] ...]
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call.
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
    at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
    at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
    at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
    at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:262)
    at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
    at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
[main] ERROR io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Brokers found [].
OK. All Kafkas are ready
===> ENV Variables ...
ALLOW_UNSIGNED=false
COMPONENT=schema-registry
CONFLUENT_DEB_VERSION=1
CONFLUENT_MAJOR_VERSION=5
CONFLUENT_MINOR_VERSION=1
CONFLUENT_MVN_LABEL=
CONFLUENT_PATCH_VERSION=3
CONFLUENT_PLATFORM_LABEL=
CONFLUENT_VERSION=5.1.3
CUB_CLASSPATH=/etc/confluent/docker/docker-utils.jar
HOME=/root
HOSTNAME=704d321e9067
KAFKA_VERSION=2.1.1cp3
LANG=C.UTF-8
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
PYTHON_PIP_VERSION=8.1.2
PYTHON_VERSION=2.7.9-1
SCALA_VERSION=2.11
SCHEMA_REGISTRY_AVRO_COMPATIBILITY_LEVEL=none
SCHEMA_REGISTRY_HOST_NAME=schema-registry2
SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS=PLAINTEXT://kafka1:9092,PLAINTEXT://kafka2:9092,PLAINTEXT://kafka3:9092
SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081
SCHEMA_REGISTRY_ZK_NAMESPACE=sr
SHLVL=2
ZULU_OPENJDKVERSION=8=8.30.0.1
_=/usr/bin/env
uid=0(root) gid=0(root) groups=0(root)
===> Configuring ...
===> Running preflight checks ...
===> Check if Kafka is healthy ...
[main] INFO org.apache.kafka.clients.admin.AdminClientConfig - AdminClientConfig values:
    bootstrap.servers = [PLAINTEXT://kafka1:9092, PLAINTEXT://kafka2:9092, PLAINTEXT://kafka3:9092]
    [... remaining AdminClientConfig values identical to the dump above ...]
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version : 2.1.1-cp3
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId : c119bfeabe023775
===> Launching ...
===> Launching schema-registry ...
[2019-12-05 07:29:35,054] INFO SchemaRegistryConfig values:
    resource.extension.class = [], metric.reporters = [],
    kafkastore.sasl.kerberos.kinit.cmd = /usr/bin/kinit,
    response.mediatype.default = application/vnd.schemaregistry.v1+json,
    resource.extension.classes = [], kafkastore.ssl.trustmanager.algorithm = PKIX,
    inter.instance.protocol = http, authentication.realm = , ssl.keystore.type = JKS,
    kafkastore.topic = _schemas, metrics.jmx.prefix = kafka.schema.registry,
    kafkastore.ssl.enabled.protocols = TLSv1.2,TLSv1.1,TLSv1,
    kafkastore.topic.replication.factor = 3, ssl.truststore.password = [hidden],
    kafkastore.timeout.ms = 500, host.name = schema-registry2,
    kafkastore.bootstrap.servers = [PLAINTEXT://kafka1:9092, PLAINTEXT://kafka2:9092, PLAINTEXT://kafka3:9092],
    schema.registry.zk.namespace = schemaregistry,
    kafkastore.sasl.kerberos.ticket.renew.window.factor = 0.8,
    kafkastore.sasl.kerberos.service.name = , schema.registry.resource.extension.class = [],
    ssl.endpoint.identification.algorithm = , compression.enable = true,
    kafkastore.ssl.truststore.type = JKS, avro.compatibility.level = none,
    kafkastore.ssl.protocol = TLS, kafkastore.ssl.provider = ,
    kafkastore.ssl.truststore.location = ,
    response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json],
    kafkastore.ssl.keystore.type = JKS, authentication.skip.paths = [],
    ssl.truststore.type = JKS, websocket.servlet.initializor.classes = [],
    kafkastore.ssl.truststore.password = [hidden], access.control.allow.origin = ,
    ssl.truststore.location = , ssl.keystore.password = [hidden], port = 8081,
    access.control.allow.headers = , kafkastore.ssl.keystore.location = ,
    metrics.tag.map = {}, master.eligibility = true, ssl.client.auth = false,
    kafkastore.ssl.keystore.password = [hidden], rest.servlet.initializor.classes = [],
    websocket.path.prefix = /ws, kafkastore.security.protocol = PLAINTEXT,
    ssl.trustmanager.algorithm = , authentication.method = NONE,
    request.logger.name = io.confluent.rest-utils.requests, ssl.key.password = [hidden],
    kafkastore.zk.session.timeout.ms = 30000, kafkastore.sasl.mechanism = GSSAPI,
    kafkastore.sasl.kerberos.ticket.renew.jitter = 0.05,
    kafkastore.ssl.key.password = [hidden], zookeeper.set.acl = false,
    schema.registry.inter.instance.protocol = , authentication.roles = [*],
    metrics.num.samples = 2, ssl.protocol = TLS,
    schema.registry.group.id = schema-registry,
    kafkastore.ssl.keymanager.algorithm = SunX509, kafkastore.connection.url = ,
    debug = false, listeners = [http://0.0.0.0:8081], kafkastore.group.id = ,
    ssl.provider = , ssl.enabled.protocols = [], shutdown.graceful.ms = 1000,
    ssl.keystore.location = , ssl.cipher.suites = [],
    kafkastore.ssl.endpoint.identification.algorithm = ,
    kafkastore.ssl.cipher.suites = , access.control.allow.methods = ,
    kafkastore.sasl.kerberos.min.time.before.relogin = 60000,
    ssl.keymanager.algorithm = , metrics.sample.window.ms = 30000,
    kafkastore.init.timeout.ms = 60000
    (io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig)
[2019-12-05 07:29:35,100] INFO Logging initialized @353ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2019-12-05 07:29:35,373] INFO Initializing KafkaStore with broker endpoints: PLAINTEXT://kafka1:9092,PLAINTEXT://kafka2:9092,PLAINTEXT://kafka3:9092 (io.confluent.kafka.schemaregistry.storage.KafkaStore)
[2019-12-05 07:29:35,388] INFO AdminClientConfig values:
    bootstrap.servers = [PLAINTEXT://kafka1:9092, PLAINTEXT://kafka2:9092, PLAINTEXT://kafka3:9092]
    [... remaining values identical to the dump above ...]
    (org.apache.kafka.clients.admin.AdminClientConfig)
[2019-12-05 07:29:35,451] INFO Kafka version : 2.1.1-cp3 (org.apache.kafka.common.utils.AppInfoParser)
[2019-12-05 07:29:35,451] INFO Kafka commitId : 74487c928927908f (org.apache.kafka.common.utils.AppInfoParser)
[2019-12-05 07:34:35,640] ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:212)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:62)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:73)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:40)
    at io.confluent.rest.Application.createServer(Application.java:201)
    at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:42)
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: Timed out trying to create or validate schema topic configuration
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.createOrVerifySchemaTopic(KafkaStore.java:172)
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:114)
    at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:210)
    ... 5 more
Caused by: java.util.concurrent.TimeoutException
    at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:108)
    at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:274)
    at io.confluent.kafka.schemaregistry.storage.KafkaStore.createOrVerifySchemaTopic(KafkaStore.java:165)
    ... 7 more
```