confluentinc / schema-registry

Confluent Schema Registry for Kafka
https://docs.confluent.io/current/schema-registry/docs/index.html

org.apache.kafka.common.config.ConfigException: No bootstrap urls given in bootstrap.servers #230

Closed zoltan-fedor closed 8 years ago

zoltan-fedor commented 8 years ago

I have set up two Confluent Kafka servers, both with the Schema Registry, and when I started the Schema Registry on the first one, I got the following error:

org.apache.kafka.common.config.ConfigException: No bootstrap urls given in bootstrap.servers

It seems that at any given time I can only have one instance of the Schema Registry running within one Kafka cluster; the other one exits with the above error.

Any idea what might be wrong? This error message is pretty cryptic.

zoltan-fedor commented 8 years ago

Sorry, now it seems the issue went away. Interestingly, I had done a full reboot on both machines.

Still, it would be useful to understand this error message. What is the bootstrap url, and where is it set? I couldn't find it in the docs or in the code.

zoltan-fedor commented 8 years ago

Interestingly, this issue seems to come and go. Restarting the Schema Registry triggers this error only about 50% of the time. I tried changing timeouts, but that didn't help.

Does anybody have an idea how to improve on this?
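For reference, these are the kind of timeout settings I was experimenting with in schema-registry.properties (the values shown are the ones that appear in my startup log further down; adjusting them made no difference):

    kafkastore.timeout.ms=500
    kafkastore.init.timeout.ms=60000
    kafkastore.zk.session.timeout.ms=30000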

ewencp commented 8 years ago

@zoltan-fedor Internally, the schema registry uses a Kafka producer and consumer. The Kafka producer needs a set of bootstrap brokers to connect to the cluster. Since we already have the ZooKeeper connection info, we just read the list of bootstrap brokers out of ZooKeeper, where each Kafka broker registers itself.

The error message you're seeing indicates that the list it's getting from ZooKeeper is empty. If your Kafka cluster is up and running OK, I'm not sure why it wouldn't be able to generate this list properly. Do you have any more logs that might help clarify what's going on? Are the standard Kafka tools, like the console producer and consumer, working with your cluster?
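One quick way to check what the registry will see is to list the broker registrations directly in ZooKeeper. A sketch, assuming the zookeeper-shell tool from the Kafka/Confluent packaging and ZooKeeper on localhost:2181:

    # List the broker ids currently registered in ZooKeeper; an empty []
    # here means there is nothing to build bootstrap.servers from.
    echo "ls /brokers/ids" | /usr/bin/zookeeper-shell localhost:2181

    # Inspect one broker's registration (the host/port handed to clients);
    # the id 0 is just an example.
    echo "get /brokers/ids/0" | /usr/bin/zookeeper-shell localhost:2181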

zoltan-fedor commented 8 years ago

Thanks @ewencp, this helps a lot. I was testing the "Failed to write Noop record" issue (https://github.com/confluentinc/schema-registry/issues/232), so I kept restarting the whole cluster (2 servers, each with kafka, zookeeper, kafka-rest and schema-registry) and checking the schema registry log for the "noop" issue; that is when I kept seeing this bootstrap urls problem and had no idea where it was coming from.

Since then I have modified the schema-registry settings in multiple places, including the zookeeper url. Now, after your reply, I have tested it again by restarting close to a dozen times, but this bootstrap url issue didn't show up anymore (the "noop" error is still there, though).

I have a feeling that my earlier zookeeper urls might have been behind this issue (localhost + the ip of the other server), which would explain why it sometimes worked and sometimes didn't (depending on which zookeeper was the master?). Now both zookeeper urls are set to the full ip addresses and everything works fine. In any case, I can't reproduce it anymore, and thanks to your help I now understand where this error could come from and what to look for to resolve it if it occurs again.
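For reference, the relevant line in my schema-registry.properties now lists both ZooKeeper nodes by their full IPs (the same value shows up in the startup log below):

    # both ZooKeeper nodes by full IP, no localhost entries mixed in
    kafkastore.connection.url=10.206.148.154:2181,10.206.74.110:2181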

As I can't reproduce it anymore, I will close this issue for now.

Thanks again for the explanation!

zoltan-fedor commented 8 years ago

Well, it has happened again.

schema-registry log:

Fri Sep 18 23:43:24 UTC 2015: SLF4J: Class path contains multiple SLF4J bindings.
Fri Sep 18 23:43:24 UTC 2015: SLF4J: Found binding in [jar:file:/usr/share/java/confluent-common/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
Fri Sep 18 23:43:24 UTC 2015: SLF4J: Found binding in [jar:file:/usr/share/java/schema-registry/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
Fri Sep 18 23:43:24 UTC 2015: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Fri Sep 18 23:43:24 UTC 2015: SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Fri Sep 18 23:43:24 UTC 2015: [2015-09-18 23:43:24,270] INFO SchemaRegistryConfig values:
Fri Sep 18 23:43:24 UTC 2015: master.eligibility = true
Fri Sep 18 23:43:24 UTC 2015: port = 8081
Fri Sep 18 23:43:24 UTC 2015: kafkastore.timeout.ms = 500
Fri Sep 18 23:43:24 UTC 2015: kafkastore.init.timeout.ms = 60000
Fri Sep 18 23:43:24 UTC 2015: debug = false
Fri Sep 18 23:43:24 UTC 2015: request.logger.name = io.confluent.rest-utils.requests
Fri Sep 18 23:43:24 UTC 2015: metrics.sample.window.ms = 30000
Fri Sep 18 23:43:24 UTC 2015: schema.registry.zk.namespace = schema_registry
Fri Sep 18 23:43:24 UTC 2015: kafkastore.zk.session.timeout.ms = 30000
Fri Sep 18 23:43:24 UTC 2015: kafkastore.topic = _schemas
Fri Sep 18 23:43:24 UTC 2015: avro.compatibility.level = backward
Fri Sep 18 23:43:24 UTC 2015: shutdown.graceful.ms = 1000
Fri Sep 18 23:43:24 UTC 2015: response.mediatype.preferred = [application/vnd.schemaregistry.v1+json, application/vnd.schemaregistry+json, application/json]
Fri Sep 18 23:43:24 UTC 2015: metrics.jmx.prefix = kafka.schema.registry
Fri Sep 18 23:43:24 UTC 2015: host.name = fr-mktg-ingest02.amers1.cis.trcloud
Fri Sep 18 23:43:24 UTC 2015: metric.reporters = []
Fri Sep 18 23:43:24 UTC 2015: kafkastore.commit.interval.ms = -1
Fri Sep 18 23:43:24 UTC 2015: kafkastore.connection.url = 10.206.148.154:2181,10.206.74.110:2181
Fri Sep 18 23:43:24 UTC 2015: metrics.num.samples = 2
Fri Sep 18 23:43:24 UTC 2015: response.mediatype.default = application/vnd.schemaregistry.v1+json
Fri Sep 18 23:43:24 UTC 2015: kafkastore.topic.replication.factor = 3
Fri Sep 18 23:43:24 UTC 2015: (io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig:135)
Fri Sep 18 23:43:24 UTC 2015: [2015-09-18 23:43:24,803] INFO Initialized the consumer offset to -1 (io.confluent.kafka.schemaregistry.storage.KafkaStoreReaderThread:87)
Fri Sep 18 23:43:25 UTC 2015: [2015-09-18 23:43:25,480] WARN The replication factor of the schema topic _schemas is less than the desired one of 3. If this is a production environment, it's crucial to add more brokers and increase the replication factor of the topic. (io.confluent.kafka.schemaregistry.storage.KafkaStore:201)
Fri Sep 18 23:43:25 UTC 2015: [2015-09-18 23:43:25,549] ERROR Server died unexpectedly:  (io.confluent.kafka.schemaregistry.rest.Main:50)
Fri Sep 18 23:43:25 UTC 2015: org.apache.kafka.common.config.ConfigException: No bootstrap urls given in bootstrap.servers
Fri Sep 18 23:43:25 UTC 2015: at org.apache.kafka.common.utils.ClientUtils.parseAndValidateAddresses(ClientUtils.java:46)
Fri Sep 18 23:43:25 UTC 2015: at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:189)
Fri Sep 18 23:43:25 UTC 2015: at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:129)
Fri Sep 18 23:43:25 UTC 2015: at io.confluent.kafka.schemaregistry.storage.KafkaStore.init(KafkaStore.java:143)
Fri Sep 18 23:43:25 UTC 2015: at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.init(KafkaSchemaRegistry.java:162)
Fri Sep 18 23:43:25 UTC 2015: at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:55)
Fri Sep 18 23:43:25 UTC 2015: at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.setupResources(SchemaRegistryRestApplication.java:37)
Fri Sep 18 23:43:25 UTC 2015: at io.confluent.rest.Application.createServer(Application.java:104)
Fri Sep 18 23:43:25 UTC 2015: at io.confluent.kafka.schemaregistry.rest.Main.main(Main.java:42)

The kafka logs are empty at the time when the schema registry failed.

kafka-zookeeper logs on the two servers:

Fri Sep 18 23:43:24 UTC 2015: [2015-09-18 23:43:24,637] WARN Connection request from old client /10.206.148.154:36496; will be dropped if server is in r-o mode (org.apache.zookeeper.server.ZooKeeperServer)
Fri Sep 18 23:43:25 UTC 2015: [2015-09-18 23:43:25,886] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
Fri Sep 18 23:43:25 UTC 2015: EndOfStreamException: Unable to read additional data from client sessionid 0x14fe2d4e9c70005, likely client has closed socket
Fri Sep 18 23:43:25 UTC 2015: at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
Fri Sep 18 23:43:25 UTC 2015: at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
Fri Sep 18 23:43:25 UTC 2015: at java.lang.Thread.run(Thread.java:701)
Fri Sep 18 23:43:25 UTC 2015: [2015-09-18 23:43:25,026] WARN Connection request from old client /10.206.148.154:36105; will be dropped if server is in r-o mode (org.apache.zookeeper.server.ZooKeeperServer)
Fri Sep 18 23:43:25 UTC 2015: [2015-09-18 23:43:25,884] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
Fri Sep 18 23:43:25 UTC 2015: EndOfStreamException: Unable to read additional data from client sessionid 0x14fe2d4bb600005, likely client has closed socket
Fri Sep 18 23:43:25 UTC 2015: at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
Fri Sep 18 23:43:25 UTC 2015: at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
Fri Sep 18 23:43:25 UTC 2015: at java.lang.Thread.run(Thread.java:701)

zoltan-fedor commented 8 years ago

Sorry, I also tried the producer, and the below error was thrown:

/usr/bin/kafka-avro-console-producer \
>             --broker-list localhost:9092 --topic test \
>             --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/java/confluent-common/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/java/schema-registry/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
{"f1": "value1"}
org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385)
        at java.net.Socket.connect(Socket.java:546)
        at java.net.Socket.connect(Socket.java:495)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:178)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:427)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:213)
        at sun.net.www.http.HttpClient.New(HttpClient.java:306)
        at sun.net.www.http.HttpClient.New(HttpClient.java:325)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:955)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:891)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:809)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1050)
        at io.confluent.kafka.schemaregistry.client.rest.utils.RestUtils.httpRequest(RestUtils.java:128)
        at io.confluent.kafka.schemaregistry.client.rest.utils.RestUtils.registerSchema(RestUtils.java:174)
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:51)
        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:89)
        at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:49)
        at io.confluent.kafka.formatter.AvroMessageReader.readMessage(AvroMessageReader.java:155)
        at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:94)
        at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)

Restarting all elements of kafka yet again resolved this.
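For what it's worth, the Connection refused in the trace above is the serializer failing to reach the schema registry's REST endpoint, so a quick sanity check (assuming the default port 8081) is:

    # Returns a JSON list of registered subjects when the registry is up,
    # and fails with "Connection refused" when it is not.
    curl http://localhost:8081/subjects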

ppearcy commented 8 years ago

I've also hit this exception. This is pretty much a hunch, but it looked like a race condition where Kafka would start and open TCP connectivity but not yet be registered with zookeeper. Adding just a 5-second sleep before starting the schema registry resolved it for me.
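A plain sleep 5 works, but a slightly more robust sketch is to poll ZooKeeper for a broker registration before launching the registry. Assumptions here: zookeeper-shell and schema-registry-start from the Confluent packaging, and ZooKeeper on localhost:2181.

    #!/bin/bash
    # Wait (up to ~30s) until at least one broker id is registered in
    # ZooKeeper, then start the schema registry.
    ZK="localhost:2181"
    for i in $(seq 1 30); do
      # zookeeper-shell prints the ls result as a bracketed list, e.g. [0, 1]
      ids=$(echo "ls /brokers/ids" | /usr/bin/zookeeper-shell "$ZK" 2>/dev/null | grep '^\[' | tail -n 1)
      if [ -n "$ids" ] && [ "$ids" != "[]" ]; then
        break
      fi
      sleep 1
    done
    exec /usr/bin/schema-registry-start /etc/schema-registry/schema-registry.properties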

zoltan-fedor commented 8 years ago

Agreed, it turned out to be the same thing. I had Kafka, Zookeeper and the Schema Registry starting automatically at boot as services, and sometimes this error arose. If I delayed the start, it worked more reliably.
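In case anyone else runs their stack as boot services, here is a minimal sketch of how the delay could be wired in under systemd, with hypothetical unit names (the registry is simply ordered after Kafka, with a short sleep added):

    # /etc/systemd/system/schema-registry.service (fragment; unit names are hypothetical)
    [Unit]
    After=kafka.service
    Requires=kafka.service

    [Service]
    # crude but effective: give Kafka a moment to register with ZooKeeper
    ExecStartPre=/bin/sleep 5
    ExecStart=/usr/bin/schema-registry-start /etc/schema-registry/schema-registry.properties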

I agree with this enhancement request (https://github.com/confluentinc/schema-registry/issues/233): a better, more leading error message would be beneficial.