confluentinc / cp-docker-images

[DEPRECATED] Docker images for Confluent Platform.
Apache License 2.0

Kafka DNS resolution failed #708

Closed justpolidor closed 4 years ago

justpolidor commented 5 years ago

When using the Kafka headless service in the Kafka advertised listeners, I see:

[2019-03-14 14:34:01,736] WARN The replication factor of topic __confluent.support.metrics is 1, which is less than the desired replication factor of 3.  If you happen to add more brokers to this cluster, then it is important to increase the replication factor of the topic to eventually 3 to ensure reliable and durable metrics collection. (io.confluent.support.metrics.common.kafka.KafkaUtilities)
[2019-03-14 14:34:01,760] INFO ProducerConfig values:
        acks = 1
        batch.size = 16384
        bootstrap.servers = [PLAINTEXT://kafkabroker-0.kafkabroker-headless.test:9092, PLAINTEXT://kafkabroker-1.kafkabroker-headless.test:9092, PLAINTEXT://kafkabroker-2.kafkabroker-headless.test:9092]
        buffer.memory = 33554432
        client.dns.lookup = default
        client.id =
        compression.type = none
        connections.max.idle.ms = 540000
        delivery.timeout.ms = 120000
        enable.idempotence = false
        interceptor.classes = []
        key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
        linger.ms = 0
        max.block.ms = 10000
        max.in.flight.requests.per.connection = 5
        max.request.size = 1048576
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
        receive.buffer.bytes = 32768
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 30000
        retries = 2147483647
        retry.backoff.ms = 100
        sasl.client.callback.handler.class = null
        sasl.jaas.config = null
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.login.callback.handler.class = null
        sasl.login.class = null
        sasl.login.refresh.buffer.seconds = 300
        sasl.login.refresh.min.period.seconds = 60
        sasl.login.refresh.window.factor = 0.8
        sasl.login.refresh.window.jitter = 0.05
        sasl.mechanism = GSSAPI
        security.protocol = PLAINTEXT
        send.buffer.bytes = 131072
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        ssl.endpoint.identification.algorithm = https
        ssl.key.password = null
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = null
        ssl.keystore.password = null
        ssl.keystore.type = JKS
        ssl.protocol = TLS
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = null
        ssl.truststore.password = null
        ssl.truststore.type = JKS
        transaction.timeout.ms = 60000
        transactional.id = null
        value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
 (org.apache.kafka.clients.producer.ProducerConfig)
[2019-03-14 14:34:01,790] WARN Couldn't resolve server PLAINTEXT://kafkabroker-0.kafkabroker-headless.test:9092 from bootstrap.servers as DNS resolution failed for kafkabroker-0.kafkabroker-headless.test (org.apache.kafka.clients.ClientUtils)
[2019-03-14 14:34:01,798] WARN Couldn't resolve server PLAINTEXT://kafkabroker-1.kafkabroker-headless.test:9092 from bootstrap.servers as DNS resolution failed for kafkabroker-1.kafkabroker-headless.test (org.apache.kafka.clients.ClientUtils)
[2019-03-14 14:34:01,804] WARN Couldn't resolve server PLAINTEXT://kafkabroker-2.kafkabroker-headless.test:9092 from bootstrap.servers as DNS resolution failed for kafkabroker-2.kafkabroker-headless.test (org.apache.kafka.clients.ClientUtils)
[2019-03-14 14:34:01,804] INFO [Producer clientId=producer-1] Closing the Kafka producer with timeoutMillis = 0 ms. (org.apache.kafka.clients.producer.KafkaProducer)
[2019-03-14 14:34:01,807] ERROR Could not submit metrics to Kafka topic __confluent.support.metrics: Failed to construct kafka producer (io.confluent.support.metrics.BaseMetricsReporter)
[2019-03-14 14:34:03,321] INFO Successfully submitted metrics to Confluent via secure endpoint (io.confluent.support.metrics.submitters.ConfluentSubmitter)

Kafka image version: 5.1.0, and the statefulset.yaml/service.yaml are the same as in the repo.
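For reference, the per-pod names in the log above (`kafkabroker-0.kafkabroker-headless.test`, etc.) only resolve if a headless Service exists with the matching name and namespace. A minimal sketch of such a Service is below; the `name` and `namespace` are taken from the log, while the labels and port layout are assumptions:

```yaml
# Hypothetical headless Service matching the broker hostnames in the log.
# kafkabroker-0.kafkabroker-headless.test resolves only if a Service named
# "kafkabroker-headless" with clusterIP: None exists in namespace "test".
apiVersion: v1
kind: Service
metadata:
  name: kafkabroker-headless
  namespace: test
spec:
  clusterIP: None        # headless: DNS returns the pod IPs directly
  selector:
    app: kafkabroker     # assumed label on the StatefulSet pods
  ports:
    - name: broker
      port: 9092
```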

OneCricketeer commented 4 years ago

Could you please clarify what code is throwing this exception? Is it deployed as a service as well?

iNithinPrasad commented 4 years ago

I have the same issue. Below is the command and the error received. I have masked my provider name.

seq 10000 | kafka-console-producer \
  --topic example \
  --broker-list kafka.providername:9092 \
  --producer.config kafka.properties

[2019-11-11 00:37:31,563] WARN Couldn't resolve server kafka.providername:9092 from bootstrap.servers as DNS resolution failed for kafka.providername (org.apache.kafka.clients.ClientUtils)
org.apache.kafka.common.KafkaException: Failed to construct kafka producer
        at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:433)
        at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:298)
        at kafka.tools.ConsoleProducer$.main(ConsoleProducer.scala:45)
        at kafka.tools.ConsoleProducer.main(ConsoleProducer.scala)
Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
        at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:88)
        at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:47)
        at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:408)

OneCricketeer commented 4 years ago

If you're not able to simply ping kafka.providername from where you run the command, it won't work.

The containers work fine, but you must configure your own network appropriately.
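A quick way to confirm this is to check DNS resolution from the same environment the producer runs in, before blaming the brokers. A minimal sketch in Python (the hostnames below are the ones from the log above; substitute whatever you put in `bootstrap.servers`):

```python
import socket

def can_resolve(host: str) -> bool:
    """Return True if `host` resolves to at least one address."""
    try:
        socket.getaddrinfo(host, 9092)
        return True
    except socket.gaierror:
        return False

# Check every host listed in bootstrap.servers before starting a producer.
bootstrap = [
    "kafkabroker-0.kafkabroker-headless.test",
    "kafkabroker-1.kafkabroker-headless.test",
]
unresolved = [h for h in bootstrap if not can_resolve(h)]
if unresolved:
    print("DNS resolution failed for:", unresolved)
```

If any host appears in `unresolved`, the producer will fail with exactly the "No resolvable bootstrap urls" error shown above, regardless of whether the brokers themselves are healthy.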

iNithinPrasad commented 4 years ago

Thank you. I did a quick nslookup and it confirmed the same. I am running these Confluent containers in an AKS cluster, and an addition to my Azure DNS config should solve the issue. I will let you know if that works out.

C:\Users\Nithin.Prasad>nslookup kafka.providername
DNS request timed out.
    timeout was 2 seconds.
Server:  UnKnown
Address:  0.0.0.0

DNS request timed out.
    timeout was 2 seconds.

justpolidor commented 4 years ago

Hi all,

sorry for my late reply. My problem was due to misconfigured network settings on K8s. It is working now.

Thanks

iNithinPrasad commented 4 years ago

@justpolidor : Can you please share the resolution steps you took? Thanks.

anishprobable commented 4 years ago

@iNithinPrasad Have you got the resolution ? I am trying to install confluent schema registry on AKS through confluent helm chart and facing same issue.

iNithinPrasad commented 4 years ago

@anishprobable : Tried adding a DNS entry, but it still did not work. I downloaded the Confluent Docker image and ran it on a VM. At least I got my app running for a prototype.

anishprobable commented 4 years ago

@iNithinPrasad I also got it running. I noticed that I was using an incorrect service name instead of the actual one. When I changed the service name in the publish/consume command, everything worked fine.
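A wrong service or namespace segment fails exactly this way, because in-cluster Kubernetes DNS names follow a fixed pattern. A small helper that spells out the expected names (pure string construction, no Kubernetes API; all identifiers here are examples):

```python
def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """DNS name a client inside the cluster can use for a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"

def statefulset_pod_fqdn(pod: str, headless_service: str, namespace: str,
                         cluster_domain: str = "cluster.local") -> str:
    """Stable per-pod DNS name, available only through a headless Service."""
    return f"{pod}.{headless_service}.{namespace}.svc.{cluster_domain}"

print(service_fqdn("kafka", "providername"))
print(statefulset_pod_fqdn("kafkabroker-0", "kafkabroker-headless", "test"))
```

The short forms in the logs above (e.g. `kafkabroker-0.kafkabroker-headless.test`) work from inside a pod because the pod's resolv.conf search domains append `svc.cluster.local`; from outside the cluster, neither the short nor the full name resolves without extra DNS setup.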

houshunwei commented 4 years ago

> Hi all,
>
> sorry for my late reply. My problem was related to misconfigured network settings on K8S. Now it is working.
>
> Thanks

Hello, I have the same problem. Could you tell me what network settings were misconfigured on your K8s cluster?

mohankrishna3 commented 3 years ago

My Kafka pods are crashing with the same DNS issue in AKS.

ravindrabhargava commented 3 years ago

Hi @justpolidor, did you resolve the issue? I am facing the same errors with my Kafka running on GKE as well. I checked my pods and services and they seem fine to me. How did you resolve your issue? Please share your experience. Thanks