dpkp / kafka-python

Python client for Apache Kafka
http://kafka-python.readthedocs.io/
Apache License 2.0
5.62k stars 1.41k forks

kafka.errors.NoBrokersAvailable exception when running producer example on Mac #1308

Closed vkroz closed 5 years ago

vkroz commented 6 years ago

Running a single-node Kafka cluster on localhost on a Mac (OS X 10.11.6). Getting an error when attempting to instantiate a producer:

>>> from kafka import KafkaProducer
>>> producer = KafkaProducer(bootstrap_servers=['localhost:9092'])

getting this error:

  File "<stdin>", line 1, in <module>
  File "/Users/user1/anaconda/envs/myenv/lib/python2.7/site-packages/kafka/producer/kafka.py", line 347, in __init__
    **self.config)
  File "/Users/user1/anaconda/envs/myenv/lib/python2.7/site-packages/kafka/client_async.py", line 221, in __init__
    self.config['api_version'] = self.check_version(timeout=check_timeout)
  File "/Users/user1/anaconda/envs/myenv/lib/python2.7/site-packages/kafka/client_async.py", line 826, in check_version
    raise Errors.NoBrokersAvailable()
kafka.errors.NoBrokersAvailable: NoBrokersAvailable

Kafka is up and running locally, and a producer from confluent-kafka-python works without issues. Any suggestions on what to look for?

server.properties:
. . . 
listeners=PLAINTEXT://localhost:9092
. . .
KIC commented 6 years ago

I have the same problem on windows using Kafka 1.0.0

jeffwidman commented 6 years ago

@vkroz what version of kafka-python / kafka brokers?

Can you share the code snippet you used for confluent-kafka-python to connect?

vkroz commented 6 years ago

@jeffwidman It's just 2 lines of code in Python console as shown above.

And apparently it has something to do with the setup of this specific cluster. When I try to connect to a different, remote Kafka cluster, it works fine.

jeffwidman commented 6 years ago

Are there differences between the server.properties of the two clusters? I'd start by checking the listeners and advertised.listeners configs.
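For reference, the two settings in question live in server.properties; a minimal example (the hostname here is a placeholder):

```
# Interface/port the broker binds to; 0.0.0.0 binds all interfaces
listeners=PLAINTEXT://0.0.0.0:9092
# Address the broker hands back to clients in metadata responses;
# it must be resolvable and reachable from the client machine
advertised.listeners=PLAINTEXT://broker.example.com:9092
```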

vkroz commented 6 years ago

Here are the server properties used to run the 3-node cluster on localhost -- the one the python client is failing to connect to:

https://gist.github.com/vkroz/5ee7854448fdd96ab619a7f7380fa0c2

And these configs are fully operational: the cluster is accessible by Java producers/consumers and all the standard Kafka tools.

jeffwidman commented 6 years ago

These look fine to me, nothing amiss. Can you share the snippet you use to connect using confluent-kafka-python? If you're running from the exact same machine with the same broker address, it's weird to me that the confluent client can connect but kafka-python can't. Are you perhaps running the confluent/Java code from a different container/VM than where you run kafka-python?

dpkp commented 6 years ago

what does this show:

python -c 'import socket; print socket.getaddrinfo("localhost", 9092)'
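On Python 3, where print is a function, an equivalent check might look like this; it simply lists every address the resolver returns for localhost:

```python
import socket

# Each entry is (family, type, proto, canonname, sockaddr); on a
# dual-stack machine "localhost" often yields ::1 before 127.0.0.1.
for family, _, _, _, sockaddr in socket.getaddrinfo("localhost", 9092):
    print(family, sockaddr)
```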
jar349 commented 6 years ago

Version 1.3.5 of this library (which is the latest on PyPI) only lists API versions from 0.8.0 to 0.10.1. So unless you explicitly specify api_version to be (0, 10, 1), the client library's attempt to discover the version will cause a NoBrokersAvailable error.

an example from my code:

producer = KafkaProducer(
    bootstrap_servers=URL,
    client_id=CLIENT_ID,
    value_serializer=JsonSerializer.serialize,
    api_version=(0, 10, 1)
)
dpkp commented 6 years ago

Although setting api_version may appear to fix the problem, this assessment is incorrect:

Version 1.3.5 of this library (which is the latest on PyPI) only lists API versions from 0.8.0 to 0.10.1. So unless you explicitly specify api_version to be (0, 10, 1), the client library's attempt to discover the version will cause a NoBrokersAvailable error.

The issue is not the version check, it is the TCP socket connection itself. When kafka-python can connect to a 1.0.0 broker, it is still identified as (0, 10, 1). The only thing you achieve by setting api_version explicitly is that the client never actually tries to open the TCP socket during startup. But that probably just means your connection issue will surface later, when you try to send or receive messages.

The deeper issue here is a real connection bug that I believe has to do with using all available dns lookup data during bootstrap.

jar349 commented 6 years ago

When I explicitly set the api_version, I am able to produce and consume events.

dpkp commented 6 years ago

I believe the underlying issue is related to handling multiple network interfaces. My mac, for example, shows three addrinfos for localhost:

('::1', 9092, 0, 0)
('fe80::1%lo0', 9092, 0, 1)
('127.0.0.1', 9092)

The bootstrap and version check code currently only attempt connection on the first addrinfo returned, so if the broker is bound to a later addrinfo then the connection will fail. I fixed this in #1411 .
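The fix can be illustrated with a small stdlib-only sketch (this is not the actual #1411 code, just the idea): instead of stopping at the first addrinfo, try each resolved address in turn until a TCP connection succeeds.

```python
import socket

def connect_any(host, port, timeout=1.0):
    """Try every address (host, port) resolves to, returning the first
    socket that connects. A broker bound only to 127.0.0.1 is then still
    reachable even if the resolver lists ::1 first."""
    last_err = None
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock  # first address that accepts the connection wins
        except OSError as exc:
            last_err = exc
            sock.close()
    raise last_err if last_err else OSError("no addresses resolved")
```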

jar349 commented 6 years ago

May I please ask why explicitly setting the api_version always works for me, but removing it always fails? It doesn't seem to me that the value of api_version would in any way affect which network interface is used.

dpkp commented 6 years ago

When you set api_version, the client does not attempt to probe brokers for version information, so it is the probe operation that is failing. One large difference between the version-probe connections and the general connections is that the former only attempts to connect on a single interface per connection (per broker), whereas the latter -- general operation -- will cycle through all interfaces continually until a connection succeeds. #1411 fixes this by switching the version-probe logic to attempt a connection on all found interfaces.

jar349 commented 6 years ago

That makes perfect sense, thank you! 👍

dpkp commented 6 years ago

This is fixed on master

jeffwji commented 6 years ago

@dpkp, I'm using 1.4.2 and still encountering this issue. Has the fix on the master branch been released?

nahidalam commented 6 years ago

@dpkp same here, using 1.4.2 (installed on March 27, 2018) still getting the error on Macbook

suesunss commented 6 years ago

@dpkp Same problem, using 1.4.2: if the version is not specified, the exception is raised.

yukurkov commented 6 years ago

I fixed this error in the Kafka config by adding a 'custom kafka broker' entry with the key-value: advertised.listeners=PLAINTEXT://01.02.03.04:1234

jeffwidman commented 6 years ago

@jeffwji / @nahidalam / @suesunss can you provide more info?

Will it work fine if you specify the version in kafka-python, and fail if you omit the version?

Is another client working fine (to verify that the broker is indeed running properly)?

suesunss commented 6 years ago

@jeffwidman hello, I am using kafka-python 1.4.2. The problem is: when I use the new client interface (client_async.py), API_VERSION must be specified, whereas with the old interface (client.py) it works fine.

hariseldonn commented 6 years ago

@jar349, specifying the version number as you instructed solved the 'NoBrokersAvailable' error.

giladsh1 commented 6 years ago

After investigating this issue for a while: if you're using wurstmeister/kafka with docker-compose, please note that in the latest Kafka version many parameters have been deprecated. Instead of using -

KAFKA_HOST:
KAFKA_PORT: 9092
KAFKA_ADVERTISED_HOST_NAME: <IP-ADDRESS>
KAFKA_ADVERTISED_PORT: 9092

you need to use -

KAFKA_LISTENERS: PLAINTEXT://:9092
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://<IP-ADDRESS>:9092

view this link for more details

bmamouri commented 5 years ago

I am having the same problem using hosted Kafka (IBM Cloud) and version 1.4.4 of kafka-python.

If I don't specify api_version, I can't connect the consumer; and when I do specify it, my consumer does not receive any messages.

mrzhangboss commented 5 years ago

I ran into this problem on CentOS 7 and fixed it with the following code:

producer = KafkaProducer(bootstrap_servers='localhost:9092',
                         request_timeout_ms=1000000,
                         api_version_auto_timeout_ms=1000000)

It seems increasing the api_version_auto_timeout_ms value may fix this problem.
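One way to check whether a slow handshake (rather than an unreachable broker) is what the longer timeout is compensating for is to time a raw TCP connect yourself; a small diagnostic sketch, independent of kafka-python:

```python
import socket
import time

def time_connect(host, port, timeout=5.0):
    """Return how many seconds a bare TCP connect to (host, port) takes.
    If this is slow, the default version-probe timeout can expire before
    the handshake finishes, surfacing as NoBrokersAvailable."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.monotonic() - start
```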

peacecoder commented 5 years ago

@yukurkov @giladsh1, one of the solutions is what you found. Thanks!

Aireed commented 5 years ago

@mrzhangboss works.

xiaowuyz commented 5 years ago

After investigating this issue for a while: if you're using wurstmeister/kafka with docker-compose, please note that in the latest Kafka version many parameters have been deprecated. Instead of using -

KAFKA_HOST:
KAFKA_PORT: 9092
KAFKA_ADVERTISED_HOST_NAME: <IP-ADDRESS>
KAFKA_ADVERTISED_PORT: 9092

you need to use -

KAFKA_LISTENERS: PLAINTEXT://:9092
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://<IP-ADDRESS>:9092

view this link for more details

How do you configure docker-compose? This is my docker-compose.yml:

version: '2'

services:
  zookeeper:
    container_name: zookeeper
    image: 'wurstmeister/zookeeper'
    ports:
      - '2181:2181'
    volumes:
      - ./zookeeper-data:/bitnami/zookeeper
      - /root/kafka/kafka-topics.sh:/kafka-topics.sh
  kafka:
    container_name: kafka
    image: 'wurstmeister/kafka'
    ports:
      - '9092:9092'
    environment:
      KAFKA_LISTENERS: 'PLAINTEXT://:9092'
      KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://localhost:9092'
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
    volumes:
      - ./kafka-persistence:/bitnami/kafka
      - /root/kafka/kafka-topics.sh:/kafka-topics.sh
    depends_on:
      - zookeeper
diarmuidcire commented 5 years ago

I had the same problem with kafka_2.12-2.1.0. The issue was caused by multiple network interfaces on the virtual machine image, as commented by @dpkp. I recreated the VM image with only one network interface, reinstalled the application (OSM Release 4), and everything ran as expected.

The logs of two service replicas were interleaved. Separated, the healthy replica (osm_pm.1.kfdvv4by3ovs@osm-r4) logged on 02/13/2019:

10:28:44 PM Connecting to Kafka server at kafka:9092
10:28:44 PM Bootstrapping cluster metadata from [('kafka', 9092, <AddressFamily.AF_UNSPEC: 0>)]
10:28:44 PM : connecting to 10.0.0.17:9092
10:28:44 PM Bootstrap succeeded: found 1 brokers and 0 topics.
10:28:44 PM : Closing connection.
10:28:44 PM : connecting to 10.0.0.17:9092
10:28:44 PM Broker version identifed as 0.11.0
10:28:44 PM Set configuration api_version=(0, 11, 0) to skip auto check_version requests on startup
10:28:44 PM Updating subscribed topics to: ['lcm_pm', 'alarm_response']
10:28:45 PM Topic alarm_response is not available during auto-create initialization
10:28:45 PM Topic lcm_pm is not available during auto-create initialization
10:28:48 PM Group coordinator for pm-consumer is BrokerMetadata(nodeId=1001, host='kafka', port=9092, rack=None)
10:28:48 PM Discovered coordinator 1001 for group pm-consumer
10:28:48 PM Revoking previously assigned partitions set() for group pm-consumer
10:28:48 PM (Re-)joining group pm-consumer

while the failing replica (osm_pm.1.f5e5rgn6ydhs@osm-r4) raised:

Traceback (most recent call last):
  File "/usr/local/bin/osm-policy-agent", line 9, in <module>
    load_entry_point('osm-policy-module==1.0', 'console_scripts', 'osm-policy-agent')()
  File "/usr/local/lib/python3.5/dist-packages/osm_policy_module/cmd/policy_module_agent.py", line 66, in main
    agent.run()
  File "/usr/local/lib/python3.5/dist-packages/osm_policy_module/core/agent.py", line 51, in run
    group_id="pm-consumer")
  File "/usr/local/lib/python3.5/dist-packages/kafka/consumer/group.py", line 324, in __init__
    self._client = KafkaClient(metrics=self._metrics, **self.config)
  File "/usr/local/lib/python3.5/dist-packages/kafka/client_async.py", line 221, in __init__
    self.config['api_version'] = self.check_version(timeout=check_timeout)
  File "/usr/local/lib/python3.5/dist-packages/kafka/client_async.py", line 826, in check_version
    raise Errors.NoBrokersAvailable()
kafka.errors.NoBrokersAvailable: NoBrokersAvailable

satwikk commented 5 years ago

If you are using the kafka-python library, initialize your producer by specifying api_version. In my case it was "2":

import kafka
from json import dumps

producer = kafka.KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda x: dumps(x).encode('utf-8'),
    api_version=(2, 0),  # note: must be a tuple -- (2) is just the int 2
)
jeffwidman commented 5 years ago

I'm going to close this, as it seems the underlying issue was fixed and this has become a catch-all... Reading through the most recent comments, most of the reports that have been traced to root causes are due to environment problems, not library bugs.

If you are stumbling across this through searching, and can't figure it out, please open a new ticket instead of commenting on this one.