confluentinc / cp-docker-images

[DEPRECATED] Docker images for Confluent Platform.
Apache License 2.0

Cannot use dockerized installation from remote host - Local: Broker transport failure: localhost:9092/1: Connect to ipv4#127.0.0.1:9092 failed: Connection refused #796

holgerbrandl closed this issue 5 years ago

holgerbrandl commented 5 years ago

When using the quickstart protocol to deploy a local cluster with

git clone https://github.com/confluentinc/examples
cd examples
git checkout 5.3.1-post
cd cp-all-in-one/
docker-compose up -d --build

the broker can be used on the same host with

docker-compose exec broker kafka-topics --create --zookeeper zookeeper:2181 --replication-factor 1 --partitions 1 --topic tester

echo "huhu" | kafkacat -b localhost -t tester 

But when trying to use the broker from a different machine (dddocker02 is the hostname where the docker confluent deployment is running), it fails with:

brandl@dddocker01:~$ echo "huhu" | kafkacat -d broker -Pb dddocker02 -t tester
%7|1568801817.217|BRKMAIN|rdkafka#producer-1| [thrd::0/internal]: :0/internal: Enter main broker thread
%7|1568801817.217|STATE|rdkafka#producer-1| [thrd::0/internal]: :0/internal: Broker changed state INIT -> UP
%7|1568801817.217|BROKER|rdkafka#producer-1| [thrd:app]: dddocker02:9092/bootstrap: Added new broker with NodeId -1
%7|1568801817.217|INIT|rdkafka#producer-1| [thrd:app]: librdkafka v0.11.5 (0xb05ff) rdkafka#producer-1 initialized (debug 0x2)
%7|1568801817.217|BRKMAIN|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Enter main broker thread
%7|1568801817.217|CONNECT|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: broker in state INIT connecting
%7|1568801817.218|CONNECT|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Connecting to ipv4#192.168.1.137:9092 (plaintext) with socket 7
%7|1568801817.218|STATE|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Broker changed state INIT -> CONNECT
%7|1568801817.218|CONNECT|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Connected to ipv4#192.168.1.137:9092
%7|1568801817.218|CONNECTED|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Connected (#1)
%7|1568801817.218|FEATURE|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Updated enabled protocol features +ApiVersion to ApiVersion
%7|1568801817.218|STATE|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Broker changed state CONNECT -> APIVERSION_QUERY
%7|1568801817.220|FEATURE|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Updated enabled protocol features to MsgVer1,ApiVersion,BrokerBalancedConsumer,ThrottleTime,Sasl,SaslHandshake,BrokerGroupCoordinator,LZ4,OffsetTime,MsgVer2
%7|1568801817.220|STATE|rdkafka#producer-1| [thrd:dddocker02:9092/bootstrap]: dddocker02:9092/bootstrap: Broker changed state APIVERSION_QUERY -> UP
%7|1568801817.221|BROKER|rdkafka#producer-1| [thrd:main]: localhost:9092/1: Added new broker with NodeId 1
%7|1568801817.221|CLUSTERID|rdkafka#producer-1| [thrd:main]: dddocker02:9092/bootstrap: ClusterId update "" -> "sPefqcpZRmucNjJL4jdbCw"
%7|1568801817.221|CONTROLLERID|rdkafka#producer-1| [thrd:main]: dddocker02:9092/bootstrap: ControllerId update -1 -> 1
%7|1568801817.221|BRKMAIN|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Enter main broker thread
%7|1568801817.221|CONNECT|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: broker in state INIT connecting
%7|1568801817.221|CONNECT|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Connecting to ipv4#127.0.0.1:9092 (plaintext) with socket 10
%7|1568801817.221|STATE|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Broker changed state INIT -> CONNECT
%7|1568801817.221|TOPBRK|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Topic tester [0]: joining broker (rktp 0x7fba1c001970)
%7|1568801817.221|BROKERFAIL|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: failed: err: Local: Broker transport failure: (errno: Connection refused)
%7|1568801817.221|STATE|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Broker changed state CONNECT -> DOWN
% ERROR: Local: Broker transport failure: localhost:9092/1: Connect to ipv4#127.0.0.1:9092 failed: Connection refused
%7|1568801818.221|CONNECT|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: broker in state DOWN connecting
%7|1568801818.221|CONNECT|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Connecting to ipv4#127.0.0.1:9092 (plaintext) with socket 10
%7|1568801818.221|STATE|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Broker changed state DOWN -> CONNECT
%7|1568801818.221|BROKERFAIL|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: failed: err: Local: Broker transport failure: (errno: Connection refused)
%7|1568801818.221|FAIL|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Connect to ipv4#127.0.0.1:9092 failed: Connection refused
%7|1568801818.221|STATE|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Broker changed state CONNECT -> DOWN
%7|1568801818.221|RECONNECT|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: Delaying next reconnect by 749ms
%7|1568801818.970|CONNECT|rdkafka#producer-1| [thrd:localhost:9092/1]: localhost:9092/1: broker in state DOWN connecting

Could this be an issue with the docker deployment?

The port binding seems fine:

brandl@dddocker02:~$ netstat -a | grep 9092
tcp6       0      0 [::]:9092               [::]:*                  LISTEN
tcp6       0      0 [::]:29092              [::]:*                  LISTEN

For completeness here's the docker ps info

CONTAINER ID        IMAGE                                                 COMMAND                  CREATED             STATUS                 PORTS                                              NAMES
647bc7bf28e7        confluentinc/cp-ksql-cli:5.3.1                        "/bin/sh"                2 hours ago         Up 2 hours                                                                ksql-cli
dfb937adf677        confluentinc/ksql-examples:5.3.1                      "bash -c 'echo Waiti…"   2 hours ago         Up 2 hours                                                                ksql-datagen
6643d6d9d2f8        confluentinc/cp-enterprise-control-center:5.3.1       "/etc/confluent/dock…"   2 hours ago         Up 2 hours             0.0.0.0:9021->9021/tcp                             control-center
0189bab1ec17        confluentinc/cp-ksql-server:5.3.1                     "/etc/confluent/dock…"   2 hours ago         Up 2 hours             0.0.0.0:8088->8088/tcp                             ksql-server
1feba8366835        cnfldemos/kafka-connect-datagen:0.1.3-5.3.1           "/etc/confluent/dock…"   2 hours ago         Up 2 hours (healthy)   0.0.0.0:8083->8083/tcp, 9092/tcp                   connect
a4136efff28e        confluentinc/cp-kafka-rest:5.3.1                      "/etc/confluent/dock…"   2 hours ago         Up 2 hours             0.0.0.0:8082->8082/tcp                             rest-proxy
ff1bf8f02a4f        confluentinc/cp-schema-registry:5.3.1                 "/etc/confluent/dock…"   2 hours ago         Up 2 hours             0.0.0.0:8081->8081/tcp                             schema-registry
22865fe12916        confluentinc/cp-enterprise-kafka:5.3.1                "/etc/confluent/dock…"   2 hours ago         Up 2 hours             0.0.0.0:9092->9092/tcp, 0.0.0.0:29092->29092/tcp   broker
ad70d7c01f37        confluentinc/cp-zookeeper:5.3.1                       "/etc/confluent/dock…"   2 hours ago         Up 2 hours             2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp         zookeeper

OneCricketeer commented 5 years ago

The quickstart is only intended for single-machine configurations and is not expected to work when another machine is introduced.

That being said, on port 9092, the advertised listener's hostname is localhost.

On port 29092, the advertised listener's hostname is broker.

If you'd like all client requests to be routed over the Docker network to the broker service, use port 29092.

If you'd like services to communicate with host dddocker02, you need an advertised listener with that hostname in it.
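
As a rough illustration of the two listeners described above, here is a small Python sketch that splits an advertised-listeners string into what each port hands back to a client. The string is an assumption: it is what the 5.3.1 cp-all-in-one compose file is believed to set in KAFKA_ADVERTISED_LISTENERS, so check your own docker-compose.yml. The helper name parse_advertised_listeners is made up for this example.

```python
# Assumed value of KAFKA_ADVERTISED_LISTENERS in the quickstart compose file;
# verify it against your own docker-compose.yml.
ADVERTISED = "PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092"


def parse_advertised_listeners(value):
    """Map each port to the (listener name, host) that clients are told to use."""
    result = {}
    for entry in value.split(","):
        name, address = entry.strip().split("://", 1)
        host, port = address.rsplit(":", 1)
        result[int(port)] = (name, host)
    return result


listeners = parse_advertised_listeners(ADVERTISED)
for port, (name, host) in sorted(listeners.items()):
    print(f"port {port}: listener {name} advertises host '{host}'")
```

Neither advertised host (localhost or broker) resolves to the broker from a second machine such as dddocker01, which is consistent with the debug log in the original report.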

holgerbrandl commented 5 years ago

When using the local quickstart mode, broker access from another host is working without any issues using the default 9092 port. So the problem is tied to the docker deployment mode (which makes it a bug imho).

If broker access is limited by design to only the docker host (which would be a rather odd design imho), maybe it should be made much clearer in the docker quickstart docs?

OneCricketeer commented 5 years ago

I'd say it's not a bug; it's a (lack of) networking configuration.

Running Kafka outside of Docker works better because Kafka's source code sets up the advertised listeners to be the externally resolvable address, so that other systems in your LAN can reach it. When you add the Docker bridge network in there, and Kafka doesn't know it's in a container, it's isolated to that network unless otherwise configured. Simple port forwarding alone will not work.
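
To make the "port forwarding is not enough" point concrete, here is a toy Python model of the two-step connection seen in the debug log: the client first reaches the bootstrap address, then reconnects to whatever address the broker advertises. Everything in it is hypothetical (the REACHABLE table, the produce function, and the returned strings); it does not call any Kafka library.

```python
# Hypothetical reachability table: which (host, port) pairs accept a connection
# when dialed from each machine. On dddocker01, "localhost" is dddocker01 itself,
# where nothing listens on 9092.
REACHABLE = {
    "dddocker02": {("localhost", 9092), ("dddocker02", 9092)},
    "dddocker01": {("dddocker02", 9092)},
}


def produce(client_machine, bootstrap, advertised):
    """Mimic the two-step flow: bootstrap first, then the advertised address."""
    if bootstrap not in REACHABLE[client_machine]:
        return "bootstrap failed"
    # The metadata response names the address the broker claims to live at;
    # the client uses that address for the actual produce request.
    if advertised not in REACHABLE[client_machine]:
        return f"bootstrap ok, then refused at {advertised[0]}:{advertised[1]}"
    return "ok"


# Same host as the broker: localhost works for both steps.
print(produce("dddocker02", ("localhost", 9092), ("localhost", 9092)))
# Remote host: the forwarded port answers, but the advertised address does not.
print(produce("dddocker01", ("dddocker02", 9092), ("localhost", 9092)))
```

In this model the remote client only succeeds once the advertised address is one it can actually reach, for example ("dddocker02", 9092).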

If you'd like the same behavior from Docker, you'd have to adjust one configuration that is very specific to your own system / environment to include your own IP/Hostname.

holgerbrandl commented 5 years ago

Since the Confluent UI is accessible from other non-docker-host machines, it gives the incorrect impression that the quickstart docker deployment is fully functional.

Is the additional configuration needed to enable communication from other hosts with a dockerized quickstart broker documented in the manual (or elsewhere)? I'd guess it's not just me who has a basic idea of Kafka and Docker but struggles with setting up such networking details.

OneCricketeer commented 5 years ago

A UI serving static HTML doesn't necessarily mean that each backing API or system is responsive and healthy.

Yes, advertised.listeners is documented in the Kafka documentation as well as within the server.properties file.

And no, it's not just you. The same problem is seemingly posted multiple times per week on Stack Overflow. And it prompted a blog post: https://www.confluent.io/blog/kafka-listeners-explained

But there's no good place to really consolidate the information. The quickstart is simply meant to get it all working locally, as a POC or demo, but I agree it could have a note saying as much. If you really want to scale it out, there are obvious networking considerations that need to be taken into account.

holgerbrandl commented 5 years ago

Thanks for sharing this article. It perfectly described the underlying Kafka concepts and spelled out the solution to my issue (which was simply to replace localhost with the actual hostname of the docker host in the YAML).
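
For anyone landing here later, the change described above can be sketched in Python as a rewrite of the advertised-listeners value. This is only an illustration under assumptions: the starting string is what the 5.3.1 cp-all-in-one compose file is believed to contain, the listener name PLAINTEXT_HOST comes from that assumed string, and advertise_host is a hypothetical helper. The actual fix in the thread was a manual edit of the compose file.

```python
def advertise_host(value, listener_name, new_host):
    """Return the listener string with one listener's host replaced."""
    rewritten = []
    for entry in value.split(","):
        name, address = entry.strip().split("://", 1)
        host, port = address.rsplit(":", 1)
        if name == listener_name:
            host = new_host
        rewritten.append(f"{name}://{host}:{port}")
    return ",".join(rewritten)


# Assumed original value; check your own docker-compose.yml.
before = "PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092"
# dddocker02 is the docker host from this issue; use your own hostname or IP.
after = advertise_host(before, "PLAINTEXT_HOST", "dddocker02")
print("KAFKA_ADVERTISED_LISTENERS:", after)
```

The internal listener on port 29092 is left untouched, so the other containers in the compose project keep talking to the broker over the Docker network. The containers need to be recreated for the new value to take effect.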