spotify / docker-kafka

Kafka (and Zookeeper) in Docker
Apache License 2.0

Minor update to README #38

Closed massenz closed 7 years ago

massenz commented 8 years ago

Hey guys, super-thanks for creating this, saved me a TON of time!

I just noticed that the instructions in the README could be slightly improved by using the commands below. I was having trouble running this on a VM where I hadn't installed Kafka, so I wanted to run the consumer and the producer in two separate containers, still using this image, while avoiding the "hack" of using localhost everywhere.

I also believe that Kafka will have trouble if the ADVERTISED_HOST does not reflect the IP of the container.

Anyways, this is how I ran it - please let me know if you would like me to update the README and I'd be happy to submit a PR.

# Launch the "main" Kafka container, this runs the server;
# it also changes the `hostname` to "kafka" and sets an appropriate entry to /etc/hosts:
$ docker run --rm --env ADVERTISED_HOST=kafka --env ADVERTISED_PORT=9092 \
    --name kafka -h kafka spotify/kafka 

To launch the Producer (this used to fail with a kafka.common.LeaderNotAvailableException when posting to a topic):

$ docker run --rm -it --name producer --link kafka spotify/kafka \
     /opt/kafka_2.11-0.8.2.1/bin/kafka-console-producer.sh --broker kafka:9092 --topic test

Similarly, the consumer can be run with:

$ docker run --rm -it --name consumer --link kafka spotify/kafka \
    /opt/kafka_2.11-0.8.2.1/bin/kafka-console-consumer.sh --zookeeper kafka:2181 --topic test

Also, I would suggest adding the /opt/kafka_xxx directory to the container's PATH in the Dockerfile (again, happy to send a PR, if you guys want me to).

Hope this helps!

tuhingupta commented 8 years ago

What I have noticed, running the spotify/kafka Docker instance behind a corporate firewall, is that posting messages to a topic with kafka-console-producer.sh always fails if you set ADVERTISED_HOST to your Docker container ID, or to any variation of it.

Inside the Docker image, /etc/hosts maps the local IP 127.0.0.1 to localhost, so I found it useful to point ADVERTISED_HOST to localhost when running the container.

Then everything worked well.

I hope you can put this comment on the wiki or somewhere similar. It is not mentioned anywhere, and I only stumbled on it through trial and error. This applies only to situations where you are running Docker behind a corporate firewall and need to forward all requests using VM port forwarding.

massenz commented 8 years ago

Thanks for the comment, @tuhingupta! I'm not sure I follow the reasoning, though. In particular, I don't see how the "corporate firewall" comes into play here (if you are running this on a Windows box, no surprise there: I have no idea how that works!)

In particular, using localhost only works if you run Kafka as a container and then run the producer/consumer processes inside the same container (via docker exec). Otherwise, you need a way to tell both which IP address to use: the --link (or the newer --net) flag is the best way. (However, the approach suggested in the README, using boot2docker, has been deprecated for quite some time now.)

Docker will modify the /etc/hosts file of the container(s) following the use of the -h and --link flags, so this all "just works".

If you do run the "producer" from your dev box (eg, laptop) then, obviously, you need a way to reach the VM's IP (depending on what you are using, whether Virtualbox or the newer Docker for Mac/Windows, there are different ways to reach it).

Again, I'm not sure how Kafka is going to use ADVERTISED_HOST, but I'm pretty sure that pointing it to localhost will make Kafka unreachable from other containers/hosts (different containers have different 'localhosts' even if they run on the same VM/box).
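
To illustrate the point about /etc/hosts and per-container localhost, here is a minimal sketch that needs no Docker at all. It writes a made-up hosts file of the kind a container started with `--link kafka` might see, then resolves two names from it. The 172.17.0.x addresses and the `lookup` helper are assumptions for illustration only, not something taken from the image.

```shell
#!/bin/sh
# Hypothetical /etc/hosts of a container started with `--link kafka`.
# The 172.17.0.x addresses are invented for this example.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
127.0.0.1   localhost
172.17.0.2  kafka
172.17.0.3  producer
EOF

# Print the address of the first line that lists the given name.
lookup() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }   # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$hosts_file"
}

kafka_ip=$(lookup kafka)
local_ip=$(lookup localhost)
echo "kafka     -> $kafka_ip"
echo "localhost -> $local_ip"

rm -f "$hosts_file"
```

In this sketch `kafka` resolves to the other container's address, while `localhost` resolves to 127.0.0.1, that is, to the container doing the lookup. That is why a broker advertised as localhost would not be reachable from a separate producer or consumer container.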

fffergal commented 8 years ago

Note if you are using docker-compose, and are expecting ADVERTISED_HOST of "kafka" to work if you have a service called "kafka", it won't. docker-compose doesn't add the service's own name to hosts. You can set hostname in docker-compose.yml though, and use that for ADVERTISED_HOST. e.g.:

kafka:
  image: spotify/kafka
  hostname: kafka
  environment:
    ADVERTISED_HOST: kafka
    ADVERTISED_PORT: 9092
  ports:
  - "9092:9092"
  - "2181:2181"

mbehrisch commented 8 years ago

+1

arianitu commented 7 years ago

This is what you should do if you want to talk to kafka from another container

Run Kafka:

docker run --rm --env ADVERTISED_HOST=kafka --env ADVERTISED_PORT=9092 --name kafka -h kafka spotify/kafka

Run Kafka Consumer:

docker run --rm -it --name kafka-consumer --link kafka spotify/kafka /bin/sh -c '$KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic test'

Run Kafka Producer:

docker run --rm -it --name kafka-producer --link kafka spotify/kafka /bin/sh -c '$KAFKA_HOME/bin/kafka-console-producer.sh --broker-list kafka:9092 --topic test'

This is what you should do if you want to talk to kafka from localhost

docker run --rm --env ADVERTISED_HOST=localhost -p 2181:2181 -p 9092:9092 --env ADVERTISED_PORT=9092 --name kafka -h kafka spotify/kafka

Run Kafka Consumer (you must run this on your localhost machine, not using docker run):

kafka-console-consumer --bootstrap-server localhost:9092 --topic test

Run Kafka Producer (you must run this on your localhost machine, not using docker run):

kafka-console-producer --broker-list localhost:9092 --topic test
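
As a rough summary of the two setups above, the sketch below wraps both broker commands in a small dry-run helper. The function name `kafka_run_cmd` and its `container`/`host` modes are made up for this example; it only prints the `docker run` line so you can inspect it before running anything.

```shell
#!/bin/sh
# kafka_run_cmd MODE: print (do not execute) the broker command for a mode.
#   container - clients run in other containers and reach the broker as "kafka"
#   host      - clients run on the Docker host and reach it as "localhost"
# The helper name and mode names are hypothetical.
kafka_run_cmd() {
    case "$1" in
        container)
            echo "docker run --rm --env ADVERTISED_HOST=kafka --env ADVERTISED_PORT=9092 --name kafka -h kafka spotify/kafka"
            ;;
        host)
            echo "docker run --rm --env ADVERTISED_HOST=localhost --env ADVERTISED_PORT=9092 -p 2181:2181 -p 9092:9092 --name kafka -h kafka spotify/kafka"
            ;;
        *)
            echo "usage: kafka_run_cmd container|host" >&2
            return 1
            ;;
    esac
}

kafka_run_cmd container
kafka_run_cmd host
```

The only differences between the two modes are the value of ADVERTISED_HOST and the published ports, which matches the two variants described above.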

SquirrelNinja commented 7 years ago

ran this: docker run -d --rm --env ADVERTISED_HOST=kafka --env ADVERTISED_PORT=9092 --name kafka -h kafka spotify/kafka

ran this: docker run --rm -it --name kafka-consumer --link kafka spotify/kafka /bin/sh -c '\$KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic test'

ERROR: /bin/sh: 1: $KAFKA_HOME/bin/kafka-console-consumer.sh: not found

Ran this: docker run --rm -it --name kafka-consumer --link kafka spotify/kafka /bin/sh -c 'ls $KAFKA_HOME/bin'

result: connect-distributed.sh kafka-replica-verification.sh connect-standalone.sh kafka-run-class.sh kafka-acls.sh kafka-server-start.sh kafka-configs.sh kafka-server-stop.sh kafka-console-consumer.sh kafka-simple-consumer-shell.sh kafka-console-producer.sh kafka-streams-application-reset.sh kafka-consumer-groups.sh kafka-topics.sh kafka-consumer-offset-checker.sh kafka-verifiable-consumer.sh kafka-consumer-perf-test.sh kafka-verifiable-producer.sh kafka-mirror-maker.sh windows kafka-preferred-replica-election.sh zookeeper-security-migration.sh kafka-producer-perf-test.sh zookeeper-server-start.sh kafka-reassign-partitions.sh zookeeper-server-stop.sh kafka-replay-log-producer.sh zookeeper-shell.sh

Not sure where to go now...

arianitu commented 7 years ago

@SquirrelNinja it appears that $KAFKA_HOME is not being expanded for you, this is the error I usually get if the path is wrong:

/bin/sh: 1: /opt/kafka_2.11-0.10.1.0/bin/kafka-console-consumerdzsda.sh: not found
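
The quoting behaviour behind that "not found" error can be reproduced locally, without Docker. This is a minimal sketch: the KAFKA_HOME value is a stand-in chosen for the example, and the inner `/bin/sh -c` plays the role of the shell inside the container.

```shell
#!/bin/sh
# Stand-in for the variable defined inside the image; the value is invented.
KAFKA_HOME=/opt/kafka-demo
export KAFKA_HOME

# Single quotes with a backslash: the inner shell receives \$KAFKA_HOME and
# treats the dollar sign as a literal character, so nothing is expanded.
escaped=$(/bin/sh -c 'echo \$KAFKA_HOME/bin')

# Single quotes without a backslash: the outer shell leaves the text alone
# and the inner shell expands the variable.
plain=$(/bin/sh -c 'echo $KAFKA_HOME/bin')

# Double quotes with a backslash: the outer shell strips the backslash and
# passes $KAFKA_HOME through, so the inner shell expands it.
double=$(/bin/sh -c "echo \$KAFKA_HOME/bin")

echo "single quotes, backslash:    $escaped"
echo "single quotes, no backslash: $plain"
echo "double quotes, backslash:    $double"
```

Only the first form leaves the literal text `$KAFKA_HOME/bin`, which is the path the shell then fails to find. So the backslash is needed with double quotes but gets in the way with single quotes.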

SquirrelNinja commented 7 years ago

@arianitu - you are correct, I put the whole path in and it was fine - thx!

massenz commented 7 years ago

@SquirrelNinja my best guess is that you are missing a \ in the command just before the $ - the way you built it, $KAFKA_HOME is being expanded by your host box which, even in the unlikely event that it had it defined, would probably point someplace else.

FWIW - I prefer to use docker exec -it kafka /bin/bash and try things out there at the console, so you can see what's what (eg, run /usr/bin/env and see what env vars are defined and to what values).

massenz commented 7 years ago

BTW - as this project seems "abandoned" to me (there are 22 PRs pending), I have forked my own, which has also recently been updated to run Kafka 0.10.1.1.

SquirrelNinja commented 7 years ago

@massenz - thanks!