moooofly / MarkSomethingDownLLS

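This repository records knowledge accumulated across various areas since October 16, 2017, during my time working at 英语流利说 (the content below does not reflect everything; some sensitive content has been removed).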
MIT License

Kafka connectivity #59

Open moooofly opened 5 years ago

moooofly commented 5 years ago

Kafka connectivity

A large percentage of the questions and issues opened against the kafka-docker project concern configuring Kafka networking. These often stem from not understanding Kafka's requirements or not understanding Docker networking.

This page aims to explain some of the basic requirements to help people resolve any initial issues they may encounter. It is not an exhaustive guide to configuring Kafka and/or Docker.

Note: the most frequently raised questions about this project concern Kafka networking, which in turn depends on both the Kafka requirements and Docker networking.

Kafka requirements

There are three main requirements for configuring Kafka networking.

Note: Kafka networking has three main requirements.

The following diagram represents the different communication paths:

image

This means for a complete working Kafka setup, each one of the components must be able to route to the other and have accessible ports.
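One way to verify that a component can route to another and that the port is accessible is a plain TCP connection attempt. The following is a minimal sketch, not part of kafka-docker: the helper names `can_connect` and `check_paths` are hypothetical, and a successful TCP connect only shows that the port is reachable, not that Kafka or ZooKeeper is healthy behind it.

```python
import socket
from typing import Dict, Tuple


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_paths(paths: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Check each named communication path and report whether it is reachable."""
    return {name: can_connect(host, port) for name, (host, port) in paths.items()}
```

For example, `check_paths({"client -> zookeeper": ("localhost", 2181), "client -> broker": ("localhost", 9092)})` could be run from the machine hosting a consumer or producer; the addresses here are placeholders for whatever your setup uses.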

Kafka in Docker

First, let's take a look at the simplest use-case - running a single Kafka broker.

Simplest mode: running a single Kafka broker

# KAFKA_ADVERTISED_HOST_NAME: localhost
# ZOOKEEPER_CONNECT: zookeeper:2181
docker-compose -f docker-compose-single-broker.yml up -d

The contents of docker-compose-single-broker.yml are as follows:

version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100
      KAFKA_CREATE_TOPICS: "test:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

image

Two containers are created, and both are attached to the kafka-docker_default bridge network created by docker-compose. Here you can see that both container ports (2181 and 9092) are mapped directly to the host's network interface.

NOTE: When using docker-compose, all containers are generally attached to the same network. I say 'generally' because you can configure multiple networks, but we're sticking to the simple use-case here.

In this setup all Kafka requirements are met:

$ docker ps
CONTAINER ID        IMAGE                      PORTS                                                NAMES
1bf0d78a352c        wurstmeister/zookeeper     22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp   kafkadocker_zookeeper_1
d0c932301db5        kafkadocker_kafka          0.0.0.0:9092->9092/tcp                               kafkadocker_kafka_1
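To read the port mappings out of a listing like the one above programmatically, the PORTS column can be parsed. This is a minimal sketch under stated assumptions: `published_ports` is a hypothetical helper, it handles only IPv4 entries of the form `ip:host_port->container_port/proto` as shown in the output, and it ignores port ranges and IPv6 entries.

```python
import re
from typing import Dict

# Matches published entries such as "0.0.0.0:2181->2181/tcp".
_PUBLISHED = re.compile(
    r"(?P<host_ip>[\d.]+):(?P<host_port>\d+)->(?P<container_port>\d+)/(?P<proto>\w+)"
)


def published_ports(ports_column: str) -> Dict[int, int]:
    """Map container port -> host port for each published entry in a PORTS cell."""
    mapping: Dict[int, int] = {}
    for entry in ports_column.split(","):
        match = _PUBLISHED.fullmatch(entry.strip())
        if match:
            # Entries like "22/tcp" are exposed but not published, so they are skipped.
            mapping[int(match["container_port"])] = int(match["host_port"])
    return mapping
```

Applied to the zookeeper row, only 2181 appears in the result, because 22, 2888, and 3888 are exposed inside the Docker network but not published on the host.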

Next, let's look at the common use-case - running multiple Kafka brokers.

Common mode: running multiple Kafka brokers

# KAFKA_ADVERTISED_HOST_NAME: 192.168.1.2
# ZOOKEEPER_CONNECT: zookeeper:2181
docker-compose up -d zookeeper
docker-compose scale kafka=2

The contents of docker-compose.yml are as follows:

version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    build: .
    ports:
      - "9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: 192.168.99.100
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

image

Here, the key differences are two settings in the docker-compose.yml file: ports and the KAFKA_ADVERTISED_HOST_NAME environment variable.

Because each port can be bound only once on a single interface, we can no longer publish the broker port (9092) to a fixed host port: the second broker would fail to bind it. Instead, we specify only the container port.

ports:
  - "9092"

This results in Docker binding an ephemeral port on the host interface to the container port.
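The port-binding constraint and the ephemeral-port behaviour can be reproduced with plain sockets. The sketch below is only an analogy, not Docker's actual allocation code: the helper names are hypothetical, and it asks the operating system for a free port by binding to port 0, whereas Docker picks the host port itself.

```python
import socket
from typing import Tuple


def bind_listener(host: str, port: int) -> socket.socket:
    """Bind and listen on host:port; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock


def second_bind_fails(host: str = "127.0.0.1") -> bool:
    """Return True if binding an already-bound port on the same interface fails."""
    first = bind_listener(host, 0)
    port = first.getsockname()[1]
    try:
        second = bind_listener(host, port)
    except OSError:
        return True
    else:
        second.close()
        return False
    finally:
        first.close()


def two_ephemeral_ports(host: str = "127.0.0.1") -> Tuple[int, int]:
    """Bind two listeners to port 0 at once and return the distinct ports chosen."""
    a = bind_listener(host, 0)
    b = bind_listener(host, 0)
    try:
        return a.getsockname()[1], b.getsockname()[1]
    finally:
        a.close()
        b.close()
```

The first function mirrors two brokers both asking for host port 9092; the second mirrors each broker receiving its own host port.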

$ docker ps
CONTAINER ID        IMAGE                      PORTS                                                NAMES
2c3fe5e651bf        kafkadocker_kafka          0.0.0.0:32000->9092/tcp                              kafkadocker_kafka_2
4e22d3d715ec        kafkadocker_kafka          0.0.0.0:32001->9092/tcp                              kafkadocker_kafka_1
bfb5545efe6b        wurstmeister/zookeeper     22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp   kafkadocker_zookeeper_1
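Given the listing above, a client outside Docker would reach each broker through the host address and that broker's published port. The following sketch builds a comma-separated broker list of the kind Kafka clients accept as bootstrap servers. It assumes each broker advertises the host address together with its own published port; `bootstrap_servers` is a hypothetical helper, and the address 192.168.99.100 is taken from the compose file in this page.

```python
from typing import Iterable, List


def advertised_endpoints(host_address: str, host_ports: Iterable[int]) -> List[str]:
    """Return one "host:port" endpoint per published broker port."""
    return [f"{host_address}:{port}" for port in host_ports]


def bootstrap_servers(host_address: str, host_ports: Iterable[int]) -> str:
    """Join the endpoints into a comma-separated list, sorted by port."""
    return ",".join(advertised_endpoints(host_address, sorted(host_ports)))


if __name__ == "__main__":
    # Host ports 32000 and 32001 come from the `docker ps` output above.
    print(bootstrap_servers("192.168.99.100", [32001, 32000]))
```

Because the host ports are ephemeral, they may change whenever the containers are recreated, so a list like this has to be rebuilt from the current mappings rather than hard-coded.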

This should explain why we had to use the host's interface address in the KAFKA_ADVERTISED_HOST_NAME environment variable. Let's cement this understanding by adding consumers / producers to the diagram:

image

This explains why all Kafka requirements are met: