nebula-orchestrator / worker

The worker node manager container which manages nebula nodes
https://nebula-orchestrator.github.io/
GNU General Public License v3.0

Facing issue in creating reporting kafka connection object #62

Closed Sharvin26 closed 5 years ago

Sharvin26 commented 5 years ago

Hello

I have configured the Nebula worker on the Raspberry Pi.

I am using an Ubuntu 18.04 VPS on which I am running the following containers =>

  1. Nebula Manager
  2. Mongo
  3. Nebula Reporter
  4. kafka
  5. zookeeper

Expected/Wanted Behavior

The worker sends the current state to a Kafka cluster after every sync with the manager. The reporter component will pull from Kafka and populate the state data into the backend DB. Then the manager can query the new state data from the backend DB to let the admin know the state of managed devices.

Actual Behavior

When the Nebula worker downloads and updates the application and then reports the state using Kafka, I get the following error =>

1. Logs of Nebula worker =>

Recreating 3cc087462a4c_worker ... done
Attaching to worker
worker    | reading config variables
worker    | reading config variables
worker    | /usr/local/lib/python3.7/site-packages/parse_it/file/file_reader.py:55: UserWarning: config_folder_location does not exist, only envvars & cli args will be used
worker    |   warnings.warn("config_folder_location does not exist, only envvars & cli args will be used")
worker    | logging in to registry
worker    | {'IdentityToken': '', 'Status': 'Login Succeeded'}
worker    | checking nebula manager connection
worker    | nebula manager connection ok
worker    | stopping all preexisting nebula managed app containers in order to ensure a clean slate on boot
worker    | stopping container e02f34d03c880a47cc33cb51b5e84578f7e387f305e618843a9c8e229ccd93cb
worker    | removing container e02f34d03c880a47cc33cb51b5e84578f7e387f305e618843a9c8e229ccd93cb
worker    | initial start of example app
worker    | pulling image <my_registry_url>/flask:latest
worker    | <my_registry_url>/flask
worker    | {
worker    |     "status": "Pulling from flask",
worker    |     "id": "latest"
worker    | }
worker    | {
worker    |     "status": "Digest: sha256:6f51939e6d3dff3fdfebdeb639ddad00c3671d5f0b241666c9e140d1bfa7883c"
worker    | }
worker    | {
worker    |     "status": "Status: Image is up to date for <my_registry_url>/flask:latest"
worker    | }
worker    | creating container example-1
worker    | successfully created container example-1
worker    | starting container example-1
worker    | completed initial start of example app
worker    | starting work container health checking thread
worker    | creating reporting kafka connection object
worker    | failed creating reporting kafka connection object - exiting
worker    | NoBrokersAvailable

2. Logs of Nebula Reporter =>

reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
NoBrokersAvailable
failed creating reporting kafka connection object - exiting
reading config variables
creating reporting kafka connection object
opened MongoDB connection
starting to digest messages from kafka

Note: As the Kafka logs are too big I haven't added them, but if you need them for debugging I can attach the log file.

Steps to Reproduce the Problem

  1. Configured the worker on the Raspberry Pi using the docker-compose.yml and the custom Docker build described in the Specifications section.

  2. Configured the Manager, Reporter, Mongo, Kafka, and Zookeeper on Ubuntu 18.04 using the docker-compose.yml in the Specifications section.

  3. Configured a private Docker registry for hosting the update releases and images.

Specifications

1. Nebula worker =>

On the worker side, as I am using a Raspberry Pi, I had to build the image on the Pi itself and start the container. To achieve this I did the following =>

Dir Structure =>

- Nebula worker 
       - Dockerfile
       - docker-compose.yml
       - worker/ (directory containing all the source code)

Dockerfile =>

# it's official so I'm using it + alpine so damn small
FROM python:3.7.2-alpine3.9

# copy the codebase
COPY . /worker

# install required packages - requires build-base due to psutil GCC compiler requirements
RUN apk add --no-cache build-base python3-dev linux-headers
RUN pip install -r /worker/worker/requirements.txt

# set python to be unbuffered
ENV PYTHONUNBUFFERED=1

# run the worker-manager
WORKDIR /worker
CMD [ "python", "worker/worker.py" ]

docker-compose.yml for worker =>

version: '3'
services:
  worker:
    container_name: worker
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    restart: unless-stopped
    hostname: worker
    environment:
      REGISTRY_HOST: <my_registry_url>
      REGISTRY_AUTH_USER: <my_registry_user>
      REGISTRY_AUTH_PASSWORD: <my_registry_password>
      MAX_RESTART_WAIT_IN_SECONDS: 0
      NEBULA_MANAGER_AUTH_USER: nebula
      NEBULA_MANAGER_AUTH_PASSWORD: nebula
      NEBULA_MANAGER_HOST: <my_vps_url>
      NEBULA_MANAGER_PORT: 80
      NEBULA_MANAGER_PROTOCOL: http
      NEBULA_MANAGER_CHECK_IN_TIME: 5
      DEVICE_GROUP: example
      KAFKA_BOOTSTRAP_SERVERS: <my_vps_url>:9092
      KAFKA_TOPIC: nebula-reports
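
Building and starting the worker on the Pi then presumably follows the standard compose workflow (the exact commands are assumed, as they weren't quoted above):

# build the image locally on the Pi and start the worker in the background
docker-compose up -d --build
# follow the worker logs
docker logs -f worker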

2. Nebula Manager, Mongo, Kafka, Reporter and Zookeeper =>

docker-compose.yml =>

version: '3'
services:
  mongo:
    container_name: mongo
    hostname: mongo
    image: mongo:4.0.1
    ports:
      - "27017:27017"
    restart: unless-stopped
    environment:
      MONGO_INITDB_ROOT_USERNAME: nebula
      MONGO_INITDB_ROOT_PASSWORD: nebula

  manager:
    container_name: manager
    hostname: manager
    depends_on:
      - mongo
    image: nebulaorchestrator/manager
    ports:
      - "80:80"
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin
      SCHEMA_NAME: nebula
      BASIC_AUTH_PASSWORD: nebula
      BASIC_AUTH_USER: nebula
      AUTH_TOKEN: nebula

  zookeeper:
    container_name: zookeeper
    hostname: zookeeper
    image: zookeeper:3.4.13
    ports:
      - 2181:2181
    restart: unless-stopped
    environment:
      ZOO_MY_ID: 1

  kafka:
    container_name: kafka
    hostname: kafka
    image: confluentinc/cp-kafka:5.1.2
    ports:
      - 9092:9092
    restart: unless-stopped
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  reporter:
    container_name: reporter
    hostname: reporter
    depends_on:
      - mongo
      - kafka
    image: nebulaorchestrator/reporter
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin
      SCHEMA_NAME: nebula
      BASIC_AUTH_PASSWORD: nebula
      BASIC_AUTH_USER: nebula
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      KAFKA_TOPIC: nebula-reports

naorlivne commented 5 years ago

I think the issue might be KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092 in your Kafka config. This is the address Kafka advertises for client connections, and you told it to advertise an address that's only resolvable by the containers that are part of the same docker-compose network. Please try KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://<my_vps_url>:9092 instead.
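
For reference, a common cp-kafka pattern that keeps both in-compose clients (the reporter) and external clients (the Pi worker) working is a dual-listener setup. A minimal sketch, assuming a second port 29092 is published and reachable on the host:

  kafka:
    container_name: kafka
    hostname: kafka
    image: confluentinc/cp-kafka:5.1.2
    ports:
      - 9092:9092
      - 29092:29092
    restart: unless-stopped
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # INTERNAL is advertised to containers on the compose network,
      # EXTERNAL is advertised to clients outside the host (the Pi worker)
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:29092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka:9092,EXTERNAL://<my_vps_url>:29092
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

With this, the reporter keeps KAFKA_BOOTSTRAP_SERVERS: kafka:9092 and the worker would point at <my_vps_url>:29092.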

Sharvin26 commented 5 years ago

Hello @naorlivne

Thanks for the response.

I changed KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092 to KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://<my_vps_url>:9092.

But I am still facing the same issue.

docker-compose.yml for Nebula Manager, Mongo, Kafka, Reporter and Zookeeper =>

version: '3'
services:
  mongo:
    container_name: mongo
    hostname: mongo
    image: mongo:4.0.1
    ports:
      - "27017:27017"
    restart: unless-stopped
    environment:
      MONGO_INITDB_ROOT_USERNAME: nebula
      MONGO_INITDB_ROOT_PASSWORD: nebula

  manager:
    container_name: manager
    hostname: manager
    depends_on:
      - mongo
    image: nebulaorchestrator/manager
    ports:
      - "80:80"
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin
      SCHEMA_NAME: nebula
      BASIC_AUTH_PASSWORD: nebula
      BASIC_AUTH_USER: nebula
      AUTH_TOKEN: nebula

  zookeeper:
    container_name: zookeeper
    hostname: zookeeper
    image: zookeeper:3.4.13
    ports:
      - 2181:2181
    restart: unless-stopped
    environment:
      ZOO_MY_ID: 1

  kafka:
    container_name: kafka
    hostname: kafka
    image: confluentinc/cp-kafka:5.1.2
    ports:
      - 9092:9092
    restart: unless-stopped
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://<my_vps_url>:9092
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

  reporter:
    container_name: reporter
    hostname: reporter
    depends_on:
      - mongo
      - kafka
    image: nebulaorchestrator/reporter
    restart: unless-stopped
    environment:
      MONGO_URL: mongodb://nebula:nebula@mongo:27017/nebula?authSource=admin
      SCHEMA_NAME: nebula
      BASIC_AUTH_PASSWORD: nebula
      BASIC_AUTH_USER: nebula
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092
      KAFKA_TOPIC: nebula-reports

Note: <my_vps_url> is the IP address of my server.

naorlivne commented 5 years ago

Still think that the issue is Kafka connectivity - can you confirm that your Kafka starts & stays up? (run docker ps after a few minutes and check that its uptime is at least longer than a minute)
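
For example:

docker ps --filter name=kafka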

Assuming Kafka stays online, can you try connecting to it from the worker using the Kafka CLI to test? My feeling is that when you set PLAINTEXT://<my_vps_url>:9092 the Kafka container doesn't know what that IP is, so it can't bind to it (as the IP belongs to the host and not the container).
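
One quick way to test this from the worker's side is a metadata request; a sketch assuming kafkacat is installed on the Pi (any Kafka CLI client works similarly):

# request cluster metadata from the broker; this fails fast with a
# connection error if the advertised listener is unreachable
kafkacat -b <my_vps_url>:9092 -L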

Another option you can test is going back to the original compose file (before my suggestion to move from kafka to the host public IP) & using extra_hosts in your worker configuration to add a hosts entry that points kafka at your Kafka host's public IP; assuming this works, we will know for sure that my guess about the cause of the issue is correct.
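
A minimal sketch of that worker-side override (extra_hosts needs an IP address; <my_vps_public_ip> is a placeholder for the server's IP, not a value from this thread):

  worker:
    extra_hosts:
      # resolve the in-compose hostname to the VPS's public IP
      - "kafka:<my_vps_public_ip>"
    environment:
      KAFKA_BOOTSTRAP_SERVERS: kafka:9092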

Reading the provided reporter logs, I can see it tried connecting a few times and failed until it finally succeeded. This makes sense when you consider the time it takes Kafka to boot up, and it also means that connecting to Kafka from inside the compose network works; the issue seems to be limited to external access.

Sharvin26 commented 5 years ago

Thanks for the response. It was my silly mistake: I forgot to open the port on the VPS machine, which is why the issue occurred.
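
For anyone hitting the same problem: on an Ubuntu VPS with ufw as the firewall, opening the broker port would look like the sketch below (providers with their own firewall consoles need an equivalent rule there):

# allow external workers to reach the Kafka listener, then verify
sudo ufw allow 9092/tcp
sudo ufw status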