scality / backbeat

Zenko Backbeat is the core engine for asynchronous replication, optimized for queuing metadata updates and dispatching work to long-running tasks in the background.
https://www.zenko.io
Apache License 2.0

Guidance on setting up a docker-compose file for enabling backbeat + cloudserver for bucket notifications #2560

Open claudiu-muresan-pfa opened 1 month ago

claudiu-muresan-pfa commented 1 month ago

I tried with this docker-compose file:

version: '3.4'
volumes:
  cloudserver-data:
    name: cloudserver-data
  cloudserver-metadata:
    name: cloudserver-metadata
  mongo-data:
    name: mongo-data
services:
  cloudserver:
    image: zenko/cloudserver:latest
    container_name: cloudserver
    platform: linux/amd64
    environment:
      - S3BACKEND=file
      - REMOTE_MANAGEMENT_DISABLE=1
      - ENDPOINT=localhost
      - SCALITY_ACCESS_KEY_ID=accessKey1
      - SCALITY_SECRET_ACCESS_KEY=verySecretKey1
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGODB_HOSTS=mongo:27017
      - MONGODB_DATABASE=zenko
      - CRR_METRICS_HOST=backbeat
      - CRR_METRICS_PORT=8000
      - LOG_LEVEL=trace
    ports:
      - "8111:8000"
    volumes:
      - cloudserver-data:/usr/src/app/localData
      - cloudserver-metadata:/usr/src/app/localMetadata
    depends_on:
      - redis
      - mongo
    restart: always

  backbeat:
    image: zenko/backbeat:latest
    container_name: backbeat
    platform: linux/amd64
    environment:
      - CLOUDSERVER_HOST=cloudserver
      - CLOUDSERVER_PORT=8000
      - QUEUE_POPULATOR_ENABLED=true
      - BACKBEAT_REPLICATION_METRICS=true
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - MONGODB_HOSTS=mongo:27017
      - MONGODB_DATABASE=zenko
      - ZOOKEEPER_AUTO_CREATE_NAMESPACE=true
      - ZOOKEEPER_CONNECTION_STRING=zookeeper:2181
      - KAFKA_HOSTS=kafka:9092
      - LOG_LEVEL=trace
    depends_on:
      - cloudserver
      - redis
      - mongo
      - zookeeper
      - kafka
    restart: always

  redis:
    image: redis:alpine
    container_name: redis-server
    ports:
      - "6379:6379"
    restart: always

  mongo:
    image: mongo:4.2
    container_name: mongo
    ports:
      - "27117:27017"
    volumes:
      - mongo-data:/data/db
    restart: always

  zookeeper:
    image: wurstmeister/zookeeper:3.4.6
    container_name: zookeeper
    ports:
      - "2181:2181"
    restart: always

  kafka:
    image: wurstmeister/kafka:latest
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    depends_on:
      - zookeeper
    restart: always

but the backbeat service keeps restarting. Could you please provide a minimal docker-compose file that I can use as a starting point, so that backbeat gets "connected" to cloudserver and I can use the bucket notifications extension?

claudiu-muresan-pfa commented 1 month ago

Never mind. I was using the Docker Hub images, which are quite old, so I switched to the GHCR images. My docker-compose file now looks like this:

volumes:
  cloudserver-data:
    name: cloudserver-data
  cloudserver-metadata:
    name: cloudserver-metadata
  mongo-data:
    name: mongo-data

services:
  cloudserver:
    image: ghcr.io/scality/cloudserver:0b29e7914f2061a3ae44b3cd3480ea53aa9bab97
    container_name: cloudserver
    platform: linux/amd64
    environment:
      - S3BACKEND=file
      - REMOTE_MANAGEMENT_DISABLE=1
      - ENDPOINT=localhost
      - SCALITY_ACCESS_KEY_ID=accessKey1
      - SCALITY_SECRET_ACCESS_KEY=verySecretKey1
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - CRR_METRICS_HOST=backbeat
      - CRR_METRICS_PORT=8901
      - LOG_LEVEL=trace
    ports:
      - "8111:8000"
    volumes:
      - cloudserver-data:/usr/src/app/localData
      - cloudserver-metadata:/usr/src/app/localMetadata
    depends_on:
      - redis
    restart: always

  backbeat:
    image: ghcr.io/scality/backbeat:0a32e66773027f99b5057bf9159bf1ec1f5fc3c0
    container_name: backbeat
    platform: linux/amd64
    volumes:
      - ./backbeat:/usr/src/app/conf
    command: ["yarn", "start"]
    ports:
      - "8901:8901"
    environment:
      - REMOTE_MANAGEMENT_DISABLE=1
    depends_on:
      - redis
      - mongo
      - zookeeper
      - kafka
      - cloudserver
    restart: always

  redis:
    image: redis:alpine
    container_name: redis-server
    ports:
      - "6379:6379"
    restart: always

  mongo:
    image: mongo:4.2
    container_name: mongo
    ports:
      - "27117:27017"
    volumes:
      - mongo-data:/data/db
    command: >
      mongod --replSet rs0 --bind_ip_all
    restart: always

  zookeeper:
    image: wurstmeister/zookeeper:latest
    container_name: zookeeper
    ports:
      - "2181:2181"
    restart: always

  kafka:
    image: wurstmeister/kafka:latest
    container_name: kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    depends_on:
      - zookeeper
    restart: always

with the backbeat config files as:

I fail to understand how the bucket notifications are sent from CloudServer to Backbeat. Can I get some hints here? Thanks in advance.

BourgoisMickael commented 1 week ago

CloudServer records the actions performed on buckets and objects in the metadata backend, in an s3-recordLog table / collection.

Backbeat will connect to that metadata server to read that recordLog and send bucket notifications.

Right now, your cloudserver uses the file metadata backend, not MongoDB. You can add the env var S3METADATA=mongodb to the cloudserver service so that it writes its metadata into MongoDB.
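As a rough sketch of that change, the fragment below shows only the parts of the cloudserver service that would differ from the second compose file in this thread. S3METADATA=mongodb is the variable suggested above. MONGODB_HOSTS and MONGODB_DATABASE are copied from the first compose file in this thread; whether the GHCR image honors them, and which database name Backbeat expects, are assumptions to check against the image you run.

```yaml
services:
  cloudserver:
    environment:
      - S3BACKEND=file              # unchanged: object data stays on the local file backend
      - S3METADATA=mongodb          # suggested above: write metadata (and the record log) to MongoDB
      - MONGODB_HOSTS=mongo:27017   # assumption: reused from the first compose file in this thread
      - MONGODB_DATABASE=zenko      # assumption: reused from the first compose file; should match Backbeat's config
    depends_on:
      - redis
      - mongo                       # cloudserver now needs MongoDB to be up before it starts
```

The mongo service in the second compose file is started with --replSet rs0, so the replica set has to be initiated before either CloudServer or Backbeat can use it.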

Then, in the Backbeat config, queuePopulator.logSource should name the configuration for that metadata backend. Since you don't use bucketd, it should point to either the metadata file server or MongoDB.