SigNoz / signoz

SigNoz is an open-source observability platform native to OpenTelemetry with logs, traces and metrics in a single application. An open-source alternative to DataDog, NewRelic, etc. 🔥 🖥. 👉 Open source Application Performance Monitoring (APM) & Observability tool
https://signoz.io
Other
19.23k stars 1.27k forks source link

[HELP] Signoz docker standalone `logspout` container keeps crashing on my ec2 #4494

Open amanangira opened 9 months ago

amanangira commented 9 months ago

Hello, I have setup self-hosted Signoz backend on an AWS ec2 instance. I did follow the official documentation of Signoz to set it up and additionally commented out the sample application from the docker-compose as mentioned here.

Screenshot 2024-02-05 at 4 06 24 PM

The logspout container seems to be restarting repeatedly and I could find the following error by running docker container logs command

2024/02/05 10:34:33 !! lookup otel-collector on 127.0.0.11:53: no such host
2024/02/05 10:35:34 # logspout v3.2.14 by gliderlabs
2024/02/05 10:35:34 # adapters: multiline raw syslog tcp tls udp
2024/02/05 10:35:34 # options : 
2024/02/05 10:35:34 persist:/mnt/routes

I did look into the docker-compose.yaml but couldn't find any references to port 53.

Here's my docker-compose.yaml

version: "2.4"

x-clickhouse-defaults: &clickhouse-defaults
  restart: on-failure
  # addding non LTS version due to this fix https://github.com/ClickHouse/ClickHouse/commit/32caf8716352f45c1b617274c7508c86b7d1afab
  image: clickhouse/clickhouse-server:23.11.1-alpine
  tty: true
  depends_on:
    - zookeeper-1
    # - zookeeper-2
    # - zookeeper-3
  logging:
    options:
      max-size: 50m
      max-file: "3"
  healthcheck:
    # "clickhouse", "client", "-u ${CLICKHOUSE_USER}", "--password ${CLICKHOUSE_PASSWORD}", "-q 'SELECT 1'"
    test:
      [
        "CMD",
        "wget",
        "--spider",
        "-q",
        "localhost:8123/ping"
      ]
    interval: 30s
    timeout: 5s
    retries: 3
  ulimits:
    nproc: 65535
    nofile:
      soft: 262144
      hard: 262144

x-db-depend: &db-depend
  depends_on:
    clickhouse:
      condition: service_healthy
    otel-collector-migrator:
      condition: service_completed_successfully
    # clickhouse-2:
    #   condition: service_healthy
    # clickhouse-3:
    #   condition: service_healthy

services:

  zookeeper-1:
    image: bitnami/zookeeper:3.7.1
    container_name: signoz-zookeeper-1
    hostname: zookeeper-1
    user: root
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
    volumes:
      - ./data/zookeeper-1:/bitnami/zookeeper
    environment:
      - ZOO_SERVER_ID=1
      # - ZOO_SERVERS=0.0.0.0:2888:3888,zookeeper-2:2888:3888,zookeeper-3:2888:3888
      - ALLOW_ANONYMOUS_LOGIN=yes
      - ZOO_AUTOPURGE_INTERVAL=1

  clickhouse:
    <<: *clickhouse-defaults
    container_name: signoz-clickhouse
    hostname: clickhouse
    ports:
      - "9000:9000"
      - "8123:8123"
      - "9181:9181"
    volumes:
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.xml
      - ./clickhouse-users.xml:/etc/clickhouse-server/users.xml
      - ./custom-function.xml:/etc/clickhouse-server/custom-function.xml
      - ./clickhouse-cluster.xml:/etc/clickhouse-server/config.d/cluster.xml
      # - ./clickhouse-storage.xml:/etc/clickhouse-server/config.d/storage.xml
      - ./data/clickhouse/:/var/lib/clickhouse/
      - ./user_scripts:/var/lib/clickhouse/user_scripts/

  alertmanager:
    image: signoz/alertmanager:${ALERTMANAGER_TAG:-0.23.4}
    container_name: signoz-alertmanager
    volumes:
      - ./data/alertmanager:/data
    depends_on:
      query-service:
        condition: service_healthy
    restart: on-failure
    command:
      - --queryService.url=http://query-service:8085
      - --storage.path=/data

  # Notes for Maintainers/Contributors who will change Line Numbers of Frontend & Query-Section. Please Update Line Numbers in `./scripts/commentLinesForSetup.sh` & `./CONTRIBUTING.md`

  query-service:
    image: signoz/query-service:${DOCKER_TAG:-0.36.2}
    container_name: signoz-query-service
    command:
      [
        "-config=/root/config/prometheus.yml",
        "--prefer-delta=true"
      ]
    # ports:
    #   - "6060:6060"     # pprof port
    #   - "8080:8080"     # query-service port
    volumes:
      - ./prometheus.yml:/root/config/prometheus.yml
      - ../dashboards:/root/config/dashboards
      - ./data/signoz/:/var/lib/signoz/
    environment:
      - ClickHouseUrl=tcp://clickhouse:9000/?database=signoz_traces
      - ALERTMANAGER_API_PREFIX=http://alertmanager:9093/api/
      - SIGNOZ_LOCAL_DB_PATH=/var/lib/signoz/signoz.db
      - DASHBOARDS_PATH=/root/config/dashboards
      - STORAGE=clickhouse
      - GODEBUG=netdns=go
      - TELEMETRY_ENABLED=true
      - DEPLOYMENT_TYPE=docker-standalone-amd
    restart: on-failure
    healthcheck:
      test:
        [
          "CMD",
          "wget",
          "--spider",
          "-q",
          "localhost:8080/api/v1/health"
        ]
      interval: 30s
      timeout: 5s
      retries: 3
    <<: *db-depend

  frontend:
    image: signoz/frontend:${DOCKER_TAG:-0.36.2}
    container_name: signoz-frontend
    restart: on-failure
    depends_on:
      - alertmanager
      - query-service
    ports:
      - "3301:3301"
    volumes:
      - ../common/nginx-config.conf:/etc/nginx/conf.d/default.conf

  otel-collector-migrator:
    image: signoz/signoz-schema-migrator:${OTELCOL_TAG:-0.88.6}
    container_name: otel-migrator
    command:
      - "--dsn=tcp://clickhouse:9000"
    depends_on:
      clickhouse:
        condition: service_healthy
      # clickhouse-2:
      #   condition: service_healthy
      # clickhouse-3:
      #   condition: service_healthy

  otel-collector:
    image: signoz/signoz-otel-collector:${OTELCOL_TAG:-0.88.6}
    container_name: signoz-otel-collector
    command:
      [
        "--config=/etc/otel-collector-config.yaml",
        "--manager-config=/etc/manager-config.yaml",
        "--copy-path=/var/tmp/collector-config.yaml",
        "--feature-gates=-pkg.translator.prometheus.NormalizeName"
      ]
    user: root # required for reading docker container logs
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
      - ./otel-collector-opamp-config.yaml:/etc/manager-config.yaml
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    environment:
      - OTEL_RESOURCE_ATTRIBUTES=host.name=signoz-host,os.type=linux
      - DOCKER_MULTI_NODE_CLUSTER=false
      - LOW_CARDINAL_EXCEPTION_GROUPING=false
    ports:
      # - "1777:1777"     # pprof extension
      - "4317:4317" # OTLP gRPC receiver
      - "4318:4318" # OTLP HTTP receiver
      # - "8888:8888"     # OtelCollector internal metrics
      # - "8889:8889"     # signoz spanmetrics exposed by the agent
      # - "9411:9411"     # Zipkin port
      # - "13133:13133"   # health check extension
      # - "14250:14250"   # Jaeger gRPC
      # - "14268:14268"   # Jaeger thrift HTTP
      # - "55678:55678"   # OpenCensus receiver
      # - "55679:55679"   # zPages extension
    restart: on-failure
    depends_on:
      clickhouse:
        condition: service_healthy
      otel-collector-migrator:
        condition: service_completed_successfully
      query-service:
        condition: service_healthy

  otel-collector-metrics:
    image: signoz/signoz-otel-collector:${OTELCOL_TAG:-0.88.6}
    container_name: signoz-otel-collector-metrics
    command:
      [
        "--config=/etc/otel-collector-metrics-config.yaml",
        "--feature-gates=-pkg.translator.prometheus.NormalizeName"
      ]
    volumes:
      - ./otel-collector-metrics-config.yaml:/etc/otel-collector-metrics-config.yaml
    # ports:
    #   - "1777:1777"     # pprof extension
    #   - "8888:8888"     # OtelCollector internal metrics
    #   - "13133:13133"   # Health check extension
    #   - "55679:55679"   # zPages extension
    restart: on-failure
    <<: *db-depend

  logspout:
    image: "gliderlabs/logspout:v3.2.14"
    container_name: signoz-logspout
    volumes:
      - /etc/hostname:/etc/host_hostname:ro
      - /var/run/docker.sock:/var/run/docker.sock
    command: syslog+tcp://otel-collector:2255
    depends_on:
      - otel-collector
    restart: on-failure

My local Signoz repository points to the following commit. Since I am unsure on how to find the version and which service (multiple services are defined in the docker-compose.yaml) version would be relevant here.

Thank you for the help.

welcome[bot] commented 9 months ago

Thanks for opening this issue. A team member should give feedback soon. In the meantime, feel free to check out the contributing guidelines.

prashant-shahi commented 9 months ago

From the shared screenshot, I see there is no container for signoz-otel-collector. Do check why that container is not up.