elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

docker autodiscover hints & networks #8498

Open mathroc opened 6 years ago

mathroc commented 6 years ago

Hi,

I'm setting up monitoring for my servers with metricbeat and I have an issue. I'm using docker autodiscover hints to find containers to monitor, e.g.:


  traefik:
    image: traefik:1.6.6
    # some config unrelated to this issue skipped
    deploy:
      mode: global
    labels:
      co.elastic.metrics/module: traefik
      co.elastic.metrics/metricsets: health
      co.elastic.metrics/hosts: '$${data.host}:8080'
      co.elastic.metrics/period: 10s
    networks:
      - web
      - default
      - metrics

networks:
  default:
  web:
    name: web
    driver: overlay
    attachable: true
  metrics:
    name: metrics
    external: true

Here, metrics is the network metricbeat is attached to. The traefik container has multiple IPs, one for each network it's attached to, so ${data.host} gets one of those IPs (first? random?), but it's not the correct one. That means metricbeat is not able to query the traefik API, and I get this message instead of the metrics in the metricbeat index:

"failed to sample health: error making http request: Get http://10.0.24.31:8080/health: dial tcp 10.0.24.31:8080: connect: network is unreachable"

10.0.24.31 is traefik's IP, but on one of the other networks, which metricbeat is not attached to.

Traefik itself deals with this problem in its own autodiscovery by supporting a label that specifies which network to use (in the example above, I use the "web" network to let traefik communicate with other containers, so I add the label traefik.docker.network: web).

I think metricbeat should support a similar label. Here I would like to add co.elastic.metrics/network: metrics

An alternative would be to include all IPs in the autodiscover event data, so that I could use this label for the hosts instead: co.elastic.metrics/hosts: '${data.networks.metrics.ip}:8080'
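
To make the label idea concrete, here is a minimal Go sketch of the lookup it implies: given the container's IP on each network and the network named in the label, return the matching IP. The function name hostForNetwork and the addresses are hypothetical, and this is not how Beats currently behaves.

```go
package main

import "fmt"

// hostForNetwork returns the container IP on the requested network.
// ips maps a Docker network name to the container's IP on that network.
// ok is false when the network is unknown, so the caller can fall back
// to whatever address it would have used otherwise.
// (Hypothetical helper, not part of Beats.)
func hostForNetwork(ips map[string]string, network string) (ip string, ok bool) {
	ip, ok = ips[network]
	if !ok || ip == "" {
		return "", false
	}
	return ip, true
}

func main() {
	// Hypothetical addresses for a container attached to two networks.
	ips := map[string]string{"web": "10.0.1.5", "metrics": "10.0.2.5"}

	// "metrics" stands in for the value of a co.elastic.metrics/network label.
	if ip, ok := hostForNetwork(ips, "metrics"); ok {
		fmt.Printf("%s:8080\n", ip)
	} else {
		fmt.Println("no address on the requested network")
	}
}
```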

exekias commented 6 years ago

Thank you for opening this @mathroc. I agree we can and should provide more tools to let you choose the network. We can probably provide a networks.yournetwork.host field in the autodiscover event.
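
A rough sketch, in Go, of how per-network fields of that shape might be built from a map of network name to IP. The function name networkFields and the addresses are made up for illustration; this is not the actual Beats code.

```go
package main

import (
	"fmt"
	"sort"
)

// networkFields builds flat keys of the form networks.<name>.host from a
// map of Docker network name to container IP on that network.
// (Hypothetical helper; the key layout follows the field name suggested above.)
func networkFields(ips map[string]string) map[string]string {
	fields := make(map[string]string, len(ips))
	for name, ip := range ips {
		fields["networks."+name+".host"] = ip
	}
	return fields
}

func main() {
	// Hypothetical addresses for a container attached to two networks.
	fields := networkFields(map[string]string{"web": "10.0.1.5", "metrics": "10.0.2.5"})

	// Print in sorted key order, because Go map iteration order is not stable.
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Println(k, "=", fields[k])
	}
}
```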

mathroc commented 6 years ago

If I have the time to install golang and set up a dev environment for this I'll give it a try, but if someone already knows how to write Go, please don't wait for me :angel:

Looks like it should be done here in emitContainer: https://github.com/elastic/beats/blob/master/libbeat/autodiscover/providers/docker/docker.go#L118

dzurikmiroslav commented 4 years ago

Maybe this problem could be solved by not using the container's IP address (${data.host}) directly, and instead adding a variable ${data.hostname} for the network alias/hostname, for example da098a22d0b6, which is the same on all networks.

elasticmachine commented 4 years ago

Pinging @elastic/integrations-platforms (Team:Platforms)

jan-demsar commented 4 years ago

Same problem in Filebeat.

We use the docker autodiscover feature for redis:

filebeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true
      hints.default_config.enabled: true
      templates:
        - condition:
            contains:
              docker.container.image: redis
          config:
            - module: redis
              log:
                enabled: true
                input:
                  type: docker
                  containers.ids:
                    - "${data.docker.container.id}"
              slowlog:
                enabled: true
                var.hosts: ["${data.host}:${data.port}"]

The log part is working great, but slowlog keeps returning errors like:

Error running input: error receiving slowlog data: dial tcp 10.0.11.84:6379: connect: connection timed out

When I check the redis container, I can see it has a different IP for each network:

backend: 10.0.11.84
elastic: 10.0.2.196

The Filebeat container is connected to the elastic network only and cannot reach 10.0.11.84 because that IP is on a different network.

So a solution like @mathroc suggested would be great!
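
One way such a selection could work without an extra label, sketched in Go: pick the container's IP on a network the beat itself is attached to. The function name sharedHost is hypothetical and nothing here is existing Beats behaviour. The network names and addresses are the ones from the report above.

```go
package main

import (
	"fmt"
	"sort"
)

// sharedHost returns the container's IP on the first network (in name order)
// that the beat is also attached to. ok is false when no network is shared.
// (Hypothetical helper, not part of Beats.)
func sharedHost(containerIPs map[string]string, beatNetworks []string) (ip string, ok bool) {
	attached := make(map[string]bool, len(beatNetworks))
	for _, n := range beatNetworks {
		attached[n] = true
	}

	// Sort the network names so the choice is deterministic when the
	// container and the beat share more than one network.
	names := make([]string, 0, len(containerIPs))
	for n := range containerIPs {
		names = append(names, n)
	}
	sort.Strings(names)

	for _, n := range names {
		if attached[n] {
			return containerIPs[n], true
		}
	}
	return "", false
}

func main() {
	redis := map[string]string{"backend": "10.0.11.84", "elastic": "10.0.2.196"}

	// Filebeat is attached to the "elastic" network only.
	if ip, ok := sharedHost(redis, []string{"elastic"}); ok {
		fmt.Println(ip + ":6379")
	} else {
		fmt.Println("no shared network")
	}
}
```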

sderungs commented 3 years ago

We're having a similar issue with metricbeat.autodiscover in 7.12 in Docker swarm mode. Metricbeat should be used to monitor the Elasticsearch cluster according to the docs.

While trying to set this up with Autodiscover, as we have multiple Elasticsearch nodes, we ran into the same issue. The IP reported by Autodiscover for the Elasticsearch instances is sometimes the one from the docker built-in "ingress" network instead of the overlay network we use for the cluster containers. Elasticsearch containers are part of the ingress network because they expose ports (9200), while Metricbeat of course doesn't. Metricbeat is therefore not part of the ingress network and cannot reach the corresponding Elasticsearch node on that particular IP (while the IP of the same Elasticsearch container in the overlay network is ping-able from within the Metricbeat container).

botelastic[bot] commented 2 years ago

Hi! We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:. Thank you for your contribution!

Filipe777 commented 2 years ago

Still active. Using version 6.8.23. Fix is much appreciated.

devamanv commented 1 year ago

Hi @mathroc. Thanks for logging the issue. I understand the problem, but I couldn't reproduce the exact issue. However, I got a Connection Refused error after following the part of the docker-compose file you shared in the description (my version is below). I am not sure if it's the same issue. In this regard I might need these:

services:
  traefik:
    image: traefik:v1.6
    container_name: "traefik"
    # some config unrelated to this issue skipped
    deploy:
      mode: global
    labels:
      co.elastic.metrics/module: traefik
      co.elastic.metrics/metricsets: health
      co.elastic.metrics/hosts: '{$$data.host}'
      co.elastic.metrics/period: 10s
    networks:
      - web
      - default
      - metrics
    command: 
      - "--api" 
      - "--debug"
      - "--docker"
      - "--entryPoints='http address::80"
    ports:
      # The HTTP port
      - "80:80"
      # The Web UI (enabled by --api.insecure=true)
      - "8080:8080"
    volumes:
      # So that Traefik can listen to the Docker events
      - /var/run/docker.sock:/var/run/docker.sock
  whoami:
    # A container that exposes an API to show its IP address
    image: traefik/whoami
    labels:
      - "traefik.http.routers.whoami.rule=Host(`whoami.docker.localhost`)" 
      - "traefik.http.routers.whoami.entrypoints=web"   

networks:
  default:
  web:
    name: web
    driver: bridge
    attachable: true
  metrics:
    name: metrics
    # If set to true, external specifies that this network’s lifecycle is maintained outside of that of the application.
    external: true