albertodonato / query-exporter

Export Prometheus metrics from SQL queries
GNU General Public License v3.0

Error: error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address #155

Closed: connorourke closed this issue 1 year ago

connorourke commented 1 year ago

Describe the bug

I'm running query-exporter within a Docker container. When I try to start it with the example config.yaml I get the following error:

unhandled exception during asyncio.run() shutdown
task: <Task finished name='Task-1' coro=<_run_app() done, defined at /query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py:289> exception=OSError(99, "error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address")>
Traceback (most recent call last):
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 516, in run_app
    loop.run_until_complete(main_task)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 415, in _run_app
    await site.start()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web_runner.py", line 121, in start
    self._server = await loop.create_server(
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1519, in create_server
    raise OSError(err.errno, 'error while attempting '
OSError: [Errno 99] error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address
Traceback (most recent call last):
  File "/query_exporter/.venv/bin/query-exporter", line 8, in <module>
Exception in thread Thread-1 (thread_fn):
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
Exception in thread Thread-2 (thread_fn):
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    sys.exit(script())
  File "/query_exporter/.venv/lib/python3.10/site-packages/toolrack/script.py", line 110, in __call__
    self.run()
  File "/usr/local/lib/python3.10/threading.py", line 953, in run
    self.run()
    self._target(*self._args, **self._kwargs)
  File "/query_exporter/.venv/lib/python3.10/site-packages/sqlalchemy_aio/asyncio.py", line 53, in thread_fn
  File "/usr/local/lib/python3.10/threading.py", line 953, in run
    return self.main(parsed_args) or 0
  File "/query_exporter/.venv/lib/python3.10/site-packages/prometheus_aioexporter/script.py", line 143, in main
    self._loop.call_soon_threadsafe(request.set_finished)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 798, in call_soon_threadsafe
    self._target(*self._args, **self._kwargs)
  File "/query_exporter/.venv/lib/python3.10/site-packages/sqlalchemy_aio/asyncio.py", line 53, in thread_fn
    exporter.run()
  File "/query_exporter/.venv/lib/python3.10/site-packages/prometheus_aioexporter/web.py", line 71, in run
    run_app(
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 516, in run_app
    loop.run_until_complete(main_task)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web.py", line 415, in _run_app
    self._loop.call_soon_threadsafe(request.set_finished)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 798, in call_soon_threadsafe
    self._check_closed()
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
    self._check_closed()
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    await site.start()
  File "/query_exporter/.venv/lib/python3.10/site-packages/aiohttp/web_runner.py", line 121, in start
    self._server = await loop.create_server(
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 1519, in create_server
    raise OSError(err.errno, 'error while attempting '
OSError: [Errno 99] error while attempting to bind on address ('::1', 9560, 0, 0): cannot assign requested address
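(Errno 99 is EADDRNOTAVAIL: the exporter is trying to bind the IPv6 loopback ::1, which has no address when the container network has IPv6 disabled. A minimal sketch, not part of the original report, that reproduces the same failure:)

import socket

# In a container whose network has IPv6 disabled, binding the IPv6
# loopback fails with OSError: [Errno 99] Cannot assign requested
# address -- the same error query-exporter hits via aiohttp.
try:
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    sock.bind(("::1", 9560))
except OSError as exc:
    print(f"IPv6 bind failed: {exc}")
else:
    print("bound ::1:9560")
    sock.close()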

The port 9560 is exposed (I am running query-exporter in the slurmctld container); the compose file is:

version: "3.3"

services:
  prometheus:
    container_name: prometheus
    image: prom/prometheus
    restart: always
    volumes:
      - ./prometheus:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - 9090:9090
    networks:
      - prom_app_net

  grafana:
    container_name: grafana
    image: grafana/grafana
    user: '472'
    restart: always
    environment:
      GF_INSTALL_PLUGINS: 'grafana-clock-panel,grafana-simple-json-datasource'
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning/:/etc/grafana/provisioning/
      - './grafana/grafana.ini:/etc/grafana/grafana.ini'
    env_file:
      - ./grafana/.env_grafana
    ports:
      - 3000:3000
    depends_on:
      - prometheus
    networks:
      - prom_app_net

  mysql:
    image: mariadb:10.10
    hostname: mysql
    container_name: mysql
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: "yes"
      MYSQL_DATABASE: slurm_acct_db
      MYSQL_USER: slurm
      MYSQL_PASSWORD: password
    volumes:
      - var_lib_mysql:/var/lib/mysql

  slurmdbd:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    build:
      context: .
      args:
        SLURM_TAG: ${SLURM_TAG:-slurm-21-08-6-1}
    command: ["slurmdbd"]
    container_name: slurmdbd
    hostname: slurmdbd
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6819"
    ports:
      - "6819:6819"
    depends_on:
      - mysql
    privileged: true
    cgroup: host

  slurmctld:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmctld"] 
    container_name: slurmctld
    hostname: slurmctld
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - etc_prometheus:/etc/prometheus
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    expose:
      - "6817"
      - "8080"
      - "8081"
      - "9560"
    ports:
      - 8080:8080
      - 8081:8081
      - 9560:9560
    depends_on:
      - "slurmdbd"
    privileged: true
    cgroup: host

  c1:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c1
    container_name: c1
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host  

  c2:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c2
    container_name: c2
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
      - "22"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host

volumes:
  etc_munge:
  etc_slurm:
  slurm_jobdir:
  var_lib_mysql:
  var_log_slurm:
  grafana_data:
  prometheus_data:
  cgroups: 
  etc_prometheus:

networks:
  prom_app_net:
    driver: bridge

Installation details

RUN wget https://www.python.org/ftp/python/3.10.12/Python-3.10.12.tgz \
    && tar -xzvf Python-3.10.12.tgz \
    && pushd Python-3.10.12 \
    && ./configure --enable-optimizations --prefix=/usr/local \
    && make \
    && make altinstall
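(The traceback above shows query-exporter running from a virtualenv at /query_exporter/.venv; the pip install step itself isn't included in the snippet. A sketch of what it might look like, with the paths taken from the traceback:)

# Hypothetical install step (not shown in the original Dockerfile excerpt):
# create the virtualenv seen in the traceback and install query-exporter.
RUN /usr/local/bin/python3.10 -m venv /query_exporter/.venv \
    && /query_exporter/.venv/bin/pip install query-exporter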


To Reproduce

If possible, please provide detailed steps to reproduce the behavior:

1. Config file content (redacted of secrets if needed)

databases:
  db1:
    dsn: sqlite://
    connect-sql:

metrics:
  metric1:
    type: gauge
    description: A sample gauge
  metric2:
    type: summary
    description: A sample summary
    labels: [l1, l2]
    expiration: 24h
  metric3:
    type: histogram
    description: A sample histogram
    buckets: [10, 20, 50, 100, 1000]
  metric4:
    type: enum
    description: A sample enum
    states: [foo, bar, baz]

queries:
  query1:
    interval: 5
    databases: [db1]
    metrics: [metric1]
    sql: SELECT random() / 1000000000000000 AS metric1
  query2:
    interval: 20
    timeout: 0.5
    databases: [db1, db2]
    metrics: [metric2, metric3]
    sql: |
      SELECT
        abs(random() / 1000000000000000) AS metric2,
        abs(random() / 10000000000000000) AS metric3,
        "value1" AS l1,
        "value2" AS l2
  query3:
    schedule: "*/5 * * * *"
    databases: [db2]
    metrics: [metric4]
    sql: |
      SELECT value FROM (
        SELECT "foo" AS metric4 UNION
        SELECT "bar" AS metric3 UNION
        SELECT "baz" AS metric4
      )
      ORDER BY random()
      LIMIT 1


2. Ran query-exporter with the following command line ...
    `query-exporter config.yaml`

3. Got the error when running the command above
connorourke commented 1 year ago

If I expose and pass through a different port with query-exporter -p <port_no_here> config.yaml, I get the same thing.

connorourke commented 1 year ago

The port itself does work, though. Running the following in the container:

from aiohttp import web

async def handle(request):
    name = request.match_info.get('name', "World!")
    text = "Hello, " + name
    print('received request, replying with "{}".'.format(text))
    return web.Response(text=text)

app = web.Application()
app.router.add_get('/', handle)
app.router.add_get('/{name}', handle)

web.run_app(app, port=8082)

It works fine and serves up the hello world page.
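(The difference from query-exporter is the bind address: with no host argument, aiohttp listens on all interfaces rather than only on localhost. Forcing the same snippet onto the IPv6 loopback should reproduce the failure; this is a quick check, not part of the original comment:)

# Same app as above, but bound explicitly to the IPv6 loopback; in a
# container without IPv6 this fails with errno 99, like query-exporter.
web.run_app(app, host="::1", port=8082)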

albertodonato commented 1 year ago

What's the exact command line being used for query-exporter? It seems it's trying to bind IPv6, which is possibly not enabled in Docker.

connorourke commented 1 year ago

query-exporter config.yaml -p 8082

(when I am using port 8082 rather than 9560, but the error is the same either way)

albertodonato commented 1 year ago

By default, that will try to bind IPv6 as well, but that's likely not enabled in Docker.

For this reason, the Docker image only binds IPv4:

https://github.com/albertodonato/query-exporter/blob/b85be7b403634f84e6c8daf31fc2fd1f3785a85c/Dockerfile#L61
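(In other words, the containerized entrypoint passes an explicit IPv4 bind address, roughly along these lines; the config path here is illustrative and the exact line is in the linked Dockerfile:)

query-exporter /path/to/config.yaml -H 0.0.0.0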

connorourke commented 1 year ago

OK, thanks. I'm not using the query-exporter image; I'm installing query-exporter in the container with pip. Is there a way to enable it to bind IPv6 then?

albertodonato commented 1 year ago

https://docs.docker.com/config/daemon/ipv6/

connorourke commented 1 year ago

I have enabled IPv6, following the instructions by putting:

{
  "experimental": true,
  "ip6tables": true
}

into my docker/daemon.json and restarting. But the exporter still doesn't work. My compose file with the IPv6 network looks like:

version: "3.3"

services:
  prometheus:
    container_name: prometheus
    image: prom/prometheus
    restart: always
    volumes:
      - ./prometheus:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - 9090:9090
    networks:
      - prom_app_net

  grafana:
    container_name: grafana
    image: grafana/grafana
    user: '472'
    restart: always
    environment:
      GF_INSTALL_PLUGINS: 'grafana-clock-panel,grafana-simple-json-datasource'
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/provisioning/:/etc/grafana/provisioning/
      - './grafana/grafana.ini:/etc/grafana/grafana.ini'
    env_file:
      - ./grafana/.env_grafana
    ports:
      - 3000:3000
    depends_on:
      - prometheus
    networks:
      - prom_app_net

  mysql:
    image: mariadb:10.10
    hostname: mysql
    container_name: mysql
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: "yes"
      MYSQL_DATABASE: slurm_acct_db
      MYSQL_USER: slurm
      MYSQL_PASSWORD: password
    volumes:
      - var_lib_mysql:/var/lib/mysql
    networks:
      - slurm
#    network_mode: host

  slurmdbd:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    build:
      context: .
      args:
        SLURM_TAG: ${SLURM_TAG:-slurm-21-08-6-1}
    command: ["slurmdbd"]
    container_name: slurmdbd
    hostname: slurmdbd
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6819"
    ports:
      - "6819:6819"
    depends_on:
      - mysql
    privileged: true
    cgroup: host
    networks:
      - slurm
    #network_mode: host

  slurmctld:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmctld"] 
    container_name: slurmctld
    hostname: slurmctld
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - etc_prometheus:/etc/prometheus
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    expose:
      - "6817"
      - "8080"
      - "8081"
      - "8082/tcp"
    ports:
      - 8080:8080
      - 8081:8081
      - 8082:8082/tcp
    depends_on:
      - "slurmdbd"
    privileged: true
    cgroup: host

    #network_mode: host
    networks:
      - slurm

  c1:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c1
    container_name: c1
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host 
    #network_mode: host
    networks:
      - slurm

  c2:
    image: prom-slurm-cluster:${IMAGE_TAG:-21.08.6}
    command: ["slurmd"]
    hostname: c2
    container_name: c2
    volumes:
      - etc_munge:/etc/munge
      - etc_slurm:/etc/slurm
      - slurm_jobdir:/data
      - var_log_slurm:/var/log/slurm
      - cgroups:/sys/fs/cgroup:ro
    expose:
      - "6818"
      - "22"
    depends_on:
      - "slurmctld"
    privileged: true
    cgroup: host
    networks:
      - slurm
    #network_mode: host

volumes:
  etc_munge:
  etc_slurm:
  slurm_jobdir:
  var_lib_mysql:
  var_log_slurm:
  grafana_data:
  prometheus_data:
  cgroups: 
  etc_prometheus:

networks:
  prom_app_net:
  slurm:
    enable_ipv6: true
    ipam:
      config: 
        - subnet: 2001:0DB8::/112

When I run query-exporter like query-exporter config.yaml -p 8082 with the following config file:

databases:
  db1:
    dsn: sqlite:////test.db
    connect-sql:
      - PRAGMA application_id = 123
      - PRAGMA auto_vacuum = 1
    labels:
      region: us1
      app: app1

metrics:
  metric1:
    type: gauge
    description: A sample gauge

queries:
  query1:
    interval: 5
    databases: [db1]
    metrics: [metric1]
    sql: SELECT random() / 1000000000000000 AS metric1

It doesn't work:

[Screenshot 2023-07-10 at 10:19:47: the request to the exporter fails]

But if I run the following simple exporter on the same port:

from prometheus_client import start_http_server, Summary
import random
import time

# Create a metric to track time spent and requests made.
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

# Decorate function with metric.
@REQUEST_TIME.time()
def process_request(t):
    """A dummy function that takes some time."""
    time.sleep(t)

if __name__ == '__main__':
    # Start up the server to expose the metrics.
    start_http_server(8082)
    # Generate some requests.
    while True:
        process_request(random.random())

It works fine:

[Screenshot 2023-07-10 at 10:22:55: the metrics page is served successfully]

Is this a bug, or is there some other step that I am missing?

Thanks!

connorourke commented 1 year ago

Ah, OK. If I start it with query-exporter config.yaml -p 8082 -H 0.0.0.0, it works.
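(For completeness, with the exporter bound to 0.0.0.0, a Prometheus scrape job along these lines should reach it over the compose network; the job name and target are illustrative and assume the Prometheus and slurmctld containers share a network:)

scrape_configs:
  - job_name: query-exporter
    static_configs:
      - targets: ["slurmctld:9560"]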