ClusterLabs / ha_cluster_exporter

Prometheus exporter for Pacemaker based Linux HA clusters
Apache License 2.0

Dockerized ha_cluster_exporter: Missing Metrics and Collector Initialization Issues #241

Closed ThauMish closed 6 months ago

ThauMish commented 10 months ago

Hello,

I'm experiencing issues with missing metrics in ha_cluster_exporter running in a Docker environment. Here are the specifics:

Environment:

Issue Description: I am not receiving all expected metrics from ha_cluster_exporter. This particularly affects the 'pacemaker' and 'drbd' collectors.

ts=2023-11-16T11:25:01.755Z caller=instrumented_collector.go:60 level=warn msg="pacemaker collector scrape failed" err="crm_mon parser error: error while executing crm_mon: exit status 127"
ts=2023-11-16T11:25:01.755Z caller=instrumented_collector.go:60 level=warn msg="corosync collector scrape failed" err="corosync parser error: could not parse node id in corosync-quorumtool output: could not find Node ID line"
ts=2023-11-16T11:25:01.755Z caller=instrumented_collector.go:60 level=warn msg="drbd collector scrape failed" err="drbdsetup command failed: exit status 1"
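For context on the first error: exit status 127 is the conventional shell code for "command not found", which suggests crm_mon is not actually executable inside the container even though the binary itself is bind-mounted (for example because its shared libraries or interpreter are missing). A quick way to confirm what 127 means:

```shell
# POSIX shells exit with 127 when a command cannot be found,
# matching the "exit status 127" in the crm_mon error above.
# The command name below is deliberately nonexistent.
sh -c 'some_nonexistent_command_xyz' 2>/dev/null
echo "exit: $?"
```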

When I curl the metrics endpoint, I don't get all the metrics, only these:

# HELP ha_cluster_scrape_duration_seconds Duration of a collector scrape.
# TYPE ha_cluster_scrape_duration_seconds gauge
ha_cluster_scrape_duration_seconds{collector="corosync"} 0.00285278
ha_cluster_scrape_duration_seconds{collector="drbd"} 0.003006552
ha_cluster_scrape_duration_seconds{collector="pacemaker"} 0.002103107
# HELP ha_cluster_scrape_success Whether a collector succeeded.
# TYPE ha_cluster_scrape_success gauge
ha_cluster_scrape_success{collector="corosync"} 0
ha_cluster_scrape_success{collector="drbd"} 0
ha_cluster_scrape_success{collector="pacemaker"} 0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.03
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 13
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.2402688e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.70013385889e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 7.39438592e+08
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 1
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
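As an aside, the `ha_cluster_scrape_success` gauge shown above (all three collectors at 0) is exactly the signal one could alert on to catch this kind of silent collector failure. A hypothetical Prometheus alerting rule sketch (group and alert names are illustrative, not part of this project):

```yaml
groups:
  - name: ha_cluster_exporter        # illustrative group name
    rules:
      - alert: HaClusterCollectorFailing
        expr: ha_cluster_scrape_success == 0
        for: 5m
        annotations:
          summary: "Collector {{ $labels.collector }} on {{ $labels.instance }} is failing"
```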

Docker Compose Configuration:

services:

  ha_cluster_exporter:
    <<: *default-service
    image: "${HA_CLUSTER_EXPORTER_IMAGE_NAME}:${HA_CLUSTER_EXPORTER_IMAGE_VERSION}"
    container_name: ha_cluster_exporter
    hostname: ha_cluster_exporter
    labels:
      <<: *labels
      application: ha_cluster_exporter
    env_file:
      - 'env/ha_cluster_exporter.env'
    volumes:
      - ./conf/ha_cluster_exporter.yaml:/etc/ha_cluster_exporter.yaml
      - ./conf/ha_cluster_exporter.web.yaml:/etc/ha_cluster_exporter.web.yaml
      - /usr/sbin/crm_mon:/usr/sbin/crm_mon
      - /usr/sbin/cibadmin:/usr/sbin/cibadmin
      - /usr/sbin/corosync-cfgtool:/usr/sbin/corosync-cfgtool
      - /usr/sbin/corosync-quorumtool:/usr/sbin/corosync-quorumtool
      - /usr/sbin/sbd:/usr/sbin/sbd
      - /sbin/drbdsetup:/sbin/drbdsetup
    ports:
      - "${LISTEN_IPV4_ADDRESS}:${HA_CLUSTER_EXPORTER_PORT}:9664"
    networks:
      ha_cluster_exporter-fr:
        ipv4_address: ${HA_CLUSTER_EXPORTER_IPV4_NETWORK:-10.0.0}.11
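One likely problem with bind-mounting only the executables: dynamically linked binaries also need their shared libraries at load time, and none of those are mounted by the compose file above. Inspecting any mounted tool with ldd on the host shows what else would have to exist inside the container (the path below uses /bin/sh as a stand-in, since crm_mon may not be installed on an arbitrary machine):

```shell
# ldd prints the shared libraries a dynamically linked binary
# resolves at load time; each of these would also need to be
# present inside the container for the binary to run there.
ldd /bin/sh
```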

Could you please assist me in resolving this issue? Are there specific configurations or steps that I'm missing to ensure all metrics are collected and reported correctly in a Docker environment?

Thank you for your support.

stefanotorresi commented 6 months ago

The exporter is not really meant to run inside a container, since it depends on various system tools that operate at the host level and with root privileges. Mounting the executables in the container with volumes is far from enough to ensure all those tools work correctly. I'm afraid I will have to close this because trying to run the linux HA stack inside a container would be a huge endeavor and it is completely outside the scope of the project.