pvaldria closed this issue 3 years ago
More details:
Feb 7 12:15:10 nfs-server-2 systemd: Started Prometheus exporter for Pacemaker HA clusters metrics.
Feb 7 12:15:10 nfs-server-2 ha_cluster_exporter: time="2021-02-07T12:15:10Z" level=warning msg="Config File \"ha_cluster_exporter\" Not Found in \"[/ /.config /etc /usr/etc]\""
Feb 7 12:15:10 nfs-server-2 ha_cluster_exporter: time="2021-02-07T12:15:10Z" level=info msg="Default config values will be used"
Feb 7 12:15:10 nfs-server-2 ha_cluster_exporter: time="2021-02-07T12:15:10Z" level=warning msg="Registration failure: could not initialize 'drbd' collector: '/sbin/drbdsetup' does not exist"
Feb 7 12:15:10 nfs-server-2 ha_cluster_exporter: time="2021-02-07T12:15:10Z" level=info msg="'pacemaker' collector registered."
Feb 7 12:15:10 nfs-server-2 ha_cluster_exporter: time="2021-02-07T12:15:10Z" level=info msg="'corosync' collector registered."
Feb 7 12:15:10 nfs-server-2 ha_cluster_exporter: time="2021-02-07T12:15:10Z" level=info msg="'sbd' collector registered."
Feb 7 12:15:10 nfs-server-2 ha_cluster_exporter: time="2021-02-07T12:15:10Z" level=info msg="Serving metrics on 0.0.0.0:9664"
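To see at a glance which collectors came up, the startup messages can be summarized with a short script. This is only a sketch: the regular expressions are inferred from the message wording in the log above, not from the exporter's source, and `summarize_collectors` is a hypothetical helper name.

```python
import re

# Patterns inferred from the log messages above (an assumption, not an official format).
REGISTERED = re.compile(r"'\s*(\w+)'\s+collector registered")
FAILED = re.compile(r"could not initialize '\s*(\w+)'\s+collector: (.*)")


def summarize_collectors(log_lines):
    """Return (registered collector names, {failed collector: reason})."""
    registered, failed = [], {}
    for line in log_lines:
        match = FAILED.search(line)
        if match:
            # Drop the closing quote of the msg="..." field.
            failed[match.group(1)] = match.group(2).rstrip('"')
            continue
        match = REGISTERED.search(line)
        if match:
            registered.append(match.group(1))
    return registered, failed


# Messages copied from the log excerpt above (syslog prefix omitted).
logs = """\
time="2021-02-07T12:15:10Z" level=warning msg="Registration failure: could not initialize 'drbd' collector: '/sbin/drbdsetup' does not exist"
time="2021-02-07T12:15:10Z" level=info msg="'pacemaker' collector registered."
time="2021-02-07T12:15:10Z" level=info msg="'corosync' collector registered."
time="2021-02-07T12:15:10Z" level=info msg="'sbd' collector registered."
""".splitlines()

registered, failed = summarize_collectors(logs)
print("registered:", registered)
print("failed:", failed)
```

Here only the `drbd` collector fails, because `/sbin/drbdsetup` is not installed; the pacemaker, corosync, and sbd collectors are registered.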
Hi @pvaldria,
Systemd and other OS-related metrics are provided by the Prometheus node_exporter. Do you have it running on your system too? The ha_cluster_exporter is specialized to provide metrics for ClusterLabs components.
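One way to check which exporter supplies which series is to group the scraped metric names by prefix. The sketch below assumes the usual naming (`node_` for node_exporter, `node_systemd_` for its systemd collector, `ha_cluster_` for ha_cluster_exporter); the sample text is made up for illustration and `count_by_exporter` is a hypothetical helper.

```python
from collections import Counter

# Prefix -> source. Order matters: the more specific prefix is checked first.
PREFIXES = {
    "node_systemd_": "node_exporter (systemd collector)",
    "node_": "node_exporter",
    "ha_cluster_": "ha_cluster_exporter",
}


def count_by_exporter(exposition_text):
    """Count sample lines per exporter, based on the metric-name prefix."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        for prefix, source in PREFIXES.items():
            if line.startswith(prefix):
                counts[source] += 1
                break
        else:
            counts["other"] += 1
    return counts


# Illustrative sample only; real label sets may differ.
sample = """\
# HELP node_systemd_unit_state Systemd unit
node_systemd_unit_state{name="pacemaker.service",state="active"} 1
node_load1 0.12
ha_cluster_pacemaker_nodes{node="nfs-server-2",status="online"} 1
"""

for source, count in count_by_exporter(sample).items():
    print(f"{source}: {count}")
```

If the output of port 9100 contains no `node_systemd_` series at all, the systemd panels in Grafana have nothing to show, regardless of what ha_cluster_exporter reports on port 9664.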
Yes, I have the node_exporter service running on all nodes. On the Grafana/Prometheus server, I have the following configuration.
The last job below (- job_name: 'nfs-ha-cluster') is the one I added to display HA details; it lists both port 9664 and port 9100.
`
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  external_labels:
    region: region
    monitor: infrastructure
    replica: nfs-20210209-0706

alerting:
  alertmanagers:

rule_files:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
  - job_name: 'nfs_servers'
    scrape_interval: 5s
    static_configs:
  - job_name: 'quorum'
    scrape_interval: 5s
    static_configs:
  - job_name: 'nfs-ha-cluster'
    scrape_interval: 5s
    static_configs:
`
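For reference, a job that scrapes both exporters needs one target per host and port. The following sketch renders such a job entry; the host name `nfs-server-1` is a placeholder (only `nfs-server-2` appears in the logs), and `build_scrape_job` is a hypothetical helper, not part of Prometheus.

```python
def build_scrape_job(job_name, hosts, ports, scrape_interval="5s"):
    """Render one scrape_configs entry with a target for every host/port pair."""
    targets = [f"{host}:{port}" for host in hosts for port in ports]
    quoted = ", ".join(f"'{target}'" for target in targets)
    return "\n".join([
        f"  - job_name: '{job_name}'",
        f"    scrape_interval: {scrape_interval}",
        "    static_configs:",
        f"      - targets: [{quoted}]",
    ])


# 9664 is the ha_cluster_exporter port from the log; 9100 is node_exporter's.
job_yaml = build_scrape_job(
    "nfs-ha-cluster",
    hosts=["nfs-server-1", "nfs-server-2"],  # nfs-server-1 is a placeholder
    ports=[9664, 9100],
)
print(job_yaml)
```

Whether both ports belong in a single job or in separate jobs is a design choice; keeping them together means every series from either exporter carries the same `job` label.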
@pvaldria, another check: did you enable the systemd collector in your node_exporter configuration? It is disabled by default.
https://github.com/prometheus/node_exporter#disabled-by-default
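node_exporter enables the collector with the `--collector.systemd` flag and disables it with `--no-collector.systemd`. A quick way to check a given command line, for example the `ExecStart` value of the node_exporter unit, might look like the sketch below. The binary path is a placeholder and `systemd_collector_enabled` is a hypothetical helper.

```python
import shlex


def systemd_collector_enabled(exec_start):
    """Return True if the command line turns the systemd collector on.

    Assumption: a command line that also passes --no-collector.systemd
    is treated as disabled, whatever the order of the flags.
    """
    args = shlex.split(exec_start)
    return "--collector.systemd" in args and "--no-collector.systemd" not in args


# /usr/local/bin/node_exporter is a placeholder path.
print(systemd_collector_enabled("/usr/local/bin/node_exporter"))
print(systemd_collector_enabled("/usr/local/bin/node_exporter --collector.systemd"))
```

The first call reports `False` because the flag is absent, which matches the default behavior described in the linked README section.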
Node attributes and systemd units data are not showing up in Grafana. Please see the attached screenshot. Is this a known issue? I have a Pacemaker/Corosync NFS HA cluster (active/passive) with a shared disk, using the SBD fencing agent.
I had to add the below to /etc/prometheus/prometheus.yml `
I installed ha_cluster_exporter using the steps below.
`
yum install -y -q git
curl -O https://objectstorage.us-ashburn-1.oraclecloud.com/xxxxxxxxxxxxxxx/go1.15.8.linux-amd64.tar.gz
tar -C /usr/local -xzf go1.15.8.linux-amd64.tar.gz

echo '
export GOROOT="/usr/local/go"
export GOBIN="$HOME/go/bin"
mkdir -p $GOBIN
export PATH=$PATH:$GOROOT/bin:$GOBIN
' >> .bashrc
source ~/.bashrc
go version
go get github.com/golang/mock/mockgen

git clone https://github.com/ClusterLabs/ha_cluster_exporter
cd ha_cluster_exporter
make
make install

cat > /lib/systemd/system/ha_cluster_exporter.service << EOF
[Unit]
Description=Prometheus exporter for Pacemaker HA clusters metrics
After=network.target

[Service]
Type=simple
Restart=always
ExecStart=/root/go/bin/ha_cluster_exporter
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target
EOF

systemctl start ha_cluster_exporter
`