prometheus-community / postgres_exporter

A PostgreSQL metric exporter for Prometheus
Apache License 2.0

Move pg_stat_replication queries to collector package #966

Open SuperQ opened 11 months ago

SuperQ commented 11 months ago

Proposal

There are existing queries for pg_stat_replication in cmd/postgres_exporter/queries.go. These metrics should be migrated to the collector package.

ARPABoy commented 11 months ago

This affects replication monitoring: if only pg_up and pg_replication_lag_seconds are monitored on the Secondary servers and there is a network outage between the Primary and Secondary servers, the Secondary servers fall behind without any alarm being triggered.

It seems more reasonable to monitor replication using data from the Primary server:

SELECT COUNT(*) FROM pg_stat_replication WHERE client_addr='SLAVE_IP' AND state = 'streaming';

If it returns 0, the Secondary server is unreachable.

SELECT COALESCE(EXTRACT(EPOCH FROM replay_lag)::bigint, 0) AS replay_lag FROM pg_stat_replication WHERE client_addr='SLAVE_IP';

If it returns more than X seconds, the Secondary server is lagging.