prometheus-community / postgres_exporter

A PostgreSQL metric exporter for Prometheus
Apache License 2.0

Exporter does not collect metrics from multiple databases #999

Open freneticpony1995 opened 10 months ago

freneticpony1995 commented 10 months ago

What did you do?

I started postgres-exporter in my k8s cluster using the official Prometheus Helm chart v5.3.0. I pass connection information to the exporter through the environment variable "DATA_SOURCE_NAME", which contains connection strings for 23 hosts running PostgreSQL 13.

Example: DATA_SOURCE_NAME=postgresql://username1:PASSWD1@server1:5432/service1db?sslmode=disable,postgresql://username1:PASSWD2@server2:5432/service2db?sslmode=disable ... etc

What did you expect to see?

I expected to see the "datname" and "server" labels on all metrics on the /metrics endpoint, and all enabled metrics for all databases.

Example:

# HELP pg_stat_database_conflicts_confl_tablespace Number of queries in this database that have been canceled due to dropped tablespaces
# TYPE pg_stat_database_conflicts_confl_tablespace counter
pg_stat_database_conflicts_confl_tablespace{datid="20146",datname="service1db",server="FQDN1:5432"} 0
pg_stat_database_conflicts_confl_tablespace{datid="20366",datname="service2db",server="FQDN2:5432"} 0

What did you see instead? Under which circumstances?

Some metrics are exported for only one database (the first in the DATA_SOURCE_NAME list), either without the "server" label or without any labels at all.

Example:

# HELP pg_stat_database_deadlocks Number of deadlocks detected in this database
# TYPE pg_stat_database_deadlocks counter
pg_stat_database_deadlocks{datid="20146",datname="service1db"} 0

List of metrics with this issue: pg_database_size_bytes, pg_locks_count, pg_stat_database_*, pg_stat_user_tables_*, pg_statio_user_tables_*

List of metrics without this issue: pg_stat_database_conflicts_confl_*, pg_stat_archiver_*, pg_stat_activity_*, pg_settings_*, pg_stat_bgwriter_*

The pg_stat_bgwriter_* metrics are served without any labels at all.

Example:

# HELP pg_stat_bgwriter_buffers_alloc_total Number of buffers allocated
# TYPE pg_stat_bgwriter_buffers_alloc_total counter
pg_stat_bgwriter_buffers_alloc_total 1.42966019e+08
# HELP pg_stat_bgwriter_buffers_backend_fsync_total Number of times a backend had to execute its own fsync call (normally the background writer handles those even when the backend does its own write)
# TYPE pg_stat_bgwriter_buffers_backend_fsync_total counter
pg_stat_bgwriter_buffers_backend_fsync_total 0
# HELP pg_stat_bgwriter_buffers_backend_total Number of buffers written directly by a backend
# TYPE pg_stat_bgwriter_buffers_backend_total counter
pg_stat_bgwriter_buffers_backend_total 5.7748244e+07
# HELP pg_stat_bgwriter_buffers_checkpoint_total Number of buffers written during checkpoints
# TYPE pg_stat_bgwriter_buffers_checkpoint_total counter
pg_stat_bgwriter_buffers_checkpoint_total 2.70184913e+08
# HELP pg_stat_bgwriter_buffers_clean_total Number of buffers written by the background writer
# TYPE pg_stat_bgwriter_buffers_clean_total counter
pg_stat_bgwriter_buffers_clean_total 1.977864e+06
# HELP pg_stat_bgwriter_checkpoint_sync_time_total Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk, in milliseconds
# TYPE pg_stat_bgwriter_checkpoint_sync_time_total counter
pg_stat_bgwriter_checkpoint_sync_time_total 226029
# HELP pg_stat_bgwriter_checkpoint_write_time_total Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in milliseconds
# TYPE pg_stat_bgwriter_checkpoint_write_time_total counter
pg_stat_bgwriter_checkpoint_write_time_total 7.680013381e+09
# HELP pg_stat_bgwriter_checkpoints_req_total Number of requested checkpoints that have been performed
# TYPE pg_stat_bgwriter_checkpoints_req_total counter
pg_stat_bgwriter_checkpoints_req_total 780
# HELP pg_stat_bgwriter_checkpoints_timed_total Number of scheduled checkpoints that have been performed
# TYPE pg_stat_bgwriter_checkpoints_timed_total counter
pg_stat_bgwriter_checkpoints_timed_total 198817
# HELP pg_stat_bgwriter_maxwritten_clean_total Number of times the background writer stopped a cleaning scan because it had written too many buffers
# TYPE pg_stat_bgwriter_maxwritten_clean_total counter
pg_stat_bgwriter_maxwritten_clean_total 8284
# HELP pg_stat_bgwriter_stats_reset_total Time at which these statistics were last reset
# TYPE pg_stat_bgwriter_stats_reset_total counter
pg_stat_bgwriter_stats_reset_total 1.646306272e+09

Environment

k8s cluster

postgres_exporter, version 0.15.0 (branch: HEAD, revision: 68c176b8833b7580bf847cecf60f8e0ad5923f9a)
  build user:       root@88f74f2c2888
  build date:       20231027-14:38:56
  go version:       go1.21.3
  platform:         linux/amd64
  tags:             unknown
'--config.file=/etc/postgres_exporter.yml'
'--web.listen-address=:9187'
'--log.level=debug'
'--collector.long_running_transactions'
'--collector.stat_activity_autovacuum'
'--collector.process_idle'
'--no-collector.wal'

hajali-amine commented 9 months ago

In addition, all metrics matching the pattern pg_stat_statements_* show values only for the first data source.

oadekoya commented 9 months ago

What is the update on this? I think deprecating the auto-discover-databases feature without any alternative solution is very disadvantageous.

thampiotr commented 7 months ago

This seems to be caused by this logic, which uses only the first DSN to instantiate the collector.

When I tried to instantiate multiple collectors in a fork, I ran into issues with attempts to register multiple metrics. It seems that running one exporter per DB is the workaround for now.

sysadmind commented 7 months ago

The best way to use one collector for metrics on several postgres instances is to use the multi-target support. We're in the (slow) process of standardizing the code and how the labels are applied.

thampiotr commented 7 months ago

Thanks for the context @sysadmind! We're looking forward to the new versions with standardized implementation.

Are all the metrics, including the ones defined in postgres_exporter.go and custom metrics from a YAML file, supported by the multi-target support? Or only those from the collector package?