Vonng / pg_exporter

Advanced PostgreSQL & Pgbouncer Metrics Exporter for Prometheus
https://pigsty.io
Apache License 2.0

extension, namespace tags use values from default DB only? #44

Closed ringerc closed 3 months ago

ringerc commented 3 months ago

If I read this code correctly, the scraper collects the list of extensions and namespaces present from the default DB it connects to.

This would result in a metric set running on DBs where it should not run, or not running where it should, if the set of extensions and/or namespaces differs between DBs on the postgres instance.

Am I missing something there? It looks like the namespace and extensions lists probably need to be collected per-discovered-database.
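To make the concern concrete, here is a minimal Python sketch of what per-database tag evaluation could look like. It is not pg_exporter's actual code: `DatabaseCatalog`, `query_applies`, and `plan` are hypothetical names, and the `extension:` prefix is assumed to follow the same `kind:value` shape as the `schema:` tag used in the config later in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseCatalog:
    """Hypothetical per-database snapshot of schema and extension names."""
    name: str
    schemas: set = field(default_factory=set)
    extensions: set = field(default_factory=set)


def query_applies(tags, db):
    """Return True if every schema:/extension: tag is satisfied by this database.

    Tags with any other prefix are ignored in this sketch.
    """
    for tag in tags:
        kind, _, value = tag.partition(":")
        if kind == "schema" and value not in db.schemas:
            return False
        if kind == "extension" and value not in db.extensions:
            return False
    return True


def plan(tags, databases):
    """Split database names into (installed, discarded) for one query."""
    installed = [db.name for db in databases if query_applies(tags, db)]
    discarded = [db.name for db in databases if not query_applies(tags, db)]
    return installed, discarded


# Example: only one database contains the tagged schema.
dbs = [
    DatabaseCatalog("craig", schemas={"public"}),
    DatabaseCatalog("sp_test", schemas={"public"}),
    DatabaseCatalog("scrape_test", schemas={"public", "scrape_ns"}),
]
print(plan(["schema:scrape_ns"], dbs))
```

The point of the sketch is that the schema and extension sets are looked up per database, so the same query can be installed on one database and discarded on another.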

It also appears to collect DB names including those with datallowconn = false or datistemplate = true, which will either fail to scrape or not be useful to scrape.
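As an illustration of the filtering suggested here, a minimal Python sketch follows. It assumes database discovery reads `pg_database`; the columns `datname`, `datallowconn`, and `datistemplate` are standard catalog columns, while `LIST_SCRAPABLE_SQL` and `scrapable` are hypothetical names, not part of pg_exporter.

```python
# Catalog query a scraper could run on the default database to list
# only databases that accept connections and are not templates.
LIST_SCRAPABLE_SQL = """
SELECT datname
FROM pg_database
WHERE datallowconn AND NOT datistemplate
"""


def scrapable(rows):
    """Filter (datname, datallowconn, datistemplate) rows client-side.

    Keeps databases that allow connections and are not templates,
    mirroring the WHERE clause above.
    """
    return [
        name
        for name, allowconn, istemplate in rows
        if allowconn and not istemplate
    ]


# Example rows resembling a default cluster plus the test database.
rows = [
    ("postgres", True, False),
    ("template0", False, True),
    ("template1", True, True),
    ("scrape_test", True, False),
]
print(scrapable(rows))
```

Either form works: the SQL filter avoids fetching rows that will be discarded, while the client-side filter is useful when the full list is already in hand.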

ringerc commented 3 months ago

Testcase

create database scrape_test;
\c scrape_test
create schema scrape_ns;
create table scrape_ns.scrape_me(somevalue integer);
insert into scrape_ns.scrape_me(somevalue) values (42);

and config

test_for_selective_scrape:
  name: selective_scrape
  desc: test for GH issue 44
  query: |
    SELECT somevalue FROM scrape_ns.scrape_me;
  tags:
    - "schema:scrape_ns"
  fatal: true
  skip: false
  metrics:
    - somevalue:
        name: somevalue
        usage: GAUGE

then connect to a DB other than scrape_test

it runs with logs

level=info timestamp=2024-06-12T01:22:25.332211159Z caller=utils.go:56 msg="server [craig] planned with 1 queries, 0 installed, 1 discarded, installed:  , discarded: test_for_selective_scrape"
level=info timestamp=2024-06-12T01:22:25.334541006Z caller=utils.go:56 msg="server [sp_test] version changed: from [0] to [140008]"
level=info timestamp=2024-06-12T01:22:25.335477667Z caller=utils.go:56 msg="server [sp_test] planned with 1 queries, 0 installed, 1 discarded, installed:  , discarded: test_for_selective_scrape"
level=info timestamp=2024-06-12T01:22:25.337818204Z caller=utils.go:56 msg="server [scrape_test] version changed: from [0] to [140008]"
level=info timestamp=2024-06-12T01:22:25.338782313Z caller=utils.go:56 msg="server [scrape_test] planned with 1 queries, 1 installed, 0 discarded, installed: test_for_selective_scrape , discarded: "

and scrape output

➜  pg_exporter git:(master) ✗ curl -sSLf1 http://localhost:9630/metrics |grep selective
pg_exporter_query_cache_ttl{datname="scrape_test",query="selective_scrape"} 0
pg_exporter_query_scrape_duration{datname="scrape_test",query="selective_scrape"} 8.7162e-05
pg_exporter_query_scrape_error_count{datname="scrape_test",query="selective_scrape"} 0
pg_exporter_query_scrape_hit_count{datname="scrape_test",query="selective_scrape"} 0
pg_exporter_query_scrape_metric_count{datname="scrape_test",query="selective_scrape"} 1
pg_exporter_query_scrape_total_count{datname="scrape_test",query="selective_scrape"} 2
# HELP selective_scrape_somevalue 
# TYPE selective_scrape_somevalue gauge
selective_scrape_somevalue 42

So in fact it's behaving correctly, and I did misread the code.

I guess all the connected services watch for databases, not just one.