prometheus-community / postgres_exporter

A PostgreSQL metric exporter for Prometheus
Apache License 2.0
2.81k stars 742 forks source link

stat_activity_autovacuum: collected metric was collected before with the same name and label values #953

Open zeha opened 1 year ago

zeha commented 1 year ago

What did you do?

Investigated scrape failures.

What did you expect to see?

No scrape failures.

What did you see instead? Under which circumstances?

$ curl http://localhost:9187/metrics
collected metric "pg_stat_activity_autovacuum_timestamp_seconds" { label:{name:"relname"  value:"datacdr"}  gauge:{value:1.699456188197915e+09}} was collected before with the same name and label values

Given this output it seems likely the stat_activity_autovacuum collector neglects to put the schema name into labels.

$ uname -srm
Linux 4.19.0-25-amd64 x86_64
$ lsb_release -a
Distributor ID: Debian
Description:    Debian GNU/Linux 10 (buster)
Release:        10
Codename:       buster
$ /usr/local/bin/postgres_exporter --version
postgres_exporter, version 0.15.0 (branch: HEAD, revision: 68c176b8833b7580bf847cecf60f8e0ad5923f9a)
  build user:       root@88f74f2c2888
  build date:       20231027-14:38:56
  go version:       go1.21.3
  platform:         linux/amd64
  tags:             unknown
--collector.stat_statements --collector.stat_wal_receiver --collector.stat_activity_autovacuum --collector.long_running_transactions --collector.database_wraparound
# show server_version;
         server_version
--------------------------------
 15.4 (Debian 15.4-2.pgdg100+1)

I didn't see any relevant logs for this.

sysadmind commented 1 year ago

The query that is ran to pull the metric is this:

    SELECT
        SPLIT_PART(query, '.', 2) AS relname,
        EXTRACT(EPOCH FROM xact_start) AS timestamp_seconds
    FROM
        pg_catalog.pg_stat_activity
    WHERE
        query LIKE 'autovacuum:%'

Can you run that and post the results? I think there are multiple rows in pg_stat_activity with that relname in your case. If that is the case, would you be able to post the results of the following for troubleshooting?

    SELECT *
    FROM
        pg_catalog.pg_stat_activity
    WHERE
        query LIKE 'autovacuum:%'
zeha commented 1 year ago

Sure, here is an example:

current_timestamp | 2023-11-13 05:08:01.541131+01
datid             | 16409
datname           | backoffice
pid               | 19573
leader_pid        |
usesysid          |
usename           |
application_name  |
client_addr       |
client_hostname   |
client_port       |
backend_start     | 2023-11-13 05:07:50.578647+01
xact_start        | 2023-11-13 05:07:51.045391+01
query_start       | 2023-11-13 05:07:51.045391+01
state_change      | 2023-11-13 05:07:51.045392+01
wait_event_type   |
wait_event        |
state             | active
backend_xid       |
backend_xmin      | 3040708119
query_id          |
query             | autovacuum: VACUUM ANALYZE mob.datacdr
backend_type      | autovacuum worker
-------------------------------------------------------------
current_timestamp | 2023-11-13 05:08:01.541131+01
datid             | 16409
datname           | backoffice
pid               | 19801
leader_pid        |
usesysid          |
usename           |
application_name  |
client_addr       |
client_hostname   |
client_port       |
backend_start     | 2023-11-13 05:07:59.200846+01
xact_start        | 2023-11-13 05:08:01.252384+01
query_start       | 2023-11-13 05:08:01.252384+01
state_change      | 2023-11-13 05:08:01.252384+01
wait_event_type   | IO
wait_event        | DataFileRead
state             | active
backend_xid       |
backend_xmin      | 3040710090
query_id          |
query             | autovacuum: VACUUM postpaid.datacdr
backend_type      | autovacuum worker
alexius2 commented 10 months ago

Hello. We have similar issue with pg_stat_activity_autovacuum_timestamp_seconds metric, but in our case it was 2 autovacuums working simultaneously on 2 different databases on tables with same name. I think metric should have both schemaname and datname. We are using postgres_exporter version 0.15.0.