prometheus-community / postgres_exporter

A PostgreSQL metric exporter for Prometheus
Apache License 2.0

Add some custom metrics to queries.yaml #634

Open machine424 opened 2 years ago

machine424 commented 2 years ago

Proposal

Hello,

Here are some custom metrics that have been very useful for us:

(screenshot: the proposed custom metrics, 2022-04-22)

Please tell me if you find them relevant so we can adjust them (and maybe correct them, haha) and add them to queries.yaml.

# Maybe another name
pg_sizes:
  query: |
    SELECT
      current_database() datname,
      schemaname,
      S.relname as relname,
      C.reltoastrelid::regclass as toast_relname,
      pg_relation_size(S.relid) as relation_size_bytes,
      pg_table_size(S.relid) as table_size_bytes,
      pg_total_relation_size(S.relid) as total_relation_size_bytes
    FROM
      pg_stat_user_tables as S
    LEFT JOIN
      pg_class as C ON S.relid = C.oid
  metrics:
    - datname:
        usage: LABEL
        description: Name of current database
    - schemaname:
        usage: LABEL
        description: Name of the schema that this table is in
    - relname:
        usage: LABEL
        description: Name of this table
    - toast_relname:
        usage: LABEL
        description: Name of the toast table if any
    - relation_size_bytes:
        usage: GAUGE
        description: "The size of just the table"
    - table_size_bytes:
        usage: GAUGE
        description: "The size of the table including TOAST, but excluding indexes"
    - total_relation_size_bytes:
        usage: GAUGE
        description: "The size of the table, including indexes and TOAST"
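
If it helps reviewers judge the entry, here is how the sizes might be consumed downstream. Assuming the exporter's usual naming convention of prefixing each column with the query name (so e.g. `pg_sizes_table_size_bytes` — this naming is an assumption, adjust to what the exporter actually emits), a Prometheus recording rule could derive the per-table index footprint:

```yaml
groups:
  - name: pg_sizes_derived
    rules:
      # Approximate index size per table: total relation size minus
      # table size (heap + TOAST). Metric names assume the exporter's
      # <query_name>_<column> convention.
      - record: pg_sizes:index_size_bytes
        expr: |
          pg_sizes_total_relation_size_bytes - pg_sizes_table_size_bytes
```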

pg_stat_progress_vacuum:
  query: |
    SELECT
      S.pid as pid,
      S.datname as datname,
      C.relname as relname,
      Q.relname as primary_relname,
      CASE S.phase WHEN 'initializing' THEN 0
                WHEN 'scanning heap' THEN 1
                WHEN 'vacuuming indexes' THEN 2
                WHEN 'vacuuming heap' THEN 3
                WHEN 'cleaning up indexes' THEN 4
                WHEN 'truncating heap' THEN 5
                WHEN 'performing final cleanup' THEN 6
                END AS phase,
      heap_blks_total,
      heap_blks_scanned,
      heap_blks_vacuumed,
      index_vacuum_count,
      max_dead_tuples,
      num_dead_tuples,
      COALESCE(A.xact_start, '1970-01-01Z') as xact_start,
      COALESCE(A.query_start, '1970-01-01Z') as query_start,
      A.state as backend_state,
      COALESCE(A.state_change, '1970-01-01Z') as last_backend_state_change,
      A.wait_event_type as backend_wait_event_type,
      A.wait_event as backend_wait_event
    FROM
      pg_stat_progress_vacuum as S
    LEFT JOIN
      pg_class AS C ON S.relid = C.oid
    LEFT JOIN
      pg_class AS Q ON S.relid = Q.reltoastrelid
    LEFT JOIN
      pg_stat_activity AS A ON A.pid = S.pid
    WHERE
      Q.relname IS NULL OR (Q.relname NOT LIKE 'pg_%' AND Q.relname NOT LIKE 'sql_%')
  metrics:
    - pid:
        usage: GAUGE
        description: Process ID of backend
    - datname:
        usage: LABEL
        description: Name of the database to which this backend is connected
    - relname:
        usage: LABEL
        description: Name of the table
    - primary_relname:
        usage: LABEL
        description: Name of the primary table if any
    - phase:
        usage: GAUGE
        description: Current processing phase of vacuum
    - heap_blks_total:
        usage: GAUGE
        description: Total number of heap blocks in the table.
    - heap_blks_scanned:
        usage: GAUGE
        description: Number of heap blocks scanned
    - heap_blks_vacuumed:
        usage: GAUGE
        description: Number of heap blocks vacuumed.
    - index_vacuum_count:
        usage: GAUGE
        description: Number of completed index vacuum cycles.
    - max_dead_tuples:
        usage: GAUGE
        description: Number of dead tuples that we can store before needing to perform an index vacuum cycle.
    - num_dead_tuples:
        usage: GAUGE
        description: Number of dead tuples collected since the last index vacuum cycle.
    - xact_start:
        usage: GAUGE
        description: Time when this process' current transaction was started.
    - query_start:
        usage: GAUGE
        description: Time when the currently active query was started
    - last_backend_state_change:
        usage: GAUGE
        description: Time when the backend_state was last changed
    - backend_state:
        usage: LABEL
        description: Current overall state of this backend
    - backend_wait_event:
        usage: LABEL
        description: Wait event name if backend is currently waiting
    - backend_wait_event_type:
        usage: LABEL
        description: The type of event for which the backend is waiting, if any.
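
As a sketch of how the progress columns combine, a recording rule could track how far along each running VACUUM is. The metric names below assume the exporter's `<query_name>_<column>` convention and are not confirmed against actual output:

```yaml
groups:
  - name: pg_vacuum_progress
    rules:
      # Fraction of the table's heap blocks scanned by each running
      # VACUUM (0 to 1). Metric names assume the exporter's
      # <query_name>_<column> convention.
      - record: pg_stat_progress_vacuum:heap_scan_ratio
        expr: |
          pg_stat_progress_vacuum_heap_blks_scanned
            / pg_stat_progress_vacuum_heap_blks_total
```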

pg_stat_all_tables:
  query: |
    SELECT
      current_database() datname,
      schemaname,
      S.relname as relname,
      C.relname as primary_relname,
      seq_scan,
      seq_tup_read,
      idx_scan,
      idx_tup_fetch,
      n_tup_ins,
      n_tup_upd,
      n_tup_del,
      n_tup_hot_upd,
      n_live_tup,
      n_dead_tup,
      n_mod_since_analyze,
      COALESCE(last_vacuum, '1970-01-01Z') as last_vacuum,
      COALESCE(last_autovacuum, '1970-01-01Z') as last_autovacuum,
      COALESCE(last_analyze, '1970-01-01Z') as last_analyze,
      COALESCE(last_autoanalyze, '1970-01-01Z') as last_autoanalyze,
      vacuum_count,
      autovacuum_count,
      analyze_count,
      autoanalyze_count
    FROM
      pg_stat_all_tables as S
    LEFT JOIN
      pg_class as C ON S.relid = C.reltoastrelid
    WHERE schemaname IN ('pg_toast', 'public') AND (C.relname IS NULL OR (C.relname NOT LIKE 'pg_%' AND C.relname NOT LIKE 'sql_%'))
  metrics:
  - datname:
      usage: LABEL
      description: Name of current database
  - schemaname:
      usage: LABEL
      description: Name of the schema that this table is in
  - relname:
      usage: LABEL
      description: Name of this table
  - primary_relname:
      usage: LABEL
      description: Name of the primary table if any
  - seq_scan:
      usage: COUNTER
      description: Number of sequential scans initiated on this table
  - seq_tup_read:
      usage: COUNTER
      description: Number of live rows fetched by sequential scans
  - idx_scan:
      usage: COUNTER
      description: Number of index scans initiated on this table
  - idx_tup_fetch:
      usage: COUNTER
      description: Number of live rows fetched by index scans
  - n_tup_ins:
      usage: COUNTER
      description: Number of rows inserted
  - n_tup_upd:
      usage: COUNTER
      description: Number of rows updated
  - n_tup_del:
      usage: COUNTER
      description: Number of rows deleted
  - n_tup_hot_upd:
      usage: COUNTER
      description: Number of rows HOT updated (i.e., with no separate index update required)
  - n_live_tup:
      usage: GAUGE
      description: Estimated number of live rows
  - n_dead_tup:
      usage: GAUGE
      description: Estimated number of dead rows
  - n_mod_since_analyze:
      usage: GAUGE
      description: Estimated number of rows changed since last analyze
  - last_vacuum:
      usage: GAUGE
      description: Last time at which this table was manually vacuumed (not counting VACUUM FULL)
  - last_autovacuum:
      usage: GAUGE
      description: Last time at which this table was vacuumed by the autovacuum daemon
  - last_analyze:
      usage: GAUGE
      description: Last time at which this table was manually analyzed
  - last_autoanalyze:
      usage: GAUGE
      description: Last time at which this table was analyzed by the autovacuum daemon
  - vacuum_count:
      usage: COUNTER
      description: Number of times this table has been manually vacuumed (not counting VACUUM FULL)
  - autovacuum_count:
      usage: COUNTER
      description: Number of times this table has been vacuumed by the autovacuum daemon
  - analyze_count:
      usage: COUNTER
      description: Number of times this table has been manually analyzed
  - autoanalyze_count:
      usage: COUNTER
      description: Number of times this table has been analyzed by the autovacuum daemon
SuperQ commented 2 years ago

We're migrating away from queries.yaml in favor of having new features added to the collector package.

sysadmind commented 2 years ago

Thanks for the issue. These look to me like they would be valuable to implement. Having all of these queries and help descriptions will be useful for the person that actually implements these in that collector package.

richardnpaul commented 1 year ago

We're migrating away from queries.yaml in favor of having new features added to the collector package

@SuperQ this is just plain silly. You're not going to want to take on implementing everything that anyone might want or currently has via custom queries, and not every such query is going to be relevant to most users, which is presumably the bar you'd set for adding a new collector.

I can understand doing this for the standard query stuff, but you still need a way to allow for extensibility; otherwise you make the exporter useless to anyone currently using custom queries and force those users to find a way around the limitation.

I'll give an example of a custom query that you wouldn't want to implement as a collector: a query that returns metrics from the auth session table of my employer's proprietary software. That's specific to us, nobody else has it, and for that reason you're never going to write a collector for it. My only options are to fork this project and maintain a custom Go collector myself, or to stay on the old version (hopefully only until the team comes to its senses and realises the error here).

garry-t commented 1 year ago

It would be nice to have https://www.postgresql.org/docs/current/pgstattuple.html covered. I don't see a collector for this, which means queries.yaml is still relevant, at least for me.
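
For reference, a pgstattuple query could in principle be wired up through queries.yaml today. The sketch below is unverified: it requires `CREATE EXTENSION pgstattuple`, the column names are taken from the pgstattuple documentation, and note that pgstattuple physically scans the whole table, so scraping it frequently can be expensive:

```yaml
# Sketch of a queries.yaml entry for pgstattuple. Limited to ordinary
# tables in the public schema to bound the cost; adjust as needed.
pg_stattuple:
  query: |
    SELECT
      current_database() AS datname,
      c.relname AS relname,
      t.dead_tuple_count,
      t.dead_tuple_len,
      t.free_space
    FROM pg_class c
    CROSS JOIN LATERAL pgstattuple(c.oid::regclass) t
    WHERE c.relkind = 'r'
      AND c.relnamespace = 'public'::regnamespace
  metrics:
    - datname:
        usage: LABEL
        description: Name of current database
    - relname:
        usage: LABEL
        description: Name of this table
    - dead_tuple_count:
        usage: GAUGE
        description: Number of dead tuples in the table
    - dead_tuple_len:
        usage: GAUGE
        description: Total length of dead tuples in bytes
    - free_space:
        usage: GAUGE
        description: Total free space in bytes
```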

SuperQ commented 1 year ago

I'll give an example of a custom query that you wouldn't want to implement as a collector: a query that returns metrics from the auth session table of my employer's proprietary software.

That is better suited to a generic SQL exporter; it isn't related to Postgres server metrics. This exporter is specifically for server metrics, not generic business metrics.

@garry-t, would you mind opening a separate issue as a feature request to add that as a new collector? Please also describe what you want to get out of it.

mzupan commented 11 months ago

Looks like the idea now is to use this for anything custom:

https://github.com/burningalchemist/sql_exporter
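
For the auth-session example discussed above, a collector file for that project might look roughly like this. The table and column names are placeholders, and the snippet follows sql_exporter's documented collector format as I understand it (verify against its README):

```yaml
# Sketch of a burningalchemist/sql_exporter collector for an
# app-specific query. "auth_session" is a hypothetical table name.
collector_name: app_sessions
metrics:
  - metric_name: app_auth_sessions_active
    type: gauge
    help: 'Number of rows in the application auth session table.'
    values: [sessions]
    query: |
      SELECT COUNT(*) AS sessions FROM auth_session
```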