powa-team / pg_qualstats

A PostgreSQL extension for collecting statistics about predicates, helping find what indices are missing

pg_qualstats

pg_qualstats is a PostgreSQL extension that keeps statistics on predicates found in WHERE and JOIN clauses.

This is useful if you want to analyze which quals (predicates) are executed most often on your database. The powa project makes use of this to provide advanced index suggestions.

It also allows you to identify correlated columns, by identifying which columns are most frequently queried together.
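
As a rough illustration of the idea, the following sketch lists columns that appear together in the same predicate group. It assumes that qualid identifies the enclosing AND expression shared by predicates used together (an assumption, since the column descriptions are not reproduced here), and it joins against the standard pg_attribute catalog to resolve column names. The alias columns_queried_together is made up for this example.

```sql
-- Sketch: columns that are filtered together in the same AND expression.
-- Assumption: rows sharing a non-NULL qualid belong to the same expression.
SELECT q.qualid,
       string_agg(DISTINCT q.lrelid::regclass::text || '.' || a.attname::text, ', ')
         AS columns_queried_together
  FROM pg_qualstats q
  JOIN pg_attribute a
    ON a.attrelid = q.lrelid
   AND a.attnum = q.lattnum
 WHERE q.qualid IS NOT NULL
   -- Statistics cover all databases; catalogs only describe the current one.
   AND q.dbid = (SELECT oid FROM pg_database WHERE datname = current_database())
 GROUP BY q.qualid
HAVING count(DISTINCT q.lrelid::text || '.' || q.lattnum::text) > 1;
```

Column groups that show up repeatedly are candidates for a multi-column index or for extended statistics.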

The extension works by looking for known patterns in queries. Currently, this includes:

This extension also saves the first query text, as-is, for each distinct queryid executed, with a limit of pg_qualstats.max entries.
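
The stored query texts can be read back from SQL. The sketch below assumes that the installed version provides the pg_qualstats_example_queries() and pg_qualstats_example_query(queryid) functions; check the function list of your version before relying on them.

```sql
-- List every stored example query text.
SELECT * FROM pg_qualstats_example_queries();

-- Fetch the stored text for each queryid present in the statistics.
SELECT DISTINCT queryid,
       pg_qualstats_example_query(queryid) AS example_query
  FROM pg_qualstats
 WHERE queryid IS NOT NULL;
```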

Please note that the gathered data is not saved when the PostgreSQL server is restarted.

Installation

   shared_preload_libraries = 'pg_qualstats'

Configuration

The following GUCs can be configured, in postgresql.conf:

Updating the extension

Note that, as with all extensions configured in shared_preload_libraries, most changes are only applied once PostgreSQL is restarted with the new shared library version. The extension objects themselves only provide SQL wrappers to access internal data structures.

Since version 2.0.4, an upgrade script is provided, which allows upgrading from the previous version only. If you want to upgrade the extension across multiple versions, or from a version older than 2.0.3, you will need to drop and recreate the extension to get the latest version.
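
The two upgrade paths could look like the sketch below. It uses only standard PostgreSQL commands. Which path applies depends on the version currently installed, so check that first.

```sql
-- Check which version of the extension objects is currently installed.
SELECT extversion FROM pg_extension WHERE extname = 'pg_qualstats';

-- Path 1: upgrade in place, when an upgrade script from the installed
-- version is available.
ALTER EXTENSION pg_qualstats UPDATE;

-- Path 2: otherwise, drop and recreate the extension.
DROP EXTENSION pg_qualstats;
CREATE EXTENSION pg_qualstats;
```

In both cases, PostgreSQL still has to be restarted for the new shared library to be loaded.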

Usage

   CREATE EXTENSION pg_qualstats;

Functions

The extension defines the following functions:

ro=# select * from pg_qualstats;
 userid │ dbid  │ lrelid │ lattnum │ opno │ rrelid │ rattnum │ qualid │ uniquequalid │ qualnodeid │ uniquequalnodeid │ occurences │ execution_count │ nbfiltered │ constant_position │ queryid │   constvalue   │ eval_type
--------+-------+--------+---------+------+--------+---------+--------+--------------+------------+------------------+------------+-----------------+------------+-------------------+---------+----------------+-----------
     10 │ 16384 │  16385 │       2 │   98 │ <NULL> │  <NULL> │ <NULL> │       <NULL> │  115075651 │       1858640877 │          1 │          100000 │      99999 │                29 │  <NULL> │ 'line 1'::text │ f
     10 │ 16384 │  16391 │       2 │   98 │  16385 │       2 │ <NULL> │       <NULL> │  497379130 │        497379130 │          1 │               0 │          0 │            <NULL> │  <NULL> │                │ f

SELECT v
  FROM json_array_elements(
    pg_qualstats_index_advisor(min_filter => 50)->'indexes') v
  ORDER BY v::text COLLATE "C";
                               v
---------------------------------------------------------------
 "CREATE INDEX ON public.adv USING btree (id1)"
 "CREATE INDEX ON public.adv USING btree (val, id1, id2, id3)"
 "CREATE INDEX ON public.pgqs USING btree (id)"
(3 rows)

SELECT v
  FROM json_array_elements(
    pg_qualstats_index_advisor(min_filter => 50)->'unoptimised') v
  ORDER BY v::text COLLATE "C";
        v
-----------------
 "adv.val ~~* ?"
(1 row)

Views

In addition to that, the extension defines some views on top of the pg_qualstats function:

ro=# select * from pg_qualstats_pretty;
 left_schema |    left_table    | left_column |   operator   | right_schema | right_table | right_column | occurences | execution_count | nbfiltered
-------------+------------------+-------------+--------------+--------------+-------------+--------------+------------+-----------------+------------
 public      | pgbench_accounts | aid         | pg_catalog.= |              |             |              |          5 |         5000000 |    4999995
 public      | pgbench_tellers  | tid         | pg_catalog.= |              |             |              |         10 |        10000000 |    9999990
 public      | pgbench_branches | bid         | pg_catalog.= |              |             |              |         10 |         2000000 |    1999990
 public      | t1               | id          | pg_catalog.= | public       | t2          | id_t1        |          1 |           10000 |       9999
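
As a possible starting point for finding missing indexes, the view can be ranked by the number of rows each predicate discards. This is only a sketch built on the columns shown in the output above. The pct_filtered alias and the LIMIT of 10 are arbitrary choices for the example.

```sql
-- Rank predicates by the number of rows they filter out.
-- NULLIF avoids a division by zero for predicates that were never executed.
SELECT left_schema,
       left_table,
       left_column,
       operator,
       execution_count,
       nbfiltered,
       round(100.0 * nbfiltered / NULLIF(execution_count, 0), 2) AS pct_filtered
  FROM pg_qualstats_pretty
 ORDER BY nbfiltered DESC
 LIMIT 10;
```

Predicates with a high execution_count and a high share of filtered rows usually point at columns where an index would avoid scanning most of the table.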