pg_qualstats is a PostgreSQL extension keeping statistics on predicates found
in WHERE
statements and JOIN
clauses.
This is useful if you want to be able to analyze what are the most-often executed quals (predicates) on your database. The powa project makes use of this to provide advances index suggestions.
It also allows you to identify correlated columns, by identifying which columns are most frequently queried together.
The extension works by looking for known patterns in queries. Currently, this includes:
Binary OpExpr where at least one side is a column from a table. Whenever possible, the predicate will be swaped so that CONST OP VAR expressions are turned into VAR COMMUTED_OP CONST. AND and OR expression members are counted as separate entries. Ex: WHERE column1 = 2, WHERE column1 = column2, WHERE 3 = column3
ScalarArrayOpExpr where the left side is a VAR, and the right side is an array constant. Those will be counted one time per element in the array. Ex: WHERE column1 IN (2, 3) will be counted as 2 occurences for the (column1, '=') operator pair
BooleanTest where the expression is a simple boolean column reference Ex: WHERE column1 IS TRUE Please not that clauses like WHERE columns1, WHERE NOT column1 won't be processed by pg_qualstats (yet)
This extension also saves the first query text, as-is, for each distinct queryid executed, with a limit of pg_qualstats.max entries.
Please not that the gathered data are not saved when the PostgreSQL server is restarted.
shared_preload_libraries = 'pg_qualstats'
The following GUCs can be configured, in postgresql.conf:
Note that as all extensions configured in shared_preload_libraries, most of the changes are only applied once PostgreSQL is restarted with the new shared library version. The extension objects themselves only provides SQL wrappers to access internal data structures.
Since version 2.0.4, an upgrade script is provided, allowing to upgade from the previous version only. If you want to upgrade the extension across multiple versions, or from a version older than 2.0.3, you will need top drop and recreate the extension to get the latest version.
CREATE EXTENSION pg_qualstats;
The extension defines the following functions:
pg_qualstats: returns the counts for every qualifier, identified by the expression hash. This hash identifies each expression.
Example:
ro=# select * from pg_qualstats;
userid │ dbid │ lrelid │ lattnum │ opno │ rrelid │ rattnum │ qualid │ uniquequalid │ qualnodeid │ uniquequalnodeid │ occurences │ execution_count │ nbfiltered │ constant_position │ queryid │ constvalue │ eval_type
--------+-------+--------+---------+------+--------+---------+--------+--------------+------------+------------------+------------+-----------------+------------+-------------------+---------+----------------+-----------
10 │ 16384 │ 16385 │ 2 │ 98 │ <NULL> │ <NULL> │ <NULL> │ <NULL> │ 115075651 │ 1858640877 │ 1 │ 100000 │ 99999 │ 29 │ <NULL> │ 'line 1'::text │ f
10 │ 16384 │ 16391 │ 2 │ 98 │ 16385 │ 2 │ <NULL> │ <NULL> │ 497379130 │ 497379130 │ 1 │ 0 │ 0 │ <NULL> │ <NULL> │ │ f
pg_qualstats_index_advisor(min_filter, min_selectivity, forbidden_am):
Perform a global index suggestion. By default, only predicates filtering at
least 1000 rows and 30% of the rows in average will be considered, but this
can be passed as parameter. You can also provide an array of index access
method if you want to avoid some. For instance, on PostgreSQL 9.6 and
prior, hash
indexes will be ignored as those weren't crash safe yet.
Example:
SELECT v
FROM json_array_elements(
pg_qualstats_index_advisor(min_filter => 50)->'indexes') v
ORDER BY v::text COLLATE "C";
v
---------------------------------------------------------------
"CREATE INDEX ON public.adv USING btree (id1)"
"CREATE INDEX ON public.adv USING btree (val, id1, id2, id3)"
"CREATE INDEX ON public.pgqs USING btree (id)"
(3 rows)
SELECT v
FROM json_array_elements(
pg_qualstats_index_advisor(min_filter => 50)->'unoptimised') v
ORDER BY v::text COLLATE "C";
v
-----------------
"adv.val ~~* ?"
(1 row)
tablename.columname operatorname ?
. This is mostly for the global index
advisor.In addition to that, the extension defines some views on top of the pg_qualstats function:
pg_qualstats_pretty: performs the appropriate joins to display a readable aggregated form for every attribute from the pg_qualstats view
Example:
ro=# select * from pg_qualstats_pretty;
left_schema | left_table | left_column | operator | right_schema | right_table | right_column | occurences | execution_count | nbfiltered
-------------+------------------+-------------+--------------+--------------+-------------+--------------+------------+-----------------+------------
public | pgbench_accounts | aid | pg_catalog.= | | | | 5 | 5000000 | 4999995
public | pgbench_tellers | tid | pg_catalog.= | | | | 10 | 10000000 | 9999990
public | pgbench_branches | bid | pg_catalog.= | | | | 10 | 2000000 | 1999990
public | t1 | id | pg_catalog.= | public | t2 | id_t1 | 1 | 10000 | 9999
pg_qualstats_all: sums the counts for each attribute / operator pair, regardless of its position as an operand (LEFT or RIGHT), grouping together attributes used in AND clauses.
Example:
ro=# select * from pg_qualstats_all;
dbid | relid | userid | queryid | attnums | opno | qualid | occurences | execution_count | nbfiltered | qualnodeid
-------+-------+--------+---------+---------+------+--------+------------+-----------------+------------+------------
16384 | 16385 | 10 | | {2} | 98 | | 1 | 100000 | 99999 | 115075651
16384 | 16391 | 10 | | {2} | 98 | | 2 | 0 | 0 | 497379130
pg_qualstats_by_query: returns only predicates of the form VAR OPERATOR CONSTANT, aggregated by queryid.