Provide context for index advice

tedyu commented 2 years ago

Currently index advice from pg_qualstats_index_advisor() only provides the index statement. e.g.

"CREATE INDEX ON public.pgqs USING btree (id)"

It would be nice to include context along with the index advice. The context may include (not limited to):

query statement which would benefit by the suggested index
number of executions for the query statement which went through sampling by pg_qualstats (count / occurrences in pgqsEntry struct)

rjuju commented 2 years ago

query statement which would benefit by the suggested index

Agreed, that's something on my todo list. Note that this feature exist for the index advisor on powa, which will also try to create hypothetical indexes (if hypopg is installed in the target database), and will compare explain plans with and without those hypothetical indexes for all related queries and see if the query will use the hypothetical index or not.

number of rows for the query statement which went through sampling by pg_qualstats

I don't understand this one. The sampling in pg_qualstats is done on at the statement level, and pg_qualstats only works with the instrumentation provided by postgres and the execution end, not for each row processed.

tedyu commented 2 years ago

I think adding query statement would be a good first step.

I modified the second point - see if it is clearer now.

tedyu commented 2 years ago

What do you think of the following change ?

diff --git a/pg_qualstats--2.0.4.sql b/pg_qualstats--2.0.4.sql
index ae77842..3fc0ec7 100644
--- a/pg_qualstats--2.0.4.sql
+++ b/pg_qualstats--2.0.4.sql
@@ -357,7 +357,7 @@ CREATE TYPE qualname AS (

 CREATE OR REPLACE VIEW pg_qualstats_by_query AS
         SELECT coalesce(uniquequalid, uniquequalnodeid) as uniquequalnodeid, dbid, userid,  coalesce(qualid, qualnodeid) as qualnodeid, occurences, execution_count, nbfiltered, queryid,
-      array_agg(constvalue order by constant_position) as constvalues, array_agg(ROW(relid, attnum, opno, eval_type)::qual) as quals
+      array_agg(constvalue order by constant_position) as constvalues, array_agg(ROW(relid, attnum, opno, eval_type)::qual) as quals, query
       FROM
       (

@@ -381,10 +381,11 @@ CREATE OR REPLACE VIEW pg_qualstats_by_query AS
             qs.constvalue as constvalue,
             qs.nbfiltered as nbfiltered,
             qs.eval_type,
-            qs.constant_position
-        FROM pg_qualstats() qs
-        WHERE (qs.lrelid IS NULL) != (qs.rrelid IS NULL)
-    ) i GROUP BY coalesce(uniquequalid, uniquequalnodeid), coalesce(qualid, qualnodeid),  dbid, userid, occurences, execution_count, nbfiltered, queryid
+            qs.constant_position,
+            stat_stmt.query as query
+        FROM pg_qualstats() qs, pg_stat_statements stat_stmt
+        WHERE (qs.lrelid IS NULL) != (qs.rrelid IS NULL) and qs.queryid = stat_stmt.queryid
+    ) i GROUP BY coalesce(uniquequalid, uniquequalnodeid), coalesce(qualid, qualnodeid),  dbid, userid, occurences, execution_count, nbfiltered, queryid, query
 ;

 CREATE OR REPLACE FUNCTION pg_qualstats_deparse_qual(qual qual) RETURNS TEXT

Here is sample output:

# select * from pg_qualstats_by_query;
 uniquequalnodeid | dbid  | userid | qualnodeid | occurences | execution_count | nbfiltered |       queryid        |  constvalues  |        quals        |                  query
------------------+-------+--------+------------+------------+-----------------+------------+----------------------+---------------+---------------------+-----------------------------------------
       2211837441 | 13284 |  13283 |  322582514 |          1 |             100 |        100 | -8850145046835833150 | {'meh'::text} | {"(16672,2,641,f)"} | SELECT * FROM pgqs WHERE $1 ~ val
       3012998736 | 13284 |  13283 |  948898875 |          1 |             100 |         99 | -4649779342992481902 | {1::integer}  | {"(16672,1,96,f)"}  | SELECT COUNT(*) FROM pgqs WHERE id = $1
(2 rows)

rjuju commented 2 years ago

The current pg_qualstats_by_query view was done this way to avoid any assumption that pg_stat_statements is available on the system, and if it's that the case that the underlying pg_stat_statements entry hasn't be evicted, or simply that you would actually need the query text if you want to know the queryid(s) associated with a qual. Retrieving the query text is somewhat expensive, and a good monitoring / performance analysis system will cache the queryid / query text mapping so that it doesn't need to retrieve it from pg_stat_statements each time.

So this view only provides infrastructure to easily get the information if you want to, as you only need to write something like:

SELECT q.*, pgss.query
FROM pg_qualstats_by_query q
JOIN pg_stat_statements pgss USING (dbid, userid, queryid)

Or join instead your own queryid / query text mapping table.

powa-team / pg_qualstats

Provide context for index advice #49