performance optimisation possible on several queries

marco44 commented 1 year ago

This family of queries run when we're on a "per database" overview. They're trying to figure out min/max of counters for each query, for wait events… and they can be slow. It's the case for me, as the VM that runs powa is quite small, and has very little IO. So any sort on disk is very slow.

So let's take this query as an example (extracted from my server log):

SELECT anon_1.srvid, anon_1.queryid, powa_statements.query, anon_1.event_type, anon_1.event, sum(anon_1.count) AS counts 
        FROM (SELECT srvid, queryid, dbid, datname, event_type, event, max(count) - min(count) AS count 
        FROM 
          powa_databases,
          LATERAL
          (
            SELECT unnested.dbid, unnested.queryid,
              unnested.event_type, unnested.event, (unnested.records).*
            FROM (
              SELECT wsh.dbid, wsh.queryid, wsh.event_type, wsh.event,
                wsh.coalesce_range, unnest(records) AS records
              FROM powa_wait_sampling_history wsh
              WHERE coalesce_range && tstzrange('2023-04-23 16:23:15+0200', '2023-04-24 16:23:15+0200', '[]')
              AND wsh.dbid = powa_databases.oid
              AND wsh.queryid IN (
                SELECT ps.queryid
                FROM powa_statements ps
                WHERE ps.dbid = powa_databases.oid
                  AND ps.srvid = '4'
              )
              AND wsh.srvid = '4'
            ) AS unnested
            WHERE  (records).ts <@ tstzrange('2023-04-23 16:23:15+0200', '2023-04-24 16:23:15+0200', '[]')
            UNION ALL
            SELECT wsc.dbid, wsc.queryid, wsc.event_type, wsc.event, (wsc.record).*
            FROM powa_wait_sampling_history_current wsc
            WHERE (record).ts <@ tstzrange('2023-04-23 16:23:15+0200','2023-04-24 16:23:15+0200','[]')
            AND wsc.dbid = powa_databases.oid
            AND wsc.queryid IN (
              SELECT ps.queryid
              FROM powa_statements ps
              WHERE ps.dbid = powa_databases.oid
                AND ps.srvid = '4'
            )
            AND wsc.srvid = '4'
          ) h
          WHERE powa_databases.srvid = '4'
         GROUP BY srvid, queryid, dbid, datname, event_type, event 
        HAVING max(count) - min(count) > 0) AS anon_1 JOIN powa_statements ON powa_statements.srvid = anon_1.srvid AND powa_statements.queryid = anon_1.queryid AND powa_statements.dbid = anon_1.dbid 
        WHERE anon_1.datname = 'my_database' GROUP BY anon_1.srvid, anon_1.queryid, powa_statements.query, anon_1.event_type, anon_1.event ORDER BY sum(anon_1.count) DESC;

Values have no importance, just note that it's one day of data, and it already takes a long time. The problem is that we have to sort all those records (and everything attached to them) to perform the group by in the end. Which results in a big sort, with lots of data written on disk.

This query is made of 2 parts: one that unnests all records on the interval from powa_wait_sampling_history, and one that gets the records from powa_wait_sampling_history_current. All of this is then unioned and then we take max/min for the whole dataset.

This can, I think, be optimized this way:

All records from powa_wait_sampling_history already embed a min and a max record that are stored alongside the array of records of the coalesced period. We can retrieve those min/max directly instead of all records, for coalesce periods that are ENTIRELY comprised in the search interval, as they're the only candidates in the whole array that potentially can be the min and max of the whole dataset. That's less to sort, and if we're lucky, we're even saving a trip to the toast
We still need to min/max the records from the "current" table that match the search interval so this subquery stays the same
And we need to retrieve the real min/max for the coalesced records that are on the bounds of our search interval: not all records in them are going to match

So here's a rewrite proposal:

SELECT anon_1.srvid, anon_1.queryid, powa_statements.query, anon_1.event_type, anon_1.event, sum(anon_1.count) AS counts 
        FROM (SELECT srvid, queryid, dbid, datname, event_type, event, max(count) - min(count) AS count 
        FROM 
          powa_databases,
          LATERAL
          (
            -- Left bound: the search interval a single timestamp, the smallest one of the search interval, and has to be inside the coalesce_range. We still need to unnest this one
            SELECT unnested.dbid, unnested.queryid,
              unnested.event_type, unnested.event, (unnested.records).*
            FROM (
              SELECT wsh.dbid, wsh.queryid, wsh.event_type, wsh.event,
                wsh.coalesce_range, unnest(records) AS records
              FROM powa_wait_sampling_history wsh
              WHERE coalesce_range && tstzrange('2023-04-23 16:23:15+0200', '2023-04-23 16:23:15+0200', '[]')
              AND wsh.dbid = powa_databases.oid
              AND wsh.queryid IN (
                SELECT ps.queryid
                FROM powa_statements ps
                WHERE ps.dbid = powa_databases.oid
                  AND ps.srvid = '4'
              )
              AND wsh.srvid = '4'
            ) AS unnested
            WHERE  (records).ts <@ tstzrange('2023-04-23 16:23:15+0200', '2023-04-24 16:23:15+0200', '[]')
            UNION ALL
            -- right bound: the search interval a single timestamp, the largest one of the search interval, and has to be inside the coalesce_range. We still need to unnest this one
            SELECT unnested.dbid, unnested.queryid,
              unnested.event_type, unnested.event, (unnested.records).*
            FROM (
              SELECT wsh.dbid, wsh.queryid, wsh.event_type, wsh.event,
                wsh.coalesce_range, unnest(records) AS records
              FROM powa_wait_sampling_history wsh
              WHERE coalesce_range && tstzrange('2023-04-24 16:23:15+0200', '2023-04-24 16:23:15+0200', '[]')
              AND wsh.dbid = powa_databases.oid
              AND wsh.queryid IN (
                SELECT ps.queryid
                FROM powa_statements ps
                WHERE ps.dbid = powa_databases.oid
                  AND ps.srvid = '4'
              )
              AND wsh.srvid = '4'
            ) AS unnested
            WHERE  (records).ts <@ tstzrange('2023-04-23 16:23:15+0200', '2023-04-24 16:23:15+0200', '[]')
            UNION ALL
            -- These entries have their coalesce range ENTIRELY inside the search rang. So we don't need to unnest them
            -- We just retrieve the mins_in_range, maxs_in_range from the record, build an array of this, and return it like it was the full record
            SELECT unnested.dbid, unnested.queryid,
              unnested.event_type, unnested.event, (unnested.records).*
            FROM (
              SELECT wsh.dbid, wsh.queryid, wsh.event_type, wsh.event,
                wsh.coalesce_range, unnest(ARRAY[mins_in_range,maxs_in_range]) AS records
              FROM powa_wait_sampling_history wsh
              WHERE coalesce_range <@ tstzrange('2023-04-23 16:23:15+0200', '2023-04-24 16:23:15+0201', '[]')
              AND wsh.dbid = powa_databases.oid
              AND wsh.queryid IN (
                SELECT ps.queryid
                FROM powa_statements ps
                WHERE ps.dbid = powa_databases.oid
                  AND ps.srvid = '4'
              )
              AND wsh.srvid = '4'
            ) AS unnested
            WHERE  (records).ts <@ tstzrange('2023-04-23 16:23:15+0200', '2023-04-24 16:23:15+0200', '[]')
            UNION ALL
            -- This is just as before: we have to filter all "current" records
            SELECT wsc.dbid, wsc.queryid, wsc.event_type, wsc.event, (wsc.record).*
            FROM powa_wait_sampling_history_current wsc
            WHERE (record).ts <@ tstzrange('2023-04-23 16:23:15+0200','2023-04-24 16:23:15+0200','[]')
            AND wsc.dbid = powa_databases.oid
            AND wsc.queryid IN (
              SELECT ps.queryid
              FROM powa_statements ps
              WHERE ps.dbid = powa_databases.oid
                AND ps.srvid = '4'
            )
            AND wsc.srvid = '4'
          ) h
          WHERE powa_databases.srvid = '4'
         GROUP BY srvid, queryid, dbid, datname, event_type, event 
        HAVING max(count) - min(count) > 0) AS anon_1 JOIN powa_statements ON powa_statements.srvid = anon_1.srvid AND powa_statements.queryid = anon_1.queryid AND powa_statements.dbid = anon_1.dbid 
        WHERE anon_1.datname = 'my_database' GROUP BY anon_1.srvid, anon_1.queryid, powa_statements.query, anon_1.event_type, anon_1.event ORDER BY sum(anon_1.count) DESC;

rjuju commented 1 year ago

Nice idea. While looking at it, I was thinking that for the part were we use the metadata rather than unnesting a whole record, we could use exclusive bounds (:from, :to, '()') rather than inclusive bounds, as we will already fully unnest those? Shouldn't make much a difference but still a slight win.

marco44 commented 1 year ago

Yeah exclusive bounds should work too

rjuju commented 1 year ago

I forgot to update this issue. This has been committed as of https://github.com/powa-team/powa-web/commit/be95314e1a008007679e866238217dfd71ef1eea and released in the latest version 4.1.4, so marking this issue as closed.

FTR I kept the bounds inclusive for now to keep things simple as it can only lead to sorting 4 extra rows, which isn't much. We could later change that, after triple checking that this is indeed correct, if needed.

powa-team / powa-web

performance optimisation possible on several queries #167