citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.65k stars 670 forks source link

[Performance, Columnar] window function in subquery greatly slows down query, even when unreferenced #7660

Open srblum opened 4 months ago

srblum commented 4 months ago

Problem

When using large columnar tables, window functions in subqueries greatly slow down query performance, even when the window function columns are not referenced by the outer query.

I am hoping that this could be fixed by adjusting columnar scans to ignore columns in subqueries that are unreferenced.

I am more than happy to provide additional context, including query plans, plpgsql functions to recreate test data, version info, etc.

Thanks for your time and support. 🙇

Why would someone ever need an unreferenced window function?

I work on a manufacturing analytics platform that allows users to create and update views of their time-series data with arbitrary expressions. Some of those expressions contain window functions.

Below is an example of a command to create such a view, containing a trivial expression.

CREATE VIEW v_washer AS (
    SELECT *, 1 AS x
    FROM washer
);

After the view is created, it becomes the new basis for all users' queries on the platform, regardless of whether those queries need any of the computed expressions. This has been a fantastic way to allow users to easily customize their data.

Unfortunately, as soon as any user adds a window function to the view, it spoils load times for everyone.

What does the data look like?

Manufacturing time-series data. Each row represents the state of a machine for a given time interval. The data used for testing was structured as follows:

starttime endtime machine__source ...700 other columns
2021-01-01 00:00:00+00 2021-01-01 00:00:30+00 A ...
2021-01-01 00:00:30+00 2021-01-01 00:01:00+00 A ...
2021-01-01 00:01:00+00 2021-01-01 00:01:30+00 A ...
... ... ... ...

The test queries below were performed on a partitioned table with the following properties:

700 columns 12 million total rows 3 years of data, split into 36 monthly, columnar partitions Each of the 36 partitions is about 36MB for a total of ~1.3 GB (highly compressed--the same data is ~100GB in row-based storage.)

Query performance

Below are average query speeds for each of 7 test queries. All queries were performed 3 times after throwing out the first result to remove the variance associated with caching.

Query Time
Query1 (small subquery, no wf) 0.09s
Query2 (large subquery, no wf) 0.09s
Query3 (no subquery, wf) 0.15s
Query4 (small subquery, ignored wf) 3.0s
Query5 (small subquery, referenced wf) 6.2s
Query6 (large subquery, ignored wf) 7.5s
Query7 (large subquery, referenced wf) 9.8s

My hope is that, by fixing the query planner:

Full Queries

Below are the full queries referenced in the table above.

Query1 (small subquery, no wf)

select sum("inner".output2) from (
    select
        endtime,
        machine__source,
        output + 1 as output2
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query2 (large subquery, no wf)

select sum("inner".output2) from (
    select
        *,
        output + 1 as output2
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query3 (no subquery, wf)

select
    case
        when
            shift = 'A' and
            lag(shift) over (partition by machine__source order by endtime) != 'A'
        then 1
        else 0
    end as window_expr
from timeseries
where machine__source in ('A') and
endtime > '2023-09-01'::timestamptz;

Query4 (small subquery, ignored wf)

select "inner".output from (
    select
        output,
        machine__source,
        endtime,
        case
          when shift = 'A' and
          lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query5 (small subquery, referenced wf)

select "inner".window_expr from (
    select
        output,
        machine__source,
        endtime,
        case
          when shift = 'A' and
          lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query6 (large subquery, ignored wf)

select "inner".output from (
    select
        *,
        case
          when
            shift = 'A' and
            lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Query7 (large subquery, referenced wf)

select "inner".window_expr from (
    select
        *,
        case
          when
            shift = 'A' and
            lag(shift) over (partition by machine__source order by endtime) != 'A'
          then 1
          else 0
        end as window_expr
    from timeseries
) as "inner"
where "inner".machine__source in ('A') and
"inner".endtime > '2023-09-01'::timestamptz;

Version

# select * from version();
                                                        version                                                        
-----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.11 (Debian 14.11-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit

# select * from pg_extension;
    oid    |    extname     | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition 
-----------+----------------+----------+--------------+----------------+------------+-----------+--------------
 867184318 | citus_columnar |       10 |           11 | f              | 11.3-1     |           | 

Note: the issue was not resolved when running the same queries in PostgreSQL 16.3