timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

[Bug]: Bad plan produced when using CAGG name but good plan produced when using hypertable name! #6286

Open jvanns opened 1 year ago

jvanns commented 1 year ago

What type of bug is this?

Performance issue

What subsystems and features are affected?

Continuous aggregate, Query planner

What happened?

Well, this is an odd one for sure, and it's plagued me for weeks until I managed to narrow down where the problem lies. Although I cannot easily craft a full MRP for you, I can provide a straightforward query and the resulting plans. The crux of the issue appears to be some sort of disconnect between the CAGG name and the hypertable name. I have 2x CAGGs - let's call them A and B. A sources from the main 'raw' hypertable, B sources from A, thus creating a hierarchical CAGG setup. CAGG A buckets at 15m intervals, B at 30m intervals. The whole DB schema is created up-front via a .sql script and the CAGGs are explicitly created thus:

- materialized only
- NO DATA
- chunk interval set to 12h on both
- compression enabled
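
For illustration only, a minimal sketch of what such a hierarchy could look like (table, view and column names are placeholders, not the actual schema):

-- Hypothetical raw hypertable "metrics"; CAGG A buckets at 15m, CAGG B sources from A at 30m
CREATE MATERIALIZED VIEW metrics_15m
WITH (timescaledb.continuous, timescaledb.materialized_only = true) AS
SELECT time_bucket(INTERVAL '15 minutes', ts) AS ts, SUM(num) AS num
FROM metrics
GROUP BY 1
WITH NO DATA;

CREATE MATERIALIZED VIEW metrics_30m
WITH (timescaledb.continuous, timescaledb.materialized_only = true) AS
SELECT time_bucket(INTERVAL '30 minutes', ts) AS ts, SUM(num) AS num
FROM metrics_15m
GROUP BY 1
WITH NO DATA;

-- 12h chunk interval on both materialization hypertables
SELECT set_chunk_time_interval(
    format('%I.%I', materialization_hypertable_schema, materialization_hypertable_name)::regclass,
    INTERVAL '12 hours')
FROM timescaledb_information.continuous_aggregates
WHERE view_name IN ('metrics_15m', 'metrics_30m');

-- compression enabled on both
ALTER MATERIALIZED VIEW metrics_15m SET (timescaledb.compress = true);
ALTER MATERIALIZED VIEW metrics_30m SET (timescaledb.compress = true);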

Here are the links to the queries and their respective plans:

https://explain.dalibo.com/plan/eag931806e5b6h82#plan
https://explain.dalibo.com/plan/0c8fbe46e424142e#plan
https://explain.dalibo.com/plan/54ec723bca45e7f7#plan
https://explain.dalibo.com/plan/a21542ffb46acf57#plan

Basically, using the underlying materialized hypertable name produces the expected good plan (executing in milliseconds), whereas using the CAGG name we get a very odd 'bad' plan (executing in minutes). What's more, this appears to affect other queries - the one provided is a simple way of reproducing it.
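
Purely as a hypothetical illustration of the comparison being described (the CAGG name and the internal hypertable number below are placeholders; the real queries are in the linked plans):

-- Same query, two targets. Against the CAGG view name: the slow 'bad' plan.
EXPLAIN (ANALYZE) SELECT max(ts) FROM metrics_15m;

-- Against the underlying materialization hypertable directly: the fast 'good' plan.
EXPLAIN (ANALYZE) SELECT max(ts) FROM _timescaledb_internal._materialized_hypertable_2;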

TimescaleDB version affected

2.11

PostgreSQL version used

15.3

What operating system did you use?

CentOS 7.9

What installation method did you use?

Docker

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

https://explain.dalibo.com/plan/eag931806e5b6h82#plan
https://explain.dalibo.com/plan/0c8fbe46e424142e#plan
https://explain.dalibo.com/plan/54ec723bca45e7f7#plan
https://explain.dalibo.com/plan/a21542ffb46acf57#plan

How can we reproduce the bug?

Hopefully enough information is given above to start with. I'll try to provide an isolated MRP later.
jvanns commented 1 year ago

This could be related to https://github.com/timescale/timescaledb/issues/2841 somehow?

zilder commented 1 year ago

Hi @jvanns,

Could you please share the definition of your CAGG's materialized view (and, if possible, other details such as the schema definition, the CAGG hierarchy definition, etc.)?

I've been trying to reproduce your use case. I've noticed that the slower one pulls all the rows first, then sorts them and runs the aggregation, while the fast one just picks the first row from the ChunkAppend node (ordered by ts DESC) and calls it a day. So something prevents the first one from using an ordered ChunkAppend.

I think I was able to reproduce the issue (see the example below). Please let me know if it is similar to your situation.

-- create and populate a hypertable
create table test (ts timestamp not null, device_id int, val float);
select create_hypertable('test', 'ts', chunk_time_interval=>'1 day'::interval);

insert into test
select
    generate_series('2023-11-01', '2023-11-15', '1 minute'::interval) as ts,
    (random() * 10)::int as device_id,
    random() * 10 as val;

-- create an explicitly ORDERED cagg
create materialized view test_15m
with (
    timescaledb.continuous,
    timescaledb.materialized_only = true
) as
select
    time_bucket('15 minute', "ts") AS ts,
    min(val), max(val), avg(val)
from test
group by 1
order by 1;

-- a cagg without ORDER BY
create materialized view test_15m_unordered
with (
    timescaledb.continuous,
    timescaledb.materialized_only = true
) as
select
    time_bucket('15 minute', "ts") AS ts,
    min(val), max(val), avg(val)
from test
group by 1;

And the query plans. The first one uses the matview with explicit ORDER BY. It looks similar to your slow plan example (i.e. also pulling all the rows first, then aggregating):

# explain (analyze, costs off) select max(ts) from test_15m;
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (actual time=1.358..1.362 rows=1 loops=1)
   ->  Result (actual time=0.042..1.157 rows=1345 loops=1)
         ->  Custom Scan (ChunkAppend) on _materialized_hypertable_2 (actual time=0.035..0.877 rows=1345 loops=1)
               Order: _materialized_hypertable_2.ts
               ->  Index Only Scan Backward using _hyper_2_16_chunk__materialized_hypertable_2_ts_idx on _hyper_2_16_chunk (actual time=0.033..0.417 rows=768 loops=1)
                     Heap Fetches: 768
               ->  Index Only Scan Backward using _hyper_2_17_chunk__materialized_hypertable_2_ts_idx on _hyper_2_17_chunk (actual time=0.015..0.294 rows=577 loops=1)
                     Heap Fetches: 577
 Planning Time: 1.410 ms
 Execution Time: 1.439 ms
(10 rows)

The second one just picks the first row from the top of the ChunkAppend (ordered by ts DESC) and finishes:

# explain (analyze, costs off) select max(ts) from test_15m_unordered;
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result (actual time=0.054..0.056 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit (actual time=0.047..0.049 rows=1 loops=1)
           ->  Custom Scan (ChunkAppend) on _materialized_hypertable_3 (actual time=0.045..0.046 rows=1 loops=1)
                 Order: _materialized_hypertable_3.ts DESC
                 ->  Index Only Scan using _hyper_3_19_chunk__materialized_hypertable_3_ts_idx on _hyper_3_19_chunk (actual time=0.044..0.044 rows=1 loops=1)
                       Index Cond: (ts IS NOT NULL)
                       Heap Fetches: 1
                 ->  Index Only Scan using _hyper_3_18_chunk__materialized_hypertable_3_ts_idx on _hyper_3_18_chunk (never executed)
                       Index Cond: (ts IS NOT NULL)
                       Heap Fetches: 0
 Planning Time: 1.198 ms
 Execution Time: 0.111 ms
(13 rows)
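
As an aside, one way to see whether an existing CAGG carries an ORDER BY in its stored definition (a sketch; substitute your own view names) is to query the informational view:

-- view_definition shows the query the CAGG was created with, including any trailing ORDER BY
SELECT view_name, view_definition
FROM timescaledb_information.continuous_aggregates
WHERE view_name IN ('test_15m', 'test_15m_unordered');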
jvanns commented 12 months ago

Sure, with some obfuscation to hide any potentially sensitive schema naming etc., our CAGG definition looks like this:

CREATE MATERIALIZED VIEW foobar_metrics_15m
WITH (timescaledb.continuous) AS
SELECT
   -- Try to match order and names of source metrics table first
   time_bucket(INTERVAL '30 minutes', ts) AS ts,
   dev_id,
   loc_id,
   ctxt_id,
   SUM(num) AS num,
   SUM(size) AS size,
   (result = 0)::BOOLEAN AS success,
   ((flags & 2::SMALLINT) != 0)::BOOLEAN AS special,
   PERCENTILE_CONT(0.50) WITHIN GROUP(ORDER BY duration) AS duration,
   operation_class(code) AS function_class
FROM metrics
GROUP BY
      -- All bar the aggregates, obviously...
      1,2,3,4,7,8,10
   ORDER BY
      -- Groupings
      1,2,3,4,10,
      -- Aggregates
      5 DESC, -- Num
      6 DESC, -- Size
      7 DESC, -- Success?
      8 DESC  -- Special?
WITH NO DATA;
ALTER MATERIALIZED VIEW foobar_metrics_15m SET (
   timescaledb.materialized_only = true
);
SELECT set_chunk_time_interval(
   materialization_hypertable_name::TEXT,
   INTERVAL '6 hours'
)
FROM timescaledb_information.continuous_aggregates
WHERE view_name = 'foobar_metrics_15m';

So you think it's more to do with the explicit ordering in the definition? Note that the peculiarity I reported seemed to be that if you query via the CAGG view name you get the 'bad' plan, yet the exact same query against the underlying hypertable name, e.g. _materialized_hypertable_2 in your example above, produces the 'good' plan and executes in a far more acceptable time!

Thanks for looking into it and reporting back. Glad to see there is at least a reproducible discrepancy of sorts.

jvanns commented 7 months ago

Just wanted to flag that this bug is still present and renders our CAGGs largely useless because they're impractical (result sets are never returned in a reasonable time). I've managed to confirm that the oddity does appear to arise from the use of ORDERing in the CAGG definition, as @zilder suspected and reproduced. I've attached two more 'bad' and 'better' plans that demonstrate the plan divergence simply by swapping the CAGG name for the materialized hypertable name. Note that in the bad plan, for some reason, the ORDERing is applied to every column despite only ever querying on the timestamp (the 1st column in the ORDERing). The better plan is definitely more acceptable and expected.

better-plan.txt bad-plan.txt

jvanns commented 7 months ago

I can confirm that this is indeed due to the use of an ORDER BY in the materialized view query definition. With it, queries always require a sort key using all of those chosen fields. If it is removed, the query is planned and executed more how you'd expect.
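
If dropping the ORDER BY is acceptable, a possible workaround, sketched here against the obfuscated definition above (a CAGG cannot be altered in place, so it has to be dropped, recreated and refreshed), would look like:

DROP MATERIALIZED VIEW foobar_metrics_15m;

CREATE MATERIALIZED VIEW foobar_metrics_15m
WITH (timescaledb.continuous, timescaledb.materialized_only = true) AS
SELECT
   time_bucket(INTERVAL '30 minutes', ts) AS ts,
   dev_id,
   loc_id,
   ctxt_id,
   SUM(num) AS num,
   SUM(size) AS size,
   (result = 0)::BOOLEAN AS success,
   ((flags & 2::SMALLINT) != 0)::BOOLEAN AS special,
   PERCENTILE_CONT(0.50) WITHIN GROUP(ORDER BY duration) AS duration,
   operation_class(code) AS function_class
FROM metrics
GROUP BY 1,2,3,4,7,8,10  -- same groupings as before, just no trailing ORDER BY
WITH NO DATA;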

nckswt commented 3 months ago

Just noting that we've had to disable realtime aggregates, seemingly because of this issue :cry:
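
For reference, turning off realtime aggregation on a CAGG is done with the materialized_only option, as in the definition posted earlier:

ALTER MATERIALIZED VIEW foobar_metrics_15m SET (timescaledb.materialized_only = true);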