Open raymalt opened 1 year ago
Hello @raymalt,
Thanks for letting us know about this issue. I was able to reproduce it on my local system with TimescaleDB 2.12.0-dev:
CREATE TABLE sensor_data(
time timestamp NOT NULL,
sensor_id integer NOT NULL,
cpu double precision,
temperature double precision);
SELECT FROM create_hypertable(relation=>'sensor_data', time_column_name=> 'time');
INSERT INTO sensor_data
SELECT time + (INTERVAL '1 minute' * random()) AS time,
sensor_id,
random() AS cpu,
random()* 100 AS temperature
FROM
generate_series('1990-01-01'::date, '2000-01-01', INTERVAL '1 week') AS g1(time),
generate_series(1, 100, 1 ) AS g2(sensor_id)
ORDER BY time;
explain (verbose, analyze) SELECT
time,
sensor_id,
temperature
FROM
sensor_data
WHERE
time > now()
AND sensor_id IN (4)
AND FALSE;
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0) (actual time=0.004..0.005 rows=0 loops=1)
Output: "time", sensor_id, temperature
One-Time Filter: false
Planning Time: 1.307 ms
Execution Time: 0.038 ms
(5 rows)
explain (verbose, analyze) SELECT
time,
sensor_id,
temperature
FROM
sensor_data
WHERE
time > now()
AND sensor_id IN (SELECT 4)
AND FALSE;
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0) (actual time=0.004..0.005 rows=0 loops=1)
Output: sensor_data."time", sensor_data.sensor_id, sensor_data.temperature
One-Time Filter: false
Planning Time: 105.857 ms
Execution Time: 0.051 ms
(5 rows)
So, query 2 needs around 104 ms more planning time.
The hypertable expansion function (ts_plan_expand_hypertable_chunks
) appears to be causing longer planning times. This function is not called for query 1 (which contains a const), but for query 2 (which includes a subquery). It is called once and takes approximately 101 ms to execute.
# Query 1
$ sudo funclatency-bpfcc /home/jan/postgresql-sandbox/bin/REL_14_9_RELEASE/lib/timescaledb-2.12.0-dev.so:ts_plan_expand_hypertable_chunks
[...]
# Query 2
$ sudo funclatency-bpfcc /home/jan/postgresql-sandbox/bin/REL_14_9_RELEASE/lib/timescaledb-2.12.0-dev.so:ts_plan_expand_hypertable_chunks
[...]
Function = b'ts_plan_expand_hypertable_chunks' [3333927]
nsecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 0 | |
16384 -> 32767 : 0 | |
32768 -> 65535 : 0 | |
65536 -> 131071 : 0 | |
131072 -> 262143 : 0 | |
262144 -> 524287 : 0 | |
524288 -> 1048575 : 0 | |
1048576 -> 2097151 : 0 | |
2097152 -> 4194303 : 0 | |
4194304 -> 8388607 : 0 | |
8388608 -> 16777215 : 0 | |
16777216 -> 33554431 : 0 | |
33554432 -> 67108863 : 0 | |
67108864 -> 134217727 : 1 |****************************************|
avg = 101887997 nsecs, total: 101887997 nsecs, count: 1
What type of bug is this?
Performance issue
What subsystems and features are affected?
Query planner
What happened?
This blog post contains solutions for combining several continuous aggregates in a visualization, where the selection is made based on the time resolution of the plotting tool (e.g. Grafana). This means that only one query should return anything, and the others should have a FALSE condition - causing them to return nothing.
It seems that combining this with a condition based on a subquery in the
WHERE
clause causes the query planning to take a very long time - a time which is probably dependent on the number of chunks in the hypertable.A stripped-down query is shown below, the first one had 0.13 ms planning time for me and the second one had 480 ms. The only difference is that the device_id is filtered from list of values vs. a SELECT statement. The hypertable has 4000+ chunks. Another hypertable with 800+ had the same issue, but the planning time went down to about 50 ms instead - still a 400x increase in planning time.
N.B.: While the below examples might not seem very useful and practical, they are actually taken (and stripped down) from a real use case, where the
UNION ALL
method is combined with a dynamic query based on a user setting.Note that the queries are intentionally written in a way so that they return zero rows (
time > now()
).The outputs from
EXPLAIN (ANALYZE, VERBOSE, BUFFERS
are included in the log output section below. I assume thatBuffers: shared hit=<very large number>
is relevant.Query with 0.16 ms planning time
Query with 577 ms planning time
TimescaleDB version affected
2.11.2
PostgreSQL version used
15.4
What operating system did you use?
Timescale Cloud
What installation method did you use?
Not applicable
What platform did you run on?
Timescale Cloud
Relevant log output and stack trace
Query 2