Open pohmelie opened 3 weeks ago
@pohmelie is your timescaledb version 2.5.1
? That seems very old or did you want to mention 2.15.1
?
@nikkhils yes, my bad. It is 2.15.1.
@pohmelie the two queries
explain analyze select * from table_data_1m
where timestamp >= '2024-01-01'
order by timestamp asc limit 5;
AND
explain analyze select * from _timescaledb_internal._materialized_hypertable_2
where timestamp >= '2024-01-01'
order by timestamp asc limit 5;
are NOT the same. In the first query due to real time aggregation, the query that runs on the _timescaledb_internal._materialized_hypertable_2
table has a plan of:
-> Sort (cost=1693.69..1704.66 rows=4386 width=31) (actual time=259.512..259.514 rows=5 loops=1)
Sort Key: _hyper_2_2_chunk."timestamp"
Sort Method: top-N heapsort Memory: 25kB
-> Append (cost=0.00..1620.84 rows=4386 width=31) (actual time=0.016..179.104 rows=538387 loops=1)
-> Seq Scan on _hyper_2_2_chunk (cost=0.00..55.41 rows=2560 width=19) (actual time=0.015..0.448 rows=2561 loops=1)
Filter: (("timestamp" < '2024-06-14 21:35:00+04'::timestamp with time zone) AND ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone))
-> Bitmap Heap Scan on _hyper_2_5_chunk (cost=24.64..1543.50 rows=1826 width=48) (actual time=17.811..111.441 rows=535826 loops=1)
Recheck Cond: (("timestamp" < '2024-06-14 21:35:00+04'::timestamp with time zone) AND ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone))
Heap Blocks: exact=3413
-> Bitmap Index Scan on _hyper_2_5_chunk__materialized_hypertable_2_timestamp_idx (cost=0.00..24.18 rows=1826 width=0) (actual time=17.148..17.148 rows=535826 loops=1)
Index Cond: (("timestamp" < '2024-06-14 21:35:00+04'::timestamp with time zone) AND ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone))
The expected rows and actual number of rows are out of sync.
(cost=24.64..1543.50 rows=1826 width=48) (actual time=17.811..111.441 rows=535826 loops=1)
Please run ANALYZE _timescaledb_internal._materialized_hypertable_2
to get the statistics up-to-date and then retry the query.
Actually it is explain analyze
in original script, so it will not change anything. Anyway.
create table table_data (
prefix text not null,
"timestamp" timestamp with time zone not null,
value float,
constraint unique_table_data_timestamp_prefix unique (timestamp, prefix)
);
select create_hypertable('table_data', by_range('timestamp'));
create materialized view table_data_1m
with (timescaledb.continuous, timescaledb.materialized_only = false) as
select
prefix,
time_bucket('1 minute', timestamp) as timestamp,
last(value, "timestamp") as value
from table_data
group by time_bucket('1 minute', timestamp), prefix;
create index table_data_1m_prefix_timestamp
on table_data_1m (prefix, timestamp);
call refresh_continuous_aggregate('table_data_1m', null, null);
insert into table_data (prefix, timestamp, value)
select
left(md5(random()::text), 2),
'2024-01-01'::timestamptz + make_interval(secs => i / 1000.0),
random()
from generate_series(1, 10 * 60 * 1000) s(i);
call refresh_continuous_aggregate('table_data_1m', null, null);
insert into table_data (prefix, timestamp, value)
select
left(md5(random()::text), 2),
now() + make_interval(secs => i),
random()
from generate_series(1, 10 * 60 * 1000) s(i);
call refresh_continuous_aggregate('table_data_1m', '2024-06-01', null);
explain analyze select * from table_data_1m
where timestamp >= '2024-01-01'
order by timestamp asc limit 5;
ANALYZE _timescaledb_internal._materialized_hypertable_2;
explain analyze select * from _timescaledb_internal._materialized_hypertable_2
where timestamp >= '2024-01-01'
order by timestamp asc limit 5;
Limit (cost=1694.08..1694.14 rows=5 width=31) (actual time=239.235..239.240 rows=5 loops=1)
-> Merge Append (cost=1694.08..1748.93 rows=4386 width=31) (actual time=239.233..239.236 rows=5 loops=1)
Sort Key: _hyper_2_2_chunk."timestamp"
-> Sort (cost=1692.63..1703.59 rows=4385 width=31) (actual time=239.179..239.181 rows=5 loops=1)
Sort Key: _hyper_2_2_chunk."timestamp"
Sort Method: top-N heapsort Memory: 25kB
-> Append (cost=0.00..1619.79 rows=4385 width=31) (actual time=0.008..163.374 rows=538155 loops=1)
-> Seq Scan on _hyper_2_2_chunk (cost=0.00..55.41 rows=2560 width=19) (actual time=0.007..0.556 rows=2561 loops=1)
Filter: (("timestamp" < '2024-06-28 01:26:00+04'::timestamp with time zone) AND ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone))
-> Bitmap Heap Scan on _hyper_2_5_chunk (cost=24.63..1542.45 rows=1825 width=48) (actual time=18.866..102.341 rows=535594 loops=1)
Recheck Cond: (("timestamp" < '2024-06-28 01:26:00+04'::timestamp with time zone) AND ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone))
Heap Blocks: exact=3412
-> Bitmap Index Scan on _hyper_2_5_chunk__materialized_hypertable_2_timestamp_idx (cost=0.00..24.17 rows=1825 width=0) (actual time=18.251..18.251 rows=535594 loops=1)
Index Cond: (("timestamp" < '2024-06-28 01:26:00+04'::timestamp with time zone) AND ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone))
-> GroupAggregate (cost=1.44..1.47 rows=1 width=19) (actual time=0.052..0.053 rows=0 loops=1)
Group Key: (time_bucket('00:01:00'::interval, _hyper_1_4_chunk."timestamp")), _hyper_1_4_chunk.prefix
-> Sort (cost=1.44..1.44 rows=1 width=27) (actual time=0.051..0.051 rows=0 loops=1)
Sort Key: (time_bucket('00:01:00'::interval, _hyper_1_4_chunk."timestamp")), _hyper_1_4_chunk.prefix
Sort Method: quicksort Memory: 25kB
-> Result (cost=0.29..1.43 rows=1 width=27) (actual time=0.022..0.022 rows=0 loops=1)
-> Index Scan using _hyper_1_4_chunk_table_data_timestamp_idx on _hyper_1_4_chunk (cost=0.29..1.42 rows=1 width=19) (actual time=0.021..0.021 rows=0 loops=1)
Index Cond: (("timestamp" >= '2024-06-28 01:26:00+04'::timestamp with time zone) AND ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone))
Filter: (time_bucket('00:01:00'::interval, "timestamp") >= '2024-01-01 00:00:00+04'::timestamp with time zone)
Planning Time: 2.398 ms
Execution Time: 239.378 ms
Limit (cost=0.28..0.41 rows=5 width=19) (actual time=0.017..0.020 rows=5 loops=1)
-> Custom Scan (ChunkAppend) on _materialized_hypertable_2 (cost=0.28..68.80 rows=2561 width=19) (actual time=0.016..0.019 rows=5 loops=1)
Order: _materialized_hypertable_2."timestamp"
-> Index Scan Backward using _hyper_2_2_chunk__materialized_hypertable_2_timestamp_idx on _hyper_2_2_chunk (cost=0.28..68.80 rows=2561 width=19) (actual time=0.016..0.017 rows=5 loops=1)
Index Cond: ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone)
-> Index Scan Backward using _hyper_2_5_chunk__materialized_hypertable_2_timestamp_idx on _hyper_2_5_chunk (cost=0.42..13709.42 rows=535594 width=19) (never executed)
Index Cond: ("timestamp" >= '2024-01-01 00:00:00+04'::timestamp with time zone)
Planning Time: 0.966 ms
Execution Time: 0.048 ms
What type of bug is this?
Performance issue
What subsystems and features are affected?
Query executor, Query planner
What happened?
We faced problem that realtime aggregation select request is much heavier than request to the internal materialized view. In terms of execution it is more than 1000 times.
I read some other issues (#6976, #6321), but it looks like they solved and this one is irrelevant.
TimescaleDB version affected
2.15.1
PostgreSQL version used
16.3
What operating system did you use?
ubuntu 22.04
What installation method did you use?
Docker
What platform did you run on?
Not applicable
Relevant log output and stack trace
No response
How can we reproduce the bug?