[Bug]: query planner not using index with subquery & multiple chunks

phemmer commented 6 days ago

What type of bug is this?

Performance issue

What subsystems and features are affected?

Query planner

What happened?

When I execute a query that uses a subquery filter & multiple chunks, the wrong (or no) index is used, causing a large performance degradation. If I don't use a subquery, or if I only query a single chunk at a time, it works fine.

Here's an example showing the issue: https://explain.dalibo.com/plan/a5bh7372bgcg0ee8#raw 2_chunks_subquery.txt

We can see that on _timescaledb_internal._hyper_2427_264588_chunk, it's doing a seq scan without using an index, takes 10 seconds, and returns 27,604,988 rows, causing a ton of work for the higher operations. I have an index, which is on both the tag_id and time columns, which would result in a much faster query. This is why I'm using a materialized CTE here, as I was trying to strongly encourage postgres to use the index containing the tag_id column. No matter if I use a normal subquery, a join, etc, none result in using the correct index.

If I manually take that subquery (the CTE), evaluate it, and copy/paste the results into the where clause, it goes much faster: https://explain.dalibo.com/plan/54h8h7b5ee5b8gd6#raw 2_chunks_copypaste.txt

We can see now that the correct index was used (_hyper_2427_264588_chunk_haproxy_server_tag_id_time_idx), which returned only 2,400 rows, and completed in 3.2ms.

Both the above queries spanned 2 chunks. If I reduce to just the second (chronologically) of the two chunks (the one that resulted in the performance difference in the above 2 queries), though still using the subquery, the plan again uses the correct index: https://explain.dalibo.com/plan/556dh3acg5f3173g#raw 1_chunk_subquery.txt

And just for comparison, when using copy/paste instead of subquery, it has similar plan & performance: https://explain.dalibo.com/plan/825h52f73d389f9h#raw 1_chunk_copypaste.txt

So basically:

2 chunks (one compressed, one uncompressed) with subquery: bad
2 chunks (one compressed, one uncompressed) with copy/paste: good
1 chunk (uncompressed) with subquery: good
1 chunk (uncompressed) with copy/paste: good

TimescaleDB version affected

2.14.2

PostgreSQL version used

16.2

What operating system did you use?

Debian 16

What installation method did you use?

Deb/Apt

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

create table test (time timestamptz, tag_id bigint, a int, b int, c int, d int, e int, f int, g int, h int, i int, j int, k int, l int, m int, n int, o int, p int, q int, r int, s int, t int, u int, v int, w int, x int, y int, z int);
select create_hypertable('test', 'time', chunk_time_interval => interval '4h');
alter table test set (timescaledb.compress);
create index on test (tag_id, time);
insert into test (select '2024-06-25T00:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T01:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T02:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T03:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T04:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T05:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T06:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T07:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
insert into test (select '2024-06-25T08:00:00Z'::timestamptz + make_interval(mins => i/10000), i%10000, i, i+1, i+2, i+3, i+4, i+5, i+6, i+7, i+8, i+9, i+10, i+11, i+12, i+13, i+14, i+15, i+16, i+17, i+18, i+19, i+20, i+21, i+22, i+23, i+24, i+25 from (select generate_series(0,60*10000-1) as i) a);
select compress_chunk(chunk_schema || '.' || chunk_name) from timescaledb_information.chunks where hypertable_name = 'test' order by range_start limit 1;

explain (analyze, verbose, costs, buffers)
with tag_ids as materialized (select distinct generate_series(0,29) as tag_id)
select tag_id,a from test
where time between '2024-06-25T00:00:00Z' and '2024-06-25T07:59:59Z'
and tag_id in (select distinct tag_id from tag_ids)
--and tag_id in (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,34,25,26,27,28,29)
;

The above will perform a seq scan on the second chunk. But you can then decompress the chunk and watch it perform an index scan on both chunks.

nikkhils commented 6 days ago

@phemmer the main issue is that it's not able to push the materialized subquery down to the filter level for the scans on the chunks to be effective. This might be a core PostgreSQL planner limitation.

phemmer commented 5 days ago

Ok, I've managed to create a reproducer. It has to do with compression. If the first of the 2 chunks is compressed, then a seq scan is used on the second chunk. But if the first chunk is decompressed, an index scan is used on both.

nikkhils commented 4 days ago

@phemmer care to share the reproducer script here?

phemmer commented 4 days ago

I already did. it's in the description

timescale / timescaledb