Open phemmer opened 6 days ago
@phemmer the main issue is that it's not able to push the materialized subquery down to the filter level for the scans on the chunks to be effective. This might be a core PostgreSQL planner limitation.
Ok, I've managed to create a reproducer. It has to do with compression. If the first of the 2 chunks is compressed, then a seq scan is used on the second chunk. But if the first chunk is decompressed, an index scan is used on both.
@phemmer care to share the reproducer script here?
I already did. It's in the description.
What type of bug is this?
Performance issue
What subsystems and features are affected?
Query planner
What happened?
When I execute a query that uses a subquery filter & multiple chunks, the wrong (or no) index is used, causing a large performance degradation. If I don't use a subquery, or if I only query a single chunk at a time, it works fine.
Here's an example showing the issue: https://explain.dalibo.com/plan/a5bh7372bgcg0ee8#raw 2_chunks_subquery.txt
We can see that on `_timescaledb_internal._hyper_2427_264588_chunk`, it's doing a seq scan without using an index, which takes 10 seconds and returns 27,604,988 rows, causing a ton of work for the higher operations. I have an index on both the `tag_id` and `time` columns, which would result in a much faster query. This is why I'm using a materialized CTE here, as I was trying to strongly encourage postgres to use the index containing the `tag_id` column. No matter whether I use a normal subquery, a join, etc., none result in using the correct index.
If I manually take that subquery (the CTE), evaluate it, and copy/paste the results into the `where` clause, it goes much faster: https://explain.dalibo.com/plan/54h8h7b5ee5b8gd6#raw 2_chunks_copypaste.txt
We can see now that the correct index was used (`_hyper_2427_264588_chunk_haproxy_server_tag_id_time_idx`), which returned only 2,400 rows and completed in 3.2ms.
Both of the above queries spanned 2 chunks. If I reduce to just the second (chronologically) of the two chunks (the one that resulted in the performance difference in the above 2 queries), though still using the subquery, the plan again uses the correct index: https://explain.dalibo.com/plan/556dh3acg5f3173g#raw 1_chunk_subquery.txt
And just for comparison, when using copy/paste instead of the subquery, it has a similar plan & performance: https://explain.dalibo.com/plan/825h52f73d389f9h#raw 1_chunk_copypaste.txt
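For readers without access to the linked plans, the two query shapes being compared look roughly like the sketch below. This is not the original query: the hypertable name `haproxy_server` is inferred from the index name, while the tag table `haproxy_server_tag`, its `host` column, the literal tag IDs, and the time range are all hypothetical placeholders.
```sql
-- Assumed schema (hypothetical): haproxy_server(time, tag_id, ...) is the
-- hypertable; haproxy_server_tag(tag_id, host, ...) is a tag lookup table.

-- Shape 1: filter through a materialized CTE.
-- In the report, this is the form that falls back to a seq scan on one chunk.
EXPLAIN (ANALYZE, BUFFERS)
WITH tags AS MATERIALIZED (
    SELECT tag_id
    FROM haproxy_server_tag
    WHERE host = 'example-host'              -- placeholder predicate
)
SELECT count(*)
FROM haproxy_server
WHERE tag_id IN (SELECT tag_id FROM tags)
  AND time >= '2024-03-01'                   -- placeholder range covering 2 chunks
  AND time <  '2024-03-15';

-- Shape 2: the same filter with the CTE result pasted in as literals.
-- In the report, this is the form that uses the (tag_id, time) index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM haproxy_server
WHERE tag_id IN (101, 102, 103)              -- placeholder IDs copied from the CTE
  AND time >= '2024-03-01'
  AND time <  '2024-03-15';
```
Comparing the scan node chosen for each chunk in the two plans (seq scan versus index scan on the `tag_id`/`time` index) is what distinguishes the slow case from the fast one.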
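So basically:
- 2 chunks, subquery: seq scan on the second chunk (slow)
- 2 chunks, copy/pasted values: correct index used (fast)
- 1 chunk, subquery: correct index used (fast)
- 1 chunk, copy/pasted values: correct index used (fast)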
TimescaleDB version affected
2.14.2
PostgreSQL version used
16.2
What operating system did you use?
Debian 16
What installation method did you use?
Deb/Apt
What platform did you run on?
On prem/Self-hosted
Relevant log output and stack trace
No response
How can we reproduce the bug?
The above will perform a seq scan on the second chunk. But you can then decompress the chunk and watch it perform an index scan on both chunks.
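The decompress-and-retest step could be sketched as follows. This is an illustration under assumptions, not part of the original reproducer: the hypertable name `haproxy_server` is inferred from the index name in the description, and the query assumes the oldest compressed chunk is the one to decompress.
```sql
-- List the hypertable's chunks with their time ranges and compression status.
SELECT chunk_schema, chunk_name, range_start, range_end, is_compressed
FROM timescaledb_information.chunks
WHERE hypertable_name = 'haproxy_server'     -- assumed hypertable name
ORDER BY range_start;

-- Decompress the oldest compressed chunk (assumed to be the first of the two).
-- The subquery with LIMIT picks exactly one chunk before the function is called.
SELECT decompress_chunk(chunk)
FROM (
    SELECT format('%I.%I', chunk_schema, chunk_name)::regclass AS chunk
    FROM timescaledb_information.chunks
    WHERE hypertable_name = 'haproxy_server'
      AND is_compressed
    ORDER BY range_start
    LIMIT 1
) AS first_compressed;
```
After this, re-running the subquery form of the query under `EXPLAIN (ANALYZE)` should show whether both chunks now get an index scan, as described above.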