timescale / timescaledb

An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
https://www.timescale.com/

[Enhancement]: Query Planner for 'limit 1 order by time' queries should only plan to scan one chunk #6070

Closed bdawson-bas closed 10 months ago

bdawson-bas commented 1 year ago

What type of enhancement is this?

Performance

What subsystems and features will be improved?

Query planner

What does the enhancement do?

This enhancement concerns the behaviour of the query planner for certain LIMIT 1 queries. Currently, for a query like SELECT * FROM htable ORDER BY time DESC LIMIT 1;, the planner creates SCAN nodes for every chunk and then only executes the scan on the latest one. This is fine for execution times but leads to needless bloat in planning time, given that it's trivially true that the result of the query will be found in the latest non-empty chunk. Our enhancement request is that, for queries like this, the planner should exclude all but the latest non-empty chunk from its plan construction. We believe this should be possible given the available metadata, but it's unclear how difficult the changes would be from a structural perspective.

At this stage I would like to explain why our use-case precludes some common workarounds to this issue. We use Timescale to record time-series data from scientific instruments aboard a Research Vessel. Each hypertable corresponds to one instrument. Most of our data needs to be continuously available on the vessel in perpetuity, which is why we can get a particularly large number of chunks per hypertable. At peak load, we make thousands of LIMIT 1 queries like this every minute, to help inform operational decisions for the vessel. The data in this case needs to be available as close to real time as possible.

One workaround we’ve considered would be to keep up a set of lightweight tables containing only the latest value from each hypertable. This is undesirable for us because of the tight constraints we have on resources for our production database.
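For concreteness, one way such a side table could be maintained is with an upsert trigger; this is only a sketch, and all table, column, and function names here are hypothetical placeholders, not part of our actual schema:

```sql
-- Hypothetical sketch: a one-row-per-instrument side table kept current
-- by an AFTER INSERT trigger on the hypertable.
CREATE TABLE latest_reading (
    instrument  text PRIMARY KEY,
    time        timestamptz NOT NULL,
    value       double precision
);

CREATE OR REPLACE FUNCTION track_latest() RETURNS trigger AS $$
BEGIN
    INSERT INTO latest_reading (instrument, time, value)
    VALUES ('hypertable_name', NEW.time, NEW.value)
    ON CONFLICT (instrument)
    -- Only advance the stored row when the new insert is actually newer.
    DO UPDATE SET time = EXCLUDED.time, value = EXCLUDED.value
    WHERE latest_reading.time < EXCLUDED.time;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER keep_latest AFTER INSERT ON hypertable_name
    FOR EACH ROW EXECUTE FUNCTION track_latest();
```

The LIMIT 1 queries would then hit latest_reading instead of the hypertable, at the cost of per-row trigger overhead on every insert, which is exactly the resource pressure that makes this workaround unattractive for us.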

Another workaround would be to use time-based ‘WHERE’ clauses to restrict the scope of the search. The main problem here is that it’s non-trivial to determine an appropriate bound for many of these queries we make. Furthermore, many of them are automated through things like Grafana dashboards, and the appropriate bounds will change over time. Using ‘WHERE’ clauses in this way will add a lot of upkeep that doesn’t seem like it should be necessary for such simple queries.
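For illustration, the time-bounded form of one of these queries would look like the following; the 7-day window here is an arbitrary guess, which is precisely the maintenance problem described above:

```sql
-- Bounding the scan lets the planner exclude chunks outside the window,
-- but only if the guessed window actually contains the latest row.
SELECT *
FROM schema_name.hypertable_name
WHERE time > now() - INTERVAL '7 days'
ORDER BY time DESC
LIMIT 1;
```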

We’re not aware of any other workarounds for our situation, but we would be open to considering alternatives to this.

For the sake of completeness, I’ve included below the EXPLAIN ANALYSE output of one of these queries on a reduced hypertable. The planning times here are trivial on account of there being only 4 chunks. However, in extreme cases we’ve seen planning times up to 1.6 seconds, on a compressed table with ~500 chunks. This was exacerbated by the compression, since the planner needs to account for decompression nodes, but the more significant issue seems to be that the planner is creating nodes for every chunk.

# explain analyse select * from schema_name.hypertable_name order by time desc limit 1;
                                                                                             QUERY PLAN                                                                                              
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.14..0.29 rows=1 width=73) (actual time=0.013..0.015 rows=1 loops=1)
   ->  Custom Scan (ChunkAppend) on hypertable_name  (cost=0.14..29.65 rows=206 width=73) (actual time=0.012..0.013 rows=1 loops=1)
         Order: hypertable_name."time" DESC
         ->  Index Scan using _hyper_85_260_chunk_hypertable_name on _hyper_85_260_chunk  (cost=0.14..29.65 rows=206 width=73) (actual time=0.011..0.011 rows=1 loops=1)
         ->  Index Scan using _hyper_85_189_chunk_hypertable_name on _hyper_85_189_chunk  (cost=0.27..67.19 rows=446 width=73) (never executed)
         ->  Index Scan using _hyper_85_118_chunk_hypertable_name on _hyper_85_118_chunk  (cost=0.28..547.82 rows=5124 width=73) (never executed)
         ->  Index Scan using _hyper_85_39_chunk_hypertable_name on _hyper_85_39_chunk  (cost=0.28..592.47 rows=5122 width=73) (never executed)
 Planning Time: 0.383 ms
 Execution Time: 0.060 ms
(9 rows)

Implementation challenges

The query planner should have access to metadata about the hypertable chunks, such that it can determine which is the most recent non-empty chunk (or the first non-empty chunk, depending on the query). However, it may be challenging to sufficiently rework how the planner handles this very specific type of query.

I believe that one compromise would be a function that can efficiently return, as a record, the latest or first chunk of a hypertable. We could then query against that function rather than the whole hypertable.
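As a starting point, TimescaleDB already exposes chunk ranges through the timescaledb_information.chunks view, so a rough version of this lookup can be sketched today. Note this is only a sketch: the view reports time ranges, not row counts, so it can land on an empty chunk, and the schema/table names are placeholders:

```sql
-- Find the chunk covering the latest time range for a given hypertable.
SELECT format('%I.%I', chunk_schema, chunk_name) AS latest_chunk
FROM timescaledb_information.chunks
WHERE hypertable_schema = 'schema_name'
  AND hypertable_name   = 'hypertable_name'
ORDER BY range_end DESC
LIMIT 1;
```

A LIMIT 1 query issued directly against the chunk this returns would be planned against a single table rather than every chunk of the hypertable.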

akuzm commented 1 year ago

Generally it's hard to do this kind of thing at planning time. Planning happens once, and then the query is executed at some unspecified later point in time, possibly many times (e.g. when using prepared statements). So we can't really make strong assumptions about the contents of the tables during planning.

I would suggest you try 2.12 when it's released (in a couple of weeks), because there were actually several bugs leading to slow planning times for this kind of query over compressed tables, and they have been fixed.

If it's still not good enough, maybe for some clients you could use prepared statements with generic plans, so that they are not replanned on each execution.
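The standard PostgreSQL mechanics for this suggestion look roughly like the following (schema and statement names are illustrative); plan_cache_mode is available from PostgreSQL 12:

```sql
-- Force a generic plan so the statement is planned once and the cached
-- plan is reused on every EXECUTE, avoiding per-call planning cost.
SET plan_cache_mode = force_generic_plan;

PREPARE latest_row AS
    SELECT * FROM schema_name.hypertable_name ORDER BY time DESC LIMIT 1;

EXECUTE latest_row;
```

One caveat: PostgreSQL invalidates cached plans when the involved relations change, so on a hypertable that is continuously gaining chunks the statement may still be replanned periodically.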

bdawson-bas commented 1 year ago

Thanks, @akuzm, I was just writing a follow-up comment, but it's not all that relevant now.

I can definitely understand why this sort of optimisation is difficult in the query planner. Something similar was suggested in response to a different issue here, which is what gave us the idea that this would be possible. That said, it doesn't seem like that suggestion was implemented, so it doesn't prove anything.

We can wait for 2.12 to see whether that improves things, and we'll look into prepared statements as you've suggested as well. Will it be okay if we leave this issue open in the meantime? I'm happy for it to be closed in principle, but only if we can still easily get eyes on it should we need to reopen it.

akuzm commented 1 year ago

Sure, let's leave it open. I would love to hear if 2.12 improves things for you.

svenklemm commented 1 year ago

You can create a custom procedure that implements this behaviour: query the most recent chunk only and return the result from that chunk, iterating to older chunks if no result was found.
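A sketch of such a procedure, under the assumption that chunks can be walked newest-first via the timescaledb_information.chunks view; the function name, schema names, and the use of the hypertable's row type are all illustrative:

```sql
-- Sketch: return the latest row by probing chunks newest-first and
-- stopping at the first chunk that yields a row.
CREATE OR REPLACE FUNCTION latest_row()
RETURNS schema_name.hypertable_name AS $$
DECLARE
    chunk regclass;
    r     schema_name.hypertable_name;
BEGIN
    FOR chunk IN
        SELECT format('%I.%I', chunk_schema, chunk_name)::regclass
        FROM timescaledb_information.chunks
        WHERE hypertable_schema = 'schema_name'
          AND hypertable_name   = 'hypertable_name'
        ORDER BY range_end DESC
    LOOP
        EXECUTE format('SELECT * FROM %s ORDER BY time DESC LIMIT 1', chunk)
            INTO r;
        IF r.time IS NOT NULL THEN
            RETURN r;  -- found the latest non-empty chunk's newest row
        END IF;
    END LOOP;
    RETURN NULL;  -- hypertable is empty
END;
$$ LANGUAGE plpgsql;
```

Each probe is planned against a single chunk table, so the per-call planning cost stays constant regardless of how many chunks the hypertable has accumulated.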

github-actions[bot] commented 11 months ago

Dear Author,

This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days.

Thank you!

github-actions[bot] commented 10 months ago

Dear Author,

We are closing this issue due to lack of activity. Feel free to add a comment to this issue if you can provide more information and we will re-open it.

Thank you!