questdb / questdb

QuestDB is an open source time-series database for fast ingest and SQL queries
https://questdb.io
Apache License 2.0

Optimize ORDER BY timestamp, <other columns> #4198

Open piotrrzysko opened 9 months ago

piotrrzysko commented 9 months ago

To reproduce

While working on #4180, I noticed that:

explain select cab_type, vendor_id, pickup_datetime from trips order by pickup_datetime desc, cab_type desc limit 100;

produces the following plan:

[image: screenshot of the query plan]

This seems suboptimal. In my opinion, instead of scanning the entire table, we should leverage the fact that the designated timestamp is the first column in the ORDER BY clause and apply a backward scan combined with LimitedSizePartiallySortedLightRecordCursor.
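To make the proposal concrete, here is a minimal Python sketch of the general idea, not QuestDB's implementation. It assumes rows are already stored in ascending timestamp order (as with a designated timestamp) and are modelled as hypothetical `(pickup_datetime, cab_type, vendor_id)` tuples. The function name `top_n_desc` is made up for illustration. The scan walks backward from the newest row, sorts only the rows that share a timestamp by the secondary column, and stops once the limit is reached.

```python
from itertools import groupby
from operator import itemgetter


def top_n_desc(rows, limit):
    """Return the first `limit` rows ordered by timestamp DESC, cab_type DESC.

    `rows` must already be sorted ascending by timestamp (index 0).
    This is an illustrative sketch, not QuestDB code.
    """
    out = []
    # Backward scan: reversed() yields the newest rows first, and groupby()
    # batches consecutive rows that share the same timestamp.
    for _, group in groupby(reversed(rows), key=itemgetter(0)):
        # Only rows with an equal timestamp need sorting by the second key.
        out.extend(sorted(group, key=itemgetter(1), reverse=True))
        if len(out) >= limit:
            break  # Early exit: older rows cannot appear in the result.
    return out[:limit]
```

With `LIMIT 100`, a scan like this touches roughly the newest 100 rows plus the remainder of the last timestamp group, rather than the whole table.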

The same applies to queries with DISTINCT:

explain select DISTINCT cab_type, vendor_id, pickup_datetime from trips order by pickup_datetime desc, cab_type desc limit 100;
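The DISTINCT case can be sketched the same way. Because the timestamp is one of the selected columns, two identical rows necessarily share a timestamp, so deduplicating within each timestamp group should be enough. The Python below is an illustrative sketch under that assumption, not QuestDB's implementation. Rows are hypothetical `(pickup_datetime, cab_type, vendor_id)` tuples sorted ascending by timestamp, the name `top_n_distinct_desc` is made up, and the tie-break on `vendor_id` is added only to make the output deterministic.

```python
from itertools import groupby
from operator import itemgetter


def top_n_distinct_desc(rows, limit):
    """Return the first `limit` distinct rows, timestamp DESC, cab_type DESC.

    `rows` must already be sorted ascending by timestamp (index 0).
    This is an illustrative sketch, not QuestDB code.
    """
    out = []
    # Backward scan over timestamp-ordered rows, one timestamp group at a time.
    for _, group in groupby(reversed(rows), key=itemgetter(0)):
        # Duplicates can only occur inside a single timestamp group.
        unique = set(group)
        # Sort by cab_type DESC; vendor_id is an assumed tie-break.
        out.extend(sorted(unique, key=itemgetter(1, 2), reverse=True))
        if len(out) >= limit:
            break  # Older groups cannot contribute to the result.
    return out[:limit]
```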

The trips table comes from QuestDB's demo.

QuestDB version:

7.3.9

OS, in case of Docker specify Docker and the Host OS:

Linux

File System, in case of Docker specify Host File System:

ext4

Full Name:

Piotr Rżysko

Affiliation:

QuestDB

Have you followed Linux, MacOs kernel configuration steps to increase Maximum open files and Maximum virtual memory areas limit?

Additional context

No response