To let our system exploit the ordering property that our storage provides (merge sort!), every component needs to fully understand sort keys. To get a minimal e2e test working, this tracking issue covers the following sub-tasks:
Firstly, we should have a user interface to specify the sort mode of tables. This is covered by https://github.com/risinglightdb/risinglight/issues/625. We should also refactor the catalog to support composite primary keys.
Secondly, we should be able to specify the sort keys on the storage side and exploit the resulting order to do range scans, which is covered in https://github.com/risinglightdb/risinglight/issues/589. This is not required for a full e2e test to run, but is still useful.
Then, we can write executors that leverage the sort property. We already have merge sort join, and can add others such as sort aggregation (sort agg).
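As a rough illustration of what a sort agg executor could look like, the sketch below computes a grouped SUM in one streaming pass. It assumes the input is already sorted by the group key, so equal keys are adjacent and only one group's state is kept at a time. The function name and the `(key, value)` row shape are hypothetical, not the actual executor interface.

```rust
/// Streaming `SUM(value) GROUP BY key` over input that is sorted by key.
/// Hypothetical sketch: rows are `(key, value)` pairs.
fn sort_agg_sum(input: impl IntoIterator<Item = (i64, i64)>) -> Vec<(i64, i64)> {
    let mut output = Vec::new();
    // State of the group currently being accumulated, if any.
    let mut current: Option<(i64, i64)> = None;
    for (key, value) in input {
        match current {
            // Same key as the open group: keep accumulating.
            Some((k, sum)) if k == key => current = Some((k, sum + value)),
            // New key: emit the finished group and open a new one.
            _ => {
                if let Some(done) = current {
                    output.push(done);
                }
                current = Some((key, value));
            }
        }
    }
    // Emit the last open group.
    if let Some(done) = current {
        output.push(done);
    }
    output
}
```

Unlike a hash aggregation, this needs no hash table, but it produces wrong results if the input is not actually sorted by key, which is why the optimizer has to track that property.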
Finally, the optimizer should know in what order each executor produces data. We should add a "sorted" property on executors (https://github.com/risinglightdb/risinglight/issues/601) so that the optimizer can generate better plans. After that, we can add rules to leverage the sort property from the storage layer. Alternatively, we could do an ad-hoc rewrite for order by on top of a table scan.
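The ad-hoc rewrite could be sketched roughly as follows. The `Plan` enum, `output_order`, and `eliminate_redundant_order` are all hypothetical names invented for this example and do not reflect the real planner types. The sketch also assumes ascending order only. The rule drops an order-by node when the requested keys are a prefix of the sort keys its child already guarantees.

```rust
/// Hypothetical, heavily simplified plan tree.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Table scan; `sorted_by` lists the columns storage already orders rows by.
    TableScan { table: String, sorted_by: Vec<String> },
    /// ORDER BY on `keys` (ascending only in this sketch).
    Order { keys: Vec<String>, child: Box<Plan> },
}

/// The "sorted" property: columns the node's output is known to be ordered by.
fn output_order(plan: &Plan) -> &[String] {
    match plan {
        Plan::TableScan { sorted_by, .. } => sorted_by,
        Plan::Order { keys, .. } => keys,
    }
}

/// Removes an `Order` node whose child already produces the requested order.
fn eliminate_redundant_order(plan: Plan) -> Plan {
    match plan {
        Plan::Order { keys, child } => {
            let child = eliminate_redundant_order(*child);
            // Sorted by (a, b) implies sorted by (a), so a prefix match suffices.
            if output_order(&child).starts_with(&keys) {
                child
            } else {
                Plan::Order { keys, child: Box::new(child) }
            }
        }
        other => other,
    }
}
```

With a table sorted by `(a, b)`, an order by on `a` is removed, while an order by on `b` is kept because the storage order says nothing about `b` alone.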