Since forever, we have had the logicalPlan interface that is used during planning. This served us well on the V3 planner.
In the gen4 work, we introduced the Operator interface that we use when doing the join ordering and route merging.
Once that is done, we transform the operator tree to a logical plan tree to do the grouping/order by/limit planning.
Unfortunately, this means that in some situations we cannot decide early on whether a route can be merged with another. For example, when using derived tables and UNION, if the inner queries use aggregation, we can't make a good decision during join ordering. Later, when we do the aggregation planning, it's too late to merge the routes, because by then we are working on logical plans instead.
What I want to do is:
[x] Make it possible to mix logical and physical operators in the same tree. This will allow us to transform the tree piece by piece, instead of being forced to do everything at once.
[x] Move a lot of the operator tree traversal to outside the operators. Today a lot of the code on the operators is only concerned with forwarding calls to the inputs.
[x] Add column and offset handling to the operators.
[ ] Make it possible to produce 'engine.Primitive' structs straight from the operator tree without first having to transform them to logicalPlan.
Hopefully, the refactoring will also make the code easier to work with and understand.
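
To make the first two items concrete, here is a rough sketch of what a mixed tree with external traversal could look like. This is not the actual planner code: the `Operator` interface shown here, the `Table`, `Join` and `Route` types, and the `Rewrite`, `planStep` and `Render` helpers are all simplified, hypothetical stand-ins invented for illustration. The idea is only that logical and physical operators implement one interface, and that walking the tree happens in a single helper instead of in every operator.

```go
package main

import (
	"fmt"
	"strings"
)

// Operator is a hypothetical, simplified stand-in for the planner's operator
// interface. Logical and physical operators both implement it, so they can
// live side by side in the same tree.
type Operator interface {
	Inputs() []Operator
	SetInputs(inputs []Operator)
	Describe() string
}

// Table is a logical leaf: a table that has not been planned yet.
type Table struct{ Name, Keyspace string }

func (t *Table) Inputs() []Operator   { return nil }
func (t *Table) SetInputs([]Operator) {}
func (t *Table) Describe() string     { return "Table(" + t.Name + ")" }

// Join is a logical operator with two inputs.
type Join struct{ LHS, RHS Operator }

func (j *Join) Inputs() []Operator      { return []Operator{j.LHS, j.RHS} }
func (j *Join) SetInputs(in []Operator) { j.LHS, j.RHS = in[0], in[1] }
func (j *Join) Describe() string        { return "Join" }

// Route is a physical operator: everything below it is sent to one keyspace.
type Route struct {
	Keyspace string
	Tables   []string
}

func (r *Route) Inputs() []Operator   { return nil }
func (r *Route) SetInputs([]Operator) {}
func (r *Route) Describe() string {
	return fmt.Sprintf("Route(%s: %s)", r.Keyspace, strings.Join(r.Tables, ","))
}

// Rewrite walks the tree bottom-up and applies f to every node. The traversal
// lives here, outside the operators, so they do not have to forward calls.
func Rewrite(op Operator, f func(Operator) Operator) Operator {
	inputs := op.Inputs()
	if len(inputs) > 0 {
		newInputs := make([]Operator, len(inputs))
		for i, in := range inputs {
			newInputs[i] = Rewrite(in, f)
		}
		op.SetInputs(newInputs)
	}
	return f(op)
}

// planStep transforms one piece of the tree at a time: tables become routes,
// and a join of two routes in the same keyspace is merged into a single route.
// Anything else is left as-is, so the result may mix logical and physical nodes.
func planStep(op Operator) Operator {
	switch op := op.(type) {
	case *Table:
		return &Route{Keyspace: op.Keyspace, Tables: []string{op.Name}}
	case *Join:
		l, lok := op.LHS.(*Route)
		r, rok := op.RHS.(*Route)
		if lok && rok && l.Keyspace == r.Keyspace {
			tables := append(append([]string{}, l.Tables...), r.Tables...)
			return &Route{Keyspace: l.Keyspace, Tables: tables}
		}
	}
	return op
}

// Render prints the tree; it is another traversal kept outside the operators.
func Render(op Operator) string {
	inputs := op.Inputs()
	if len(inputs) == 0 {
		return op.Describe()
	}
	parts := make([]string, len(inputs))
	for i, in := range inputs {
		parts[i] = Render(in)
	}
	return op.Describe() + "(" + strings.Join(parts, ", ") + ")"
}

func main() {
	tree := &Join{
		LHS: &Join{
			LHS: &Table{Name: "user", Keyspace: "ks1"},
			RHS: &Table{Name: "user_extra", Keyspace: "ks1"},
		},
		RHS: &Table{Name: "product", Keyspace: "ks2"},
	}
	fmt.Println(Render(Rewrite(tree, planStep)))
}
```

In this sketch the inner join is merged into one route, while the outer join stays logical on top of two physical routes. A later pass could keep transforming that mixed tree, which is the kind of piece-by-piece planning the list above is aiming for.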
Internal Refactoring