mabel-dev / opteryx

🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.
https://opteryx.dev
Apache License 2.0
49 stars 10 forks source link

✨ Flatten the expression tree #1364

Open joocer opened 6 months ago

joocer commented 6 months ago

The expression tree currently cascades AND and OR conditions, e.g.: a AND b AND c is ((a AND b) AND c). This means that the engine traverses a layer in the tree to execute the AND c, and there are a lot of situations where the ordering of these predicates isn't being changed to save cost.

Concurrent ANDs and ORs should be flattened to a single layer and the layer ordered to try to reduce the time to execute. This will mean AND and OR nodes will have an array of children rather than the existing LEFT/RIGHT.

This should probably be done at initial planning rather than do it differently and then undo it in the optimizer.

joocer commented 5 months ago

needed before the cost based optimizer orders predicates

joocer commented 5 months ago

Steps:

This will require changes to the evaluation engine to evaluate AND and ORs as lists of predicates, not LEFT/RIGHT combinations.

joocer commented 5 months ago

Made an initial attempt at this change, but it did not improve the efficiency of the execution - Initial thoughts on why is the creation of the masks, although haven't tested this.