cmu-db / optd

CMU-DB's Cascades optimizer framework
https://cmu-db.github.io/optd/
MIT License
383 stars, 22 forks

fix(df-repr): depjoin elimination rule #212

Closed skyzh closed 4 weeks ago

skyzh commented 4 weeks ago

This is not the most efficient way to implement this, but at least it works. I reimplemented the eliminate depjoin rule so that it removes the depjoin and inserts a regular join when there are no correlated columns. It does this by recursively inspecting whether an ExternCol exists anywhere on the right side of the plan tree. This also means that the eliminate depjoin rule must be used as a heuristic rule.
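For illustration, the recursive check described above could look roughly like the sketch below. It is not the code from this PR: the `Plan` and `Expr` enums and the function names are hypothetical, simplified stand-ins for optd's plan representation. The check is deliberately conservative and treats any ExternCol found on the right side as correlated.

```rust
// Hypothetical, simplified plan types for illustration; optd's real
// representation is different.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    ColumnRef(usize),
    ExternCol(usize), // reference to a column of an outer query
    Const(i64),
    BinOp(Box<Expr>, Box<Expr>),
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { cond: Expr, child: Box<Plan> },
    Join { left: Box<Plan>, right: Box<Plan>, cond: Expr },
    DepJoin { left: Box<Plan>, right: Box<Plan>, cond: Expr },
}

// Returns true if the expression contains an ExternCol anywhere.
fn expr_has_extern_col(e: &Expr) -> bool {
    match e {
        Expr::ExternCol(_) => true,
        Expr::ColumnRef(_) | Expr::Const(_) => false,
        Expr::BinOp(l, r) => expr_has_extern_col(l) || expr_has_extern_col(r),
    }
}

// Recursively walks the plan tree looking for any ExternCol.
fn plan_has_extern_col(p: &Plan) -> bool {
    match p {
        Plan::Scan(_) => false,
        Plan::Filter { cond, child } => {
            expr_has_extern_col(cond) || plan_has_extern_col(child)
        }
        Plan::Join { left, right, cond } | Plan::DepJoin { left, right, cond } => {
            expr_has_extern_col(cond)
                || plan_has_extern_col(left)
                || plan_has_extern_col(right)
        }
    }
}

// Replaces a top-level DepJoin with a Join when its right side has no
// ExternCol; every other plan is returned unchanged.
fn eliminate_depjoin(p: Plan) -> Plan {
    match p {
        Plan::DepJoin { left, right, cond } if !plan_has_extern_col(&right) => {
            Plan::Join { left, right, cond }
        }
        other => other,
    }
}
```

Because the check looks at the entire right subtree and not only the matched node, a sketch like this fits a one-shot rewrite pass better than a rule that matches a fixed pattern.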

Added a new subquery regression test and enabled TPC-H 1-5 to use the optd logical optimizer.

skyzh commented 4 weeks ago

If we want to implement it as a Cascades rule, the most obvious way I can think of is to add a HasCorrelatedColumn logical property. We would probably also need to record which correlated columns are present in this property, so that it can handle nested depjoins correctly.
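One way such a property might be derived is sketched below. Everything here is an assumption made for illustration, not optd's API: the `DepJoinId` binder tag on each `ExternCol`, the `CorrelatedCols` set, and the `derive_correlated_cols` and `can_eliminate` functions are all hypothetical. The idea is that each node carries the set of extern columns that are still free below it, and a depjoin removes the columns it binds itself, so nested depjoins do not confuse each other.

```rust
use std::collections::BTreeSet;

// Hypothetical identifier for the depjoin that binds an extern column.
type DepJoinId = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct ExternCol {
    binder: DepJoinId, // which depjoin supplies this column (assumption)
    index: usize,      // column index on that depjoin's left side
}

// The sketched "HasCorrelatedColumn" property: the set of extern columns
// that are referenced in a subtree but not bound inside it.
type CorrelatedCols = BTreeSet<ExternCol>;

// Hypothetical, simplified plan type; a Filter just lists the extern
// columns its predicate references.
enum Plan {
    Scan,
    Filter { extern_cols: Vec<ExternCol>, child: Box<Plan> },
    DepJoin { id: DepJoinId, left: Box<Plan>, right: Box<Plan> },
}

// Derives the property bottom-up. A real Cascades implementation would
// compute it per group from the children's properties instead of recursing.
fn derive_correlated_cols(p: &Plan) -> CorrelatedCols {
    match p {
        Plan::Scan => CorrelatedCols::new(),
        Plan::Filter { extern_cols, child } => {
            let mut cols = derive_correlated_cols(child);
            cols.extend(extern_cols.iter().copied());
            cols
        }
        Plan::DepJoin { id, left, right } => {
            let mut cols = derive_correlated_cols(left);
            // Columns bound by this depjoin are no longer free above it.
            let right_cols = derive_correlated_cols(right);
            cols.extend(right_cols.into_iter().filter(|c| c.binder != *id));
            cols
        }
    }
}

// A depjoin can become a plain join when its right side references none
// of the columns that this depjoin binds.
fn can_eliminate(id: DepJoinId, right: &Plan) -> bool {
    derive_correlated_cols(right).iter().all(|c| c.binder != id)
}
```

With the columns recorded in the property, an inner depjoin whose right side only references the outer query is still eliminable, while the outer depjoin correctly stays in place.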

jurplel commented 4 weeks ago

great work!