Open Dandandan opened 2 years ago
We could also look at DuckDB join reordering: https://www.youtube.com/watch?v=aNRoR0Z3SzU
I filed a duplicate issue before I saw this one, although mine is specifically for the logical plan. https://github.com/apache/arrow-datafusion/issues/3984
Other resources: Simplicity Done Right for Join Ordering - https://www.cidrdb.org/cidr2021/papers/cidr2021_paper01.pdf Using the Join-Order-Benchmark (JOB), our simple approach provides better join orderings with significantly less optimization overhead, resulting in a substantially faster response time for all 113 JOB queries compared to state-of-the-art and recent approaches.
The MonetDB Architecture Martin Kersten CWI - https://homepages.cwi.nl/~manegold/teaching/adt/lectures/lecture2.pdf
Hi, is there any progress? I can take the issue for initial development.
Is your feature request related to a problem or challenge? Please describe what you are trying to do. For complex queries, like those in TCP-H and TCP-DS it is essential to find a good Join order.
HashBuildProbeOrder
implements a rule to optimize the probe / build side of joins, but this only optimizes the joins locally (e.g. swapping the join left / right).We should implement an algorithm that (tries to) find a (close to) global optimum based on the total estimated cost of the joins.
Describe the solution you'd like Implement an efficient algorithm for optimizing I'm not sure what the SOTA is on this. Some material I found with some Googling:
https://db.in.tum.de/teaching/ws1415/queryopt/chapter3.pdf https://db.in.tum.de/~radke/papers/hugejoins.pdf https://www.cockroachlabs.com/blog/join-ordering-pt1/ https://www.cockroachlabs.com/blog/join-ordering-ii-the-ikkbz-algorithm/ http://mlwiki.org/index.php/Join_Ordering
Describe alternatives you've considered
Additional context