Open jon-chuang opened 2 years ago
My idea is similar to this. But there are some details:
equal join associated with 2 tables > other join associated with 2 tables > cross product = join associated with multiple tables
. We need to consider the join dependency, e.g., A join B join{a.x+b.x = c.x} C MultipleJoin
for the situation join -> project -> join
. The project should be absorbed to MultipleJoin for the situation join -> project -> join.
Agreed on this. But I decided to make this a low priority after the rest has been done.
equal join associated with 2 tables > other join associated with 2 tables > cross product = join associated with multiple tables
Currently, the plan for the RFC is more fine grained:
But it omits the last > sign. We could definitely detect (non-equijoin associated with 2 tables) and add this into the join priority. I'm not sure we can take advantage of this. For now, it will still result in an error for streaming. Any ideas @fuyufjh ?
For me, I did not consider 3 columns as a join condition (A join B join{a.x+b.x = c.x} C). This is considered a filter on a cross join. Technically, there is some optimization that could turn this into an eq join but I don't think we implement it yet.
For me, I did not consider 3 columns as a join condition (A join B join{a.x+b.x = c.x} C). This is considered a filter on a cross join.
You mean extract the conditions associated with multiple tables and create a filter post after the MultipleJoin
?
You mean extract the conditions associated with multiple tables and create a filter post after the MultipleJoin?
Yes. It will be pushed down into one of the joins after the multijoin has been reordered. However, we don't try to make the joins adjacent during the join ordering. I'm guessing this is equivalent to calcite.
You mean extract the conditions associated with multiple tables and create a filter post after the MultipleJoin?
Yes. It will be pushed down into one of the joins after the multijoin has been reordered. However, we don't try to make the joins adjacent during the join ordering. I'm guessing this is equivalent to calcite.
However, it may cause some cross-products, e.g., B join C join A for the above example. Maybe we can use some constraints to avoid it. For example, we can maintain some partial order relations to enforce C must appear after B and A.
However, it may cause some cross-products, e.g., B join C join A for the above example. Maybe we can use some constraints to avoid it. For example, we can maintain some partial order relations to enforce A must appear after B and C.
I assume that A join B join{a.x+b.x = c.x} C
is probably more like (A join{a.y = b.y} B) join{a.x+b.x = c.x} C
, So we will be doing a cross product between (A join B) and C. Technically, one could do a Project(A join B, (*, a.x+b.x AS z)) join {z = c.x} C
.
As for pairwise non-eq join and multi-table conditions, we could definitely have another round of optimization that treats these as being higher selectivity than cross join, and use these to choose the order in which connected components (in terms of eq join) are joined together.
It can be explored (as a lower priority task) after prioritizing equijoins is complete.
There are many ways to improve the join ordering. But do note that join ordering is generally hard. For heuristics, I think we may want to first keep it simple. We could leave more nuanced decisions to a statistics-driven optimizer.
@jon-chuang May check out https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java and https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rel/rules/LoptOptimizeJoinRule.java
https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/tools/Programs.java#L186 as the entry point
@lmatz I think those already use table statistics https://github.com/apache/calcite/blob/86b34f7ba81cc85aec4589f8fa4dcbf503ad8677/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L127
But seems to use some selectivity heuristics based on the condition as well (https://github.com/apache/calcite/blob/86b34f7ba81cc85aec4589f8fa4dcbf503ad8677/core/src/main/java/org/apache/calcite/rel/metadata/RelMdUtil.java#L392)
The second one seems related to: https://www.cockroachlabs.com/blog/join-ordering-ii-the-ikkbz-algorithm/
For the left-deep tree, it will first check the priority of the condition and then select the factor according to cardinality if priority is equal: https://github.com/apache/calcite/blob/86b34f7ba81cc85aec4589f8fa4dcbf503ad8677/core/src/main/java/org/apache/calcite/rel/rules/LoptOptimizeJoinRule.java#L772
The condition priority is set in https://github.com/apache/calcite/blob/86b34f7ba81cc85aec4589f8fa4dcbf503ad8677/core/src/main/java/org/apache/calcite/rel/rules/LoptMultiJoin.java#L536
By the way, we need a left-deep tree or a bushy tree for join reordering?
LGTM
(A join B join{a.x+b.x = c.x}
I think we can push down the a.x+b.x at first and then there will be a project between the binary join. then divided by the project, the two binary join will not be merged into a multijoin.Some off-topic things: weeks ago I've written a so-called delta-join solver: https://github.com/singularity-data/risingwave/blob/main/src/frontend/src/optimizer/delta_join_solver.rs. It seems to have done exactly the same thing as step 1 and 2, but specifically designed to output one lookup path in delta join.
- how will you get the primary key? I propose just analyzing only when the input is a LogicalScan. the pk_indices seems a special thing only for stream operators.
We also support pk for normal table, right?
We may do a similar but different pk derivation for the batch plan.
As long as all the columns of the pk of a table are preserved through some operators(recursively), then we can consider the output of these operators as having the same pk of table.
A simple version requires any column in the pk cannot be changed by some function but only simply referred in an operator.
A more sophisticated version may allow some column goes through some function that does not hurt the uniqueness of a pk.
how about just doing it simply: #2242 and then add a method like can_look_up_by(idx: &[usize])
how will you get the primary key? I propose just analyzing only when the input is a LogicalScan. the pk_indices seems a special thing only for stream operators.
There are likely to be some project and filter on top of LogicalScan, I think we should match them also.
We may do a similar but different pk derivation for the batch plan.
Seems not necessary. On the other hand, if we support indexes, you may see that what we want here are actually indexes, not only PK.
how about just doing it simply: https://github.com/singularity-data/risingwave/issues/2343and then add a method like can_look_up_by(idx: &[usize])
Also looks okay to me
FYI. A very comprehensive summary about join reordering (especially the "Greedy Heuristic" section, since we may hope it to be simple for now) -> https://db.in.tum.de/teaching/ws1415/queryopt/chapter3.pdf
FYI. A very comprehensive summary about join reordering (especially the "Greedy Heuristic" section, since we may hope it to be simple for now) -> https://db.in.tum.de/teaching/ws1415/queryopt/chapter3.pdf
It has a very nice textbook http://pi3.informatik.uni-mannheim.de/~moer/querycompiler.pdf
@XieJiann Hi, please join us in Slack https://risingwave-community.slack.com/archives/C03FWTNVB7B
how will you get the primary key? I propose just analyzing only when the input is a LogicalScan. the pk_indices seems a special thing only for stream operators.
If the pk indices are not present in the join condition, then they are irrelevant for our heuristic!
Is your feature request related to a problem? Please describe. This RFC is meant to propose a heuristic join reordering rule for multi-way inner joins that is meant to:
Here is a description of the algorithm proposed:
Related:
Tasks:
LogicalMultiJoin
and rule for merging adjacent inner joins (https://github.com/singularity-data/risingwave/pull/2526)Low priority tasks:
LogicalMultiJoin
. Push the projection to the top of the MultiJoin. (later we can optimize away with projection pushdown). Compatibility: If a projection is not a simple alias and is a join key, we will not merge the projection.