The outer and inner tables of a join query cannot be compiled in strict isolation: the input and output of compiling each table both depend on, and are depended on by, the compilation input and output of the other tables. Calling QueryOptimizer.optimize() in the middle of join query compilation would therefore require a more complicated and harder-to-understand interface for QueryCompiler.compile() (which is in turn called by QueryOptimizer.optimize()).
So instead, we call optimize() right before the actual compilation: for each join table in the query, we compose an independent query to obtain the optimized plan and the index table that plan uses. We then rewrite the original join query, replacing the table references and column references with the corresponding index tables and index columns. Finally, we compile the rewritten join statement.
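To make the rewrite step concrete, here is a minimal sketch of the idea, not the code in this change. It assumes the per-table optimization has already produced, for each join table, the chosen index table and a mapping from data columns to index columns. All names in it (JoinIndexRewriteSketch, ColumnRef, IndexChoice, JoinStatement, rewrite) are hypothetical and are not Phoenix classes; the real rewrite works on parsed statements via IndexStatementRewriter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of the pre-compilation rewrite. None of these types are Phoenix classes. */
final class JoinIndexRewriteSketch {

    /** A column reference qualified by its table, e.g. ORDERS.ITEM_ID. */
    record ColumnRef(String table, String column) {}

    /** Hypothetical result of optimizing one join table on its own. */
    record IndexChoice(String indexTable, Map<String, String> columnMap) {}

    /** A join statement reduced to the two things the rewrite touches. */
    record JoinStatement(List<String> tables, List<ColumnRef> columns) {}

    /**
     * Replaces table and column references with their index counterparts.
     * Tables with no entry in {@code choices} keep using the data table.
     */
    static JoinStatement rewrite(JoinStatement original, Map<String, IndexChoice> choices) {
        List<String> tables = new ArrayList<>();
        for (String table : original.tables()) {
            IndexChoice choice = choices.get(table);
            tables.add(choice == null ? table : choice.indexTable());
        }

        List<ColumnRef> columns = new ArrayList<>();
        for (ColumnRef ref : original.columns()) {
            IndexChoice choice = choices.get(ref.table());
            if (choice == null) {
                columns.add(ref); // no index chosen for this table
                continue;
            }
            String indexColumn = choice.columnMap().get(ref.column());
            if (indexColumn == null) {
                // Assumption of this sketch: a chosen index covers every referenced column.
                throw new IllegalArgumentException(
                        "Index " + choice.indexTable() + " does not cover " + ref);
            }
            columns.add(new ColumnRef(choice.indexTable(), indexColumn));
        }
        return new JoinStatement(tables, columns);
    }
}
```

In this toy model, the statement passed to the compiler afterwards only mentions the chosen index tables and their columns, so the join compilation itself does not need to know that any optimization took place.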
Changes:
Add JoinCompiler.optimize(), which takes in the original join statement and returns an optimized join statement.
Re-use the IndexStatementRewriter class for re-writing multiple-table queries.
Unit tests: add HashJoinWithIndexTest and testJoinPlanWithIndex().
@maryannxue - let me know if I should file issues for the outstanding items in this change, or if you'll just incorporate them into your next change. Thanks, as usual, for the excellent work!