zhouqingqing / qpmodel

A Relational Optimizer and Executor

REMOVE_FROM WORK: stage 5 #250

Closed pkommoju closed 3 years ago

pkommoju commented 3 years ago

Handle aggregates that appear in the group by after FromQuery has been transformed. If the group by contains aggregates, check whether each of them is a select expression in one of the FromQuery nodes in scope; if so, do not throw an error.
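A minimal Python sketch of this check, for illustration only. It is not the qpmodel code (which is C#); `Expr`, `FromQuery`, and `check_group_by` are hypothetical names, and expressions are compared by their text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for an expression node."""
    text: str
    is_aggregate: bool = False


@dataclass
class FromQuery:
    """Hypothetical stand-in for a subquery in the FROM clause."""
    alias: str
    select_list: list = field(default_factory=list)


def check_group_by(group_by, from_queries_in_scope):
    """Reject an aggregate in GROUP BY unless some in-scope FromQuery selects it."""
    exposed = {e for fq in from_queries_in_scope for e in fq.select_list}
    for expr in group_by:
        if expr.is_aggregate and expr not in exposed:
            raise ValueError(f"aggregate not allowed in group by: {expr.text}")
```

An aggregate that came out of a FromQuery select list passes; any other aggregate in the group by is still reported as an error.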

When resolving ordinals in LogicAgg, if the group by contains aggregates, remove each aggregate, each child of AggrRef and other expressions and make a new group-by list (newGroupBy). Save the original group by and null it out, then pass newGroupBy as the required output from the child. Once the child has been resolved, restore the saved group by and resolve it.
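The save, null out, and restore sequence can be sketched as below. This is a simplified Python illustration, not the LogicAgg code: `AggNode` and its methods are hypothetical, and the new group-by list is built here simply by dropping aggregates, which is an assumption that leaves out the AggrRef handling.

```python
from dataclasses import dataclass, field


@dataclass
class Expr:
    """Hypothetical stand-in for an expression node."""
    text: str
    is_aggregate: bool = False


@dataclass
class AggNode:
    """Hypothetical stand-in for an aggregate plan node."""
    group_by: list
    log: list = field(default_factory=list)

    def resolve_child(self, required):
        # Stand-in for resolving the child; records what was required of it.
        self.log.append(("child", [e.text for e in required]))

    def resolve_group_by(self):
        # Stand-in for resolving the group by against the child's output.
        self.log.append(("group_by", [e.text for e in self.group_by]))

    def resolve_ordinals(self):
        if any(e.is_aggregate for e in self.group_by):
            saved = self.group_by
            # Assumption: the new list keeps only non-aggregate expressions.
            new_group_by = [e for e in saved if not e.is_aggregate]
            self.group_by = None  # null out while the child is resolved
            try:
                self.resolve_child(required=new_group_by)
            finally:
                self.group_by = saved  # restore the original group by
        else:
            self.resolve_child(required=self.group_by)
        self.resolve_group_by()
```

Restoring in a `finally` block is a choice made for this sketch so that the node is left intact even if resolving the child fails.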

This has a side effect on nested aggregates such as Agg(Agg(Agg(...))). In general these queries throw various exceptions, but they work if the arguments of each top-level aggregate are added to the group by of the top-level select.

Also avoid moving a filter into the aggregate node if the filter references more than one table. Leave such filters to FilterPushdown, which can better decide whether the filter belongs in the aggregate node or in a join above or below it. For example, the query

    select a1 from a, (select max(b3) maxb3 from b) b where a1 < maxb3

generates a bad plan:

    LogicJoin 1063
      -> LogicScanTable 1064 a
      -> LogicAgg 1066
           Filter: a.a1[0]<max(b.b3[2])
        -> LogicScanTable 1065 b

The filter sits on the LogicAgg over b, but b obviously cannot produce both a1 and b3.
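The guard on moving a filter into the aggregate node can be sketched as follows. This is an illustrative Python sketch rather than the qpmodel implementation: `referenced_tables` and `place_filter` are hypothetical helpers, and column references are assumed to be given as `table.column` strings.

```python
def referenced_tables(column_refs):
    """Collect the table names from references written as 'table.column'."""
    return {ref.split(".", 1)[0] for ref in column_refs}


def place_filter(column_refs):
    """Decide where a filter may go based on how many tables it references.

    A filter that touches more than one table is left to filter pushdown;
    otherwise it stays eligible for the aggregate node.
    """
    if len(referenced_tables(column_refs)) > 1:
        return "filter_pushdown"
    return "aggregate"
```

For the filter `a.a1 < max(b.b3)` the references are `a.a1` and `b.b3`, so the sketch defers it to filter pushdown instead of attaching it to the aggregate over b.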

With the remove_from optimization, the expressions in the join filter have to be DeQueryRef'd so that the tableRefs are set up correctly. This was missed in the stage 4 PR.