zhouqingqing / qpmodel

A Relational Optimizer and Executor
MIT License
66 stars 18 forks source link

Incorrect result from derived table subquery with aggregations #178

Closed pkommoju closed 4 years ago

pkommoju commented 4 years ago

Given the standard table a

select c1, c1 from (select sum(abs(-10.3 a1)) c1, sum(round(10.7 a2, 2)) c2 from a) x;

Should return 30.9,64.2

Instead, it returns 30.9,30.9

But the subquery by itself produces correct result.

If the query is modified as

select c1, c1 from (select sum(abs(-10.3 a1)) c1, sum(round(10.7 a2, 2)) c2 from a) x where c1 = c2;

there is no output which is correct. So the values being projected seem to be different from values being compared in the top level query. As can be expected now, if the condition is c1 <> c2, there is the erroneous output again.

Incidentally, PostgreSQL v12 behaves the same way.

zhouqingqing commented 4 years ago

if .removefrom is false

30.9,30.9
Total cost: 9, memory=32
PhysicFromQuery 1083_1089 <x> (inccost=9, cost=1, rows=1) (actual rows=1)
    Output: c1[0],c1[0]
    -> PhysicHashAgg 1086_1091  (inccost=8, cost=5, rows=1, memory=32) (actual rows=1)
        Output: {sum(abs(-10.3*a1))}[0],{sum(round(10.7*a2,2))}[1]
        Aggregates: sum(abs(-10.3*a1[2])), sum(round(10.7*a2[5],2))
        -> PhysicScanTable 1088_1093 a (inccost=3, cost=3, rows=3) (actual rows=3)
            Output: -10.3*a1[0],-10.3,a1[0],10.7*a2[1],10.7,a2[1],2

obviously Output: c1[0], c1[0] is the place to blame.

.removefrom set to true has similar ordinal resolution error.

zhouqingqing commented 4 years ago

A related topic is the column reference say a.a1 which may show up on multiple levels in the plan.

In short, we shall unify ordinal resolution's ExprRef usage and reduce Clone() there.

pkommoju commented 4 years ago

The bug has been fixed.

Replacing ColExpr has not been attempted but replacing all output expressions with something like ExprRef didn't show much improvement.