zhouqingqing / qpmodel

A Relational Optimizer and Executor
MIT License
64 stars 18 forks source link

where exists or exists exception #213

Closed pkommoju closed 3 years ago

pkommoju commented 3 years ago

select a.a1 from a where exists (select from b where a.a2 = b.b2) or exists (select from c where a.a2 = c.c2) order by 1

fails because the first exists subquery has no physical plan, it is possible that the second one doesn't either. When OR is changed to AND it produces correct results.

Another query which has an OR also fails.

select 1 from a where a.a1 > (select b1 from b where b.b2 > (select c2 from c where c.c2=b2) or b.b1 > ((select c2 from c where c.c2=b2)))

zhouqingqing commented 3 years ago

We didn't handle OR systematically

another example:

            sql = "select a1 from a where a.a1 = (select b1 from b bx where bx.b2 = a.a2 OR bx.b1 = (select b1 from b bz where bz.b2 = 2*a1 and bz.b3>1) and bx.b2<3)";
9DemonFox commented 3 years ago

This is the cause of this issue. In practical, there should be no "or expression" in semi-join expression if you want to unnest the subquery.

 if (queryOpt_.optimize_.enable_subquery_unnest_)
                    {
                        // use the plan 'root' containing the subexpr 'x'
                        var replacement = oneSubqueryToJoin(root, x);
                        newroot = (LogicNode)newroot.SearchAndReplace(root,
                                                                replacement);
                    }

But the codes here don't consider this condition, so it failed to handle subquery with "or expression". I will handle it next Monday.

9DemonFox commented 3 years ago

Blow is the postgreSQL's plan of SQL(explain select a.a1 from a where exists (select from b where a.a2 = b.b2) or exists (select from c where a.a2 = c.c2) clearly, it didn't be unnested . image

9DemonFox commented 3 years ago

Blow is the plan of SQL(explain select a.a1 from a where exists (select from b where a.a2 = b.b2) and exists (select from c where a.a2 = c.c2) clearly, it was unnested . image

9DemonFox commented 3 years ago

During handling this issue. I find another issue. "select a.a1 from b" will return result "null" instead of Error, It might need to opening anothor issue if necessary.

pkommoju commented 3 years ago

This is very bad. I mean unnest or otherwise returning correct results even if it is slow if preferable than returning outrageously incorrect results. I will open an issue and assign it to myself.

zhouqingqing commented 3 years ago

In practical, there should be no "or expression" in semi-join expression if you want to unnest the subquery.

This is not true. The sub-query related papers explicitly give examples the transformations with singlejoin can handle OR.

Blow is the postgreSQL's plan of SQL

We shall get the same PG plan with enable_subqueryunnest = false. PG can't do complex subquery decorrelation as us.

9DemonFox commented 3 years ago

The paper Unnesting Arbitrary Queries To the best of our knowledge, no existing system can unnest such a query. And indeed, unnesting this query is hard:

The routine of oneSubqueryToJoin(root, x) indeed doesn't process the situation of "or expression" in subquery as @zhouqingqing refered.

We didn't handle OR systematically

I originally think the issue is just the problem of "when there is a or in subquey, we don't need to unnest it". So I have to read this paper to understand and realize it.

zhouqingqing commented 3 years ago

@9DemonFox where are we with this issue?

9DemonFox commented 3 years ago

I take over this issue from zhourui today. And locate the codes causing this BUG as bellow.

            LogicFilter Filter = new LogicFilter(markjoin, topfilter);

20201026-142208(eSpace) it wil handel @1 OR @2 to the plan to the Fig A, i.e. embedding the @2 to a lower level. For AND, it will be ok to do so. I try to handle it to the plan like Fig B.

zhouqingqing commented 3 years ago

agree fig B is the right plan: for AND, the predicate can further push down becoming figure A, for OR, we can't push it down.