Closed 9DemonFox closed 3 years ago
every SQLstatment has a member CteInfo, and each has a CteInfoEntries. For convenience, all of the members are declared as PUBLIC.
when meet with cte as (cteQuery)
, create a plan for it , see CteSelectStmt
's constructor.
public class CteSelectStmt : SelectStmt
{
private CteExpr originCte_;
public int CteId() => originCte_.cteId_;
// we have to constract a CteSelection
// select * from cte
public CteSelectStmt(CteExpr originCte, CTEQueryRef from, List<CteExpr> previousCtes) : base(
new List<Expr>() { new SelStar(originCte.cteName_) },
new List<TableRef>() { from },
null, null, null,
previousCtes, null, null, null, $"select * from {originCte.cteName_}")
{
originCte_ = originCte;
}
}
in InitCteInfo, bind and create plan for each CTE, and add previous CTEs into current cte’s context and ctes_, so if current cte refer to previous ctes, it can find them.
List<CteExpr> previousCtes = cteInfo_.GetAllCteExprs();
// prevoious ctes have smaller cteId
Debug.Assert(previousCtes.All(x => x.cteId_ < curCte.cteId_));
SelectStmt cieStmt = new CteSelectStmt(curCte, from, previousCtes);
It seems easy to do so, if a cte
is not referred, it should be delete, but the paper put forward a extreme stupid situation. see:
sql1:
"with cte0 as (select * from a),cte1 as (select * from cte0) select * from b where b.b1 = 2";
cte0 has been referred by cte1, but cte1 is never used. here we use a stack to remove cte0.
TU.ExecuteSQL(sql, "2,3,4,5", out phyplan, option);
var answer = @"PhysicScanTable b (actual rows=1)
Output: b.b1[0],b.b2[1],b.b3[2],b.b4[3]
Filter: b.b1[0]=2";
TU.PlanAssertEqual(answer, phyplan);
Algorithm findUnusedCTE
INPUT: cteStack
OUTPUT: cte will be marked used or not.
step1 findCteConsumer(rootPlan,&cteStack)
step2
for subq in subqueres:
findCteConsumer(subq,&cteStack)
step3 while not cteStack.empty():
cteid = cteStack.pop()
findCteConsumer(GetCteInfoEntryByCteId(cteid).plan_,&cteStack)
findCteConsumer
will find CteConsumer
from LogicNode
recursively, push and mark cteInfoEntry
as used.
For instance, in sql1, step1 will not push cte1
into cteStack
, step2 will not either. so the cte1 will not be marked used. so the cte0 will not be marked used too.
For sql1, cte1 will not be marked used, so it will not push into the cteStack, so cte1 is not mark used. so they are all deleted.
the LogicCteAnchor
will be add at the root of a plan, such as (b) .
the realize code is bellow in cteToAnchor
for (int i = 0; i < cteIsUsed.Count; i++)
{
var lca = new LogicCteAnchor(cteInfo_.GetCteInfoEntryByCteId(cteIsUsed[i].cteId_));
lca.children_.Add(root);
root = lca;
}
return root;
the plan produced for select * fom (cteQuey)
is the plan for CTEProducer
.
the LogicCteProdicer
will be produced through RuleTrans
the child of the LogicCteProducer
is the select * from cteQuery
, see section 1.1.
public LogicCteProducer(CteInfoEntry cteInfoEntry)
{
cteInfoEntry_ = cteInfoEntry;
children_.Add(cteInfoEntry.plan_);
// CTEProducer has only one child
Debug.Assert(children_.Count() == 1);
}
cteQueryRef
will be transform as LogicCteConsumer
.
RuleTrans
CteAnchor2CteProd
will produce a LogicSequenceNode
, and it has the same MemoSignature
with CTEAnchor
for it.
the LogicCteProducer
will be add through RulesTrans
. when we meet a LogicCTEAnchor
, it will producer a LogicSequence
Node, and the LogicSequence
will add CTEProducer
Node as its child.
when meet CTEConsumer
, it will producer LogicCTESelect
and has the same Soginature
with CTEConsumer
. it will also create nodes from cteplan
. To avoid Group7 produce the same Signature
with Group6, the Signature have to consider if it is belong to a child of cte consumer, cteId
and cteRefId
.
group | is child of cte consumer | cteid | cteRefId | Signature |
---|---|---|---|---|
Group6 | false | 1 | - | |
Group7 | true | 1 | 1 | |
Group8 | true | 1 | 2 |
by this , cteid = 2
and cteRefid = 1
will has the same signature with cteid = 1
and cteRefid=2
will has the same Signature. To avoid this , see cteRefId
as string.
group | is child of cte consumer | cteid | cteRefId | Signature |
---|---|---|---|---|
Group6 | false | 1 | - | |
Group7 | true | 1 | ref1 |
|
Group8 | true | 1 | ref2 |
Figure 4.1
A LogicCteSequence
which is a binary operator will bind to a cte
, so it has its cteid
and will generate requirement (p 0)
for left child and (c 0)
for right child. And a LogicCteAnchor
will generate no requirement for its child.
requirement for CTE
will always pass to children and never disappear. The Group Member of a Group will satisfied the requirement only when it's cteSpec
is the sub set of the CteReq
a LogicCteAnchor
will generate no requirement for its child, when meet consumer0
, consumer0
will seen as [(c 0)], it is not the subset of []. So the plan will not pass.
In fact, c and d should be avoid. But now we don't support transform property from left to right child, they are not avoid.
But they will be optimized out by cost.
cost-based select will not choose plan like (c) and (d)
still need to perfect: