This is a little bit different from what we discussed in the meeting, I'll explain in the description.
There's no GetInfo() method for operator nodes in the optimizer, which makes debugging the optimizer extremely painful.
https://github.com/cmu-db/peloton/blob/master/src/include/optimizer/operators.h#L52
I can think of a few difficulties in implementing it. First, operators are stored in the memo, so we need to construct an operator tree whenever we want to print it. I think printing the best operator tree in the memo should be good enough for most cases. We could take the GetBestPlan method as a reference.
https://github.com/cmu-db/peloton/blob/master/src/optimizer/optimizer.cpp#L288
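To make the idea concrete, here is a rough sketch of printing the best operator tree by walking the memo from the root group. The types Memo, Group and GroupExpression below are simplified, hypothetical stand-ins and do not mirror the actual Peloton classes. The sketch also assumes each group already knows its winning expression, and it ignores anything the real optimizer may need in order to pick the winner.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the optimizer's memo structures.
using GroupId = int;

struct GroupExpression {
  std::string op_name;            // e.g. "InnerHashJoin"
  std::vector<GroupId> children;  // child groups, not child operators
};

struct Group {
  GroupExpression best;  // assumption: the winner is already chosen
};

using Memo = std::map<GroupId, Group>;

// Follow each group to its best expression and recurse into the child
// groups, emitting one indented line per operator.
void BestTreeInfo(const Memo &memo, GroupId id, int depth,
                  std::ostringstream &os) {
  assert(memo.count(id) == 1);
  const GroupExpression &expr = memo.at(id).best;
  os << std::string(depth * 2, ' ') << expr.op_name << "\n";
  for (GroupId child : expr.children) {
    BestTreeInfo(memo, child, depth + 1, os);
  }
}

// Entry point: returns the whole best tree under `root` as a string.
std::string GetBestTreeInfo(const Memo &memo, GroupId root) {
  std::ostringstream os;
  BestTreeInfo(memo, root, 0, os);
  return os.str();
}
```

With a memo holding two scan groups and a join group on top, GetBestTreeInfo on the join group would return the join on the first line and the two scans indented below it.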
Printing the memo at an arbitrary time during optimization would also be helpful when we implement more complex optimizations.
Another nice-to-have feature is to print output columns for each operator, e.g. t1.a + AVG(t2.b), as it's also something we could easily mess up. The challenge is that these output columns are not stored in the operators. They are constructed on the fly and stored in a map when constructing the plan.
https://github.com/cmu-db/peloton/blob/master/src/include/common/internal_types.h#L1392
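If we do get access to that map while printing, one option is to invert it so the columns show up in output order. This is only a sketch: OutputColumnMap here is a hypothetical map from an already-rendered expression string to its output offset, not the real typedef, and it assumes the offsets are dense (0 to n-1).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical: rendered expression text -> output column offset.
using OutputColumnMap = std::unordered_map<std::string, std::size_t>;

// Invert the map so columns are listed by offset, which is easier to
// compare against the offsets stored in a plan node.
std::string OutputColumnsInfo(const OutputColumnMap &cols) {
  std::vector<std::string> by_offset(cols.size());
  for (const auto &entry : cols) {
    assert(entry.second < by_offset.size());  // offsets assumed dense
    by_offset[entry.second] = entry.first;
  }
  std::string out = "[";
  for (std::size_t i = 0; i < by_offset.size(); ++i) {
    if (i > 0) out += ", ";
    out += std::to_string(i) + ": " + by_offset[i];
  }
  return out + "]";
}
```

Printing the offset next to the expression text keeps the link between the two visible in a single line.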
I previously thought we could print these in the plan node, but we only store the output column offset as an oid in the plan node, which is not the most intuitive debugging info.
I think we should print out predicates, e.g. scan predicates, join predicates, having clauses, in the plan node. I've added some already. Please check if there's anything left.
By the way, the GetInfo() method for expressions is also kind of crappy. It's not succinct enough: an expression like t1.a + AVG(t2.b) spans multiple lines with a lot of redundant information. We should also fix it.
To summarize, the features needed to be implemented are:
- GetInfo() for the best operator tree in the optimizer
- GetInfo() for plan nodes
- GetInfo() method for expressions
The features that are nice to have:
- GetInfo() for memo
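For the expression GetInfo() item, this is roughly the kind of one-line output I have in mind. The Expr struct and the Column, Aggregate and BinaryOp helpers are hypothetical and much simpler than the real expression classes. Nested binary operators are wrapped in parentheses instead of tracking precedence.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal expression node, for illustration only.
struct Expr {
  enum class Kind { kColumn, kOperator, kAggregate };
  Kind kind;
  std::string name;  // "t1.a", "+", or "AVG"
  std::vector<std::unique_ptr<Expr>> children;

  // One-line rendering: columns print as-is, binary operators print
  // infix, aggregates print as NAME(arg, ...).
  std::string GetInfo() const {
    switch (kind) {
      case Kind::kColumn:
        return name;
      case Kind::kOperator: {
        assert(children.size() == 2);
        // Parenthesize nested operators to keep the output unambiguous.
        auto wrap = [](const Expr &c) {
          std::string s = c.GetInfo();
          return c.kind == Kind::kOperator ? "(" + s + ")" : s;
        };
        return wrap(*children[0]) + " " + name + " " + wrap(*children[1]);
      }
      case Kind::kAggregate: {
        std::string out = name + "(";
        for (std::size_t i = 0; i < children.size(); ++i) {
          if (i > 0) out += ", ";
          out += children[i]->GetInfo();
        }
        return out + ")";
      }
    }
    return "";
  }
};

// Small builders so a test expression can be written in one line.
std::unique_ptr<Expr> MakeExpr(Expr::Kind kind, std::string name) {
  std::unique_ptr<Expr> e(new Expr());
  e->kind = kind;
  e->name = std::move(name);
  return e;
}

std::unique_ptr<Expr> Column(std::string name) {
  return MakeExpr(Expr::Kind::kColumn, std::move(name));
}

std::unique_ptr<Expr> Aggregate(std::string fn, std::unique_ptr<Expr> arg) {
  auto e = MakeExpr(Expr::Kind::kAggregate, std::move(fn));
  e->children.push_back(std::move(arg));
  return e;
}

std::unique_ptr<Expr> BinaryOp(std::string op, std::unique_ptr<Expr> lhs,
                               std::unique_ptr<Expr> rhs) {
  auto e = MakeExpr(Expr::Kind::kOperator, std::move(op));
  e->children.push_back(std::move(lhs));
  e->children.push_back(std::move(rhs));
  return e;
}
```

Building BinaryOp("+", Column("t1.a"), Aggregate("AVG", Column("t2.b"))) and calling GetInfo() on it gives t1.a + AVG(t2.b) on a single line.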