LucidDB / luciddb

DEFUNCT: See README
https://github.com/LucidDB/luciddb
Apache License 2.0
53 stars 24 forks source link

[FRG-67] RelSubset.getRows() always returns 1 #803

Open dynamobi-build opened 12 years ago

dynamobi-build commented 12 years ago

[reporter="jvs", created="Thu, 9 Mar 2006 16:48:42 -0500 (GMT-05:00)"] The pattern in most non-leaf rels for implementing RelNode.getRows() is to compute some function of the child's row-count, e.g. for FilterRel:

    public double getRows()
    {
        return getChild().getRows() * RexUtil.getSelectivity(condition);
    }

Most rels also use getRows() in implementing computeSelfCost(). Unfortunately, throughout all of planning except the very final phase (buildCheapestPlan), the child is always a RelSubset equivalence class, not any particular rel. And RelSubset inherits its getRows implementation from AbstractRelNode, which always returns 1. So most of the costs used for plan selection come out highly bogus. It's easy to overlook this without tracing because EXPLAIN PLAN INCLUDING ALL ATTRIBUTES shows the final (correctly propagated) rowcounts.

FennelWindowRel is the one exception I have found; it does this:

        RelOptCost childCost = planner.getCost(getChild());
        final double rowsIn = childCost.getRows();

This isn't really correct either, because VolcanoPlanner.getCost sums the total number of rows from all descendants, not just the immediate child. But at least the author seems to have realized that SOMETHING better was needed. :)

Anyway, perhaps the implementation of RelSubset.getRows should be changed to use the logic from VolcanoPlanner.getCost:

        if (rel instanceof RelSubset) {
            return ((RelSubset) rel).bestCost;
        }

so return bestCost.getRows().

Or am I totally misunderstanding things?

dynamobi-build commented 12 years ago

[author="jvs", created="Thu, 9 Mar 2006 17:16:41 -0500 (GMT-05:00)"] Just to see what would happen I added a RelSubset.getRows as above and then ran unitsql/optimizer/index.sql. Not surprisingly, index search optimization stopped working. The delicate network of fudge factories that has sprung up around the existing costing will have to be replaced to fix this while keeping tests working. Probably makes sense to do this as part of

http://wiki.eigenbase.org/PersonalityBasedCostEstimation

together with coming up with some new "rule sandbox" session plugins for making unit tests less sensitive to FarragoDefaultPlanner ruleset evolution.

dynamobi-build commented 12 years ago

[author="jvs", created="Thu, 9 Mar 2006 17:26:44 -0500 (GMT-05:00)"] Oops, scratch that about unitsql/optimizer/index.sql. It turns out I had forgotten I had a few other rules disabled to test something else, and that caused index search to stop working.

dynamobi-build commented 12 years ago

[author="jvs", created="Fri, 10 Mar 2006 00:37:29 -0500 (GMT-05:00)"] With this change made in my sandbox, Farrago tests actually ran through fine (except one with mock where the cost came out to zero, so I changed Volcano to bump that to TINY instead of asserting).

dynamobi-build commented 12 years ago

[author="jvs", created="Fri, 10 Mar 2006 09:17:31 -0500 (GMT-05:00)"] I'm checking in (together with some calc costing changes to account for selectivity) as eigenchange 5770. Julian, if this is wrong, let me know.

Probably better would be to use the best cost across the entire RelSet (not just RelSubset) since for getRows() we don't care about the traits.

Zelaine, if you sync this one down it's worth another try on the semijoins (at least to verify in the trace that the rowcounts are getting propagated correctly, even if it still doesn't choose the plan; may still need to reduce the 0.5 selectivity guess).