dbt-labs / metricflow

MetricFlow allows you to define, build, and maintain metrics in code.
https://docs.getdbt.com/docs/build/about-metricflow
Other
1.12k stars 92 forks source link

Remove duplicated WhereConstraintNodes in predicate pushdown #1304

Closed tlento closed 2 months ago

tlento commented 2 months ago

The original implementation of predicate pushdown generated duplicated where filter constraints, applying it once as a result of pushdown and again in the original where constraint node.

This change removes the duplication, and does so at the level of the individual filter spec. Logically, this means the following:

  1. If all filter specs for a where constraint node can be pushed down, the node itself will be moved past the join.
  2. If some filter specs for a where constraint node can be pushed down, the node itself will remain in place, but it will only apply the filters that cannot be pushed down.
  3. Any WhereConstraintNode added to the DataflowPlan for the purposes of cleaning up spurious rows added as the result of an outer join, such as we have in one particular JoinToTimeSpineNode scenario, must be annotated to indicate that we should apply that constraint filter no matter what.

This change updates all of our logic to ensure that we effectively apply each where filter exactly once within a branch in the DAG, and only re-apply filters where explicitly requested at build time.

Note this change is all based on certain assumptions about how the DataflowPlanBuilder constructs the plan. Specifically, we assume that we will never need to re-apply filters upstream of a node unless otherwise indicated. If we start to do fancy things with nesting where constraints or swapping join sides - particularly if we push groupable metric filters to an inner join or if we begin adding query time constraints outside of an aggregated output join - this could get dodgy. Conversion metrics are of particular concern here.

tlento commented 2 months ago

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @tlento and the rest of your teammates on Graphite Graphite

tlento commented 2 months ago

Merge activity