dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-3541] [Bug] Duplicate Metric InputMeasures cause query time errors in MetricFlow #9360

Closed QMalcolm closed 8 months ago

QMalcolm commented 10 months ago

Is this a new bug in dbt-core?

Current Behavior

After parsing a metric, we process it to populate its input_measures. This happens in _process_metric_node. It's possible for the input_measures list to end up with duplicate input measures. When it does, a validation warning shows up in the output. Additionally, on the MetricFlow side, this causes a query time error for the metric.
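To illustrate how the duplicates can arise, here is a minimal sketch using simplified stand-in classes. It is not dbt-core's actual `_process_metric_node` code: `InputMeasure`, `Metric`, `collect_input_measures`, and the simple metric names `new_customers` and `customers` are hypothetical, and only `ratio_new_customers` and `customers_with_orders` come from the log output in this issue. The assumption is that a ratio metric's input measures are gathered by walking its numerator and denominator metrics.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class InputMeasure:
    # Hypothetical, simplified stand-in for a metric input measure.
    name: str
    filter: Optional[str] = None
    alias: Optional[str] = None


@dataclass
class Metric:
    # Hypothetical, simplified stand-in for a metric node.
    name: str
    type: str  # "simple" or "ratio"
    measure: Optional[InputMeasure] = None
    numerator: Optional[str] = None
    denominator: Optional[str] = None


def collect_input_measures(metric: Metric, metrics: Dict[str, Metric]) -> List[InputMeasure]:
    """Naively gather input measures by recursing into parent metrics."""
    if metric.type == "simple":
        return [metric.measure]
    collected: List[InputMeasure] = []
    for parent_name in (metric.numerator, metric.denominator):
        collected.extend(collect_input_measures(metrics[parent_name], metrics))
    return collected


shared_measure = InputMeasure(name="customers_with_orders")
metrics = {
    # Two simple metrics built on the same underlying measure.
    "new_customers": Metric(name="new_customers", type="simple", measure=shared_measure),
    "customers": Metric(name="customers", type="simple", measure=shared_measure),
    # A ratio metric that depends on both of them.
    "ratio_new_customers": Metric(
        name="ratio_new_customers",
        type="ratio",
        numerator="new_customers",
        denominator="customers",
    ),
}

ratio_input_measures = collect_input_measures(metrics["ratio_new_customers"], metrics)
```

In this sketch, `ratio_input_measures` contains the `customers_with_orders` measure twice, once per parent metric, which is the shape of input that the validation warning complains about.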

Expected Behavior

There shouldn't be duplicate InputMeasures in a Metric's Metric.type_params.input_measures and thus no validation warning should be raised.
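One way to get the expected behavior is an order-preserving deduplication of the list before it is stored. The sketch below is only an illustration under stated assumptions, not the fix that landed in dbt-core: `InputMeasure` is a hypothetical stand-in whose fields mirror those printed in the warning message, the `fill_nulls_with` type is assumed, and `dedupe_input_measures` is a hypothetical helper name.

```python
from typing import List, NamedTuple, Optional


class InputMeasure(NamedTuple):
    # Hypothetical stand-in; field names mirror the warning's printed specification.
    name: str
    filter: Optional[str] = None
    alias: Optional[str] = None
    join_to_timespine: bool = False
    fill_nulls_with: Optional[int] = None  # type is an assumption


def dedupe_input_measures(measures: List[InputMeasure]) -> List[InputMeasure]:
    """Drop identical input measures while keeping first-seen order."""
    unique: List[InputMeasure] = []
    for measure in measures:
        # Equality compares every field, so measures that differ only by
        # filter or alias are treated as distinct and are both kept.
        if measure not in unique:
            unique.append(measure)
    return unique


raw = [
    InputMeasure(name="customers_with_orders"),
    InputMeasure(name="customers_with_orders"),
    InputMeasure(name="customers_with_orders", alias="cwo"),
]
deduped = dedupe_input_measures(raw)
```

Comparing whole specifications rather than just names matters here: two input measures with the same name but different filters or aliases are legitimately different inputs, so only fully identical entries are collapsed.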

Steps To Reproduce

  1. Add a ratio metric that depends on two simple metrics that have the same underlying measure
     a. Alternatively, pull down my qmalcolm--duplicate-input-measure-repro from jaffle-sl-template
  2. Run dbt parse
  3. Observe validation warning in output

Relevant log output

PydanticMetric ratio_new_customers has multiple identical input measures specifications for measure customers_with_orders. This might be hiding a semantic error. Input measure specification: name='customers_with_orders' filter=None alias=None join_to_timespine=False fill_nulls_with=None.

Environment

- OS: macOS (Sonoma 14.2)
- Python: 3.8
- dbt: 1.7.4

Which database adapter are you using with dbt?

snowflake

Additional Context

The Semantic Layer team is going to work on mitigating this in MetricFlow as an immediate fix. However, we should also fix this on the Core side.

QMalcolm commented 10 months ago

Additionally, I would be a proponent of getting the fix backported to 1.7.latest.