dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
9.25k stars 1.53k forks source link

Add `CumulativeTypeParams` & sub-daily granularities to semantic manifest #10350

Open courtneyholcomb opened 1 week ago

courtneyholcomb commented 1 week ago

resolves #10360

CumulativeTypeParams

DSI has recently moved fields related to cumulative metrics (specifically, window and grain_to_date) from type_params to type_params.cumulative_type_params. This creates better organization in the metric spec, and also adds a new field called type_params.cumulative_type_params.period_agg which is used to re-aggregate cumulative metrics for non-default granularities. This PR adds core support for the new fields.

Note that the old fields type_params.window and type_params.grain_to_date will still work, so this is not a breaking change. There is a transformation that runs in MetricFlow to copy the old fields into the new fields if the new fields are not set, and sets the default period_agg value to PeriodAggregation.FIRST if it's not already set in the manifest.

Soon, setting the old fields will trigger a validation warning prompting users to move those values to the new fields. The plan is to eventually deprecate those fields.

Sub-Daily Granularities

Since the new sub-daily granularity options are included in the DSI version upgrade, this PR also adds support for those.

Notes

Checklist

codecov[bot] commented 3 days ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 88.77%. Comparing base (ca163c3) to head (96b79a8). Report is 4 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #10350 +/- ## ========================================== + Coverage 88.75% 88.77% +0.01% ========================================== Files 180 180 Lines 22486 22517 +31 ========================================== + Hits 19958 19989 +31 Misses 2528 2528 ``` | [Flag](https://app.codecov.io/gh/dbt-labs/dbt-core/pull/10350/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dbt-labs) | Coverage Δ | | |---|---|---| | [integration](https://app.codecov.io/gh/dbt-labs/dbt-core/pull/10350/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dbt-labs) | `86.03% <100.00%> (+0.01%)` | :arrow_up: | | [unit](https://app.codecov.io/gh/dbt-labs/dbt-core/pull/10350/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dbt-labs) | `62.16% <39.39%> (-0.03%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=dbt-labs#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Jstein77 commented 3 days ago

Pulling out the yaml spec here:

    - name: cumulative_orders
      label: Rolling total of orders (all time)
      type: cumulative
      type_params:
        measure: num_orders
        cumulative_type_params:
          period_agg: last
courtneyholcomb commented 1 day ago

There is one thing we probably should change though. In the DSI counterpart (https://github.com/dbt-labs/dbt-semantic-interfaces/pull/288) we https://github.com/dbt-labs/dbt-semantic-interfaces/pull/288#pullrequestreview-2111135930. I think from that we are missing 2.b where in we populate the new paths from the old paths (iff the new paths aren't specified and the old paths are)

@QMalcolm ah ok! We are running the transformation in MF that should handle this. I didn't realize we would need to handle that in core, too. What's the reason for needing it in both places?

QMalcolm commented 1 day ago

@QMalcolm ah ok! We are running the transformation in MF that should handle this. I didn't realize we would need to handle that in core, too. What's the reason for needing it in both places?

In a vacuum, if all we care about is the functionality working, then that would be correct. However, that isn't our only goal. Our additional goal is to move in the direction of deprecating and removing the old paths. In order for DSI to remove the old paths, we first need to reasonably guarantee that nearly all semantic manifests being consumed have the new paths set when applicable. There are four possible states for the new paths and old paths:

  1. Old paths not set, new paths not set
  2. Old paths not set, new paths set
  3. Old paths set, new paths not set
  4. Old paths set, new paths set

Currently, any core produced semantic manifest will produce (3). In order for DSI to ever remove the old paths (and the related transformations), we have to guarantee that semantic manifests being consumed by the MetricFlow / SL don't have state (3). So, although we don't technically need to do it now, we need to do it eventually. However, the longer we wait to do it, the longer DSI and Metricflow have to handle both states. Thus, doing it now means the tech debt we incur can be shorter lived and arguably more contained.

courtneyholcomb commented 1 day ago

@QMalcolm Updated, and added tests for the old paths too.

mirnawong1 commented 1 day ago

Affects PR https://github.com/dbt-labs/docs.getdbt.com/pull/5688/files