dbt-labs / metricflow

MetricFlow allows you to define, build, and maintain metrics in code.
https://docs.getdbt.com/docs/build/about-metricflow
Other
1.13k stars 94 forks source link

[SL-915] [Feature] Support Saved Queries #765

Closed Jstein77 closed 11 months ago

Jstein77 commented 1 year ago

MetricFlow should have a saved query construct. Saved queries extend governance beyond the semantic objects (metrics, dimensions, and entities) to their consumption by enabling users to pre-define a query in a governed format (yaml/git) and then reference it in downstream interfaces. This construct will use the same structure as the existing querying interface in the Metricflow CLI: metrics, group_bys, a where clause, limit, and order. Saved queries should be referenceable through semantic layer APIs, including by orchestration in dbt cloud for exports and caching.

In Cloud, Saved queries are the foundation for supporting exports and a declarative cache. Exports will allow users to write the results of saved queries back to their data warehouse. Exports would be a set of configurations you apply to a saved query that defines how the query should be executed, including destination, materialization strategy, and table name.

Saved Query Design

The saved query should provide a matching interface to the query interfaces used in all other APIs but in YAML form. I’m in favor of nesting these fields into a sub-param called query as shown below:

saved_queries:
  - name: my_query
    description:
    query:
      metrics:
        -
      group_bys:
        -
      where:
      limit:
      order:

From SyncLinear.com | SL-915

QMalcolm commented 1 year ago

@Jstein77 what is a the shape of the metrics list elements? Should this be the same as metrics on metric type params? For clarity the shape of metrics elements for metric type params can be specified in two ways:

metrics:
  - my_metric
     ...
     type_params:
       ...
       metrics:
         - other_metric
         - name: other_other_metric
           alias: <optional_alias>
           filter: <optional_filter>
           offset_window: <optional_offset_window>
           offset_to_grain: <optional_offset_to_grain>

Some separate questions:

nhandel commented 1 year ago

@QMalcolm This is a bit simpler than the derived metric type metric list. We just need a list of strings here similar to the structure we use in our query inputs. As an example, the following query from the CLI:

mf query --metrics "users, revenue" --group-by "metric_time__day,Dimension('signup_at').grain('month')" --where Dimension('user__is_active')

would become this saved query:

saved_queries:
  - name: key_metrics_by_signup_month
    description: Key metrics grouped by the users signup month
    query:
      metrics:
        - users
        - revenue
      group_bys:
        - metric_time__day
        - Dimension('signup_at').grain('month')
      where: Dimension('user__is_active')
siljamardla commented 12 months ago

I understand this will be supporting the export functionality in Cloud. Will the saved query functionality also be available in CLI setups? Can I use this to somehow replicate the metrics.calculate() macro functionality?

plypaul commented 10 months ago

@Jstein77 do you have context on the question from @siljamardla?

Jstein77 commented 10 months ago

@siljamardla you will be able to refer to a saved query through the CLI i.e mf query --saved_query <saved_query_name> (exact syntax to be determined). Beyond that, there won't be a way to export that query via the CLI, so the best bet to replicating export functionality is pulling out the generated SQL like you would with a normal query.

amirbtb commented 10 months ago

@siljamardla @plypaul I think this dbt-labs/dbt-semantic-interfaces issue is related to the feature you're asking about : [Exports] Add exports configuration to YAML spec . I'm looking forward to using this feature !