dbt-labs / metricflow

MetricFlow allows you to define, build, and maintain metrics in code.
https://docs.getdbt.com/docs/build/about-metricflow
Other
1.13k stars 94 forks source link

[SL-1634] [Feature] Support Dimension-only queries for SCD dimensions #1019

Open Jstein77 opened 7 months ago

Jstein77 commented 7 months ago

Is this your first time submitting a feature request?

Describe the feature

One use case for the semantic layer is generating unaggregated data sets for operational reporting. Think of questions like how many support tickets has this company created? The expected output is an unaggregated list of tickets that an operations team might use to evaluate customer health. Dimension-only queries are supported in the semantic layer today but don't work with SCD-type dimensions. For example, take the following schema:

Screenshot 2024-01-30 at 7 56 44 PM

The expected output is auction name, order_date and order_id where the order name is based on the validity window of the dim_auction semantic model.

We could create the following semantic models to model this relationship:

semantic_models:
  - name: dim_auctions
    model: ref('dim_auctions')
    description: "Table with one row per auction"
    primary_entity: auction
    entities:
      - name: auction
        type: natural
        expr: auction_code
    dimensions:
      - name: auction_name
        type: categorical
      - name: auction_date_start 
        type: time
        type_params:
          time_granularity: day 
          validity_params:
            is_start: True
      - name: auction_date_end 
        type: time
        type_params:
          time_granularity: day 
          validity_params:
            is_end: True
  - name: fct_orders
    defaults:
      agg_time_dimension: order_date
    model: ref('orders')  
    entities:
      - name: order
        type: primary
        expr: order_id
      - name: auction
        type: foreign
        expr: auction_code
    dimensions:
      - name: order_date
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: orders
        agg: count_distinct
        create_metric: true

Then create a query that shows us the acution name, order date and order id mf query --group-by metric_time,order,auction__auction_name --explain

This query currently errors out, and we need to add a dummy metric in order for this to work for example adding a count of orders which will always return 1 since we're looking at the data set at the order_id level. mf query --metrics orders --group-by metric_time,order,auction__auction_name --explainDimension-only

Describe alternatives you've considered

You can work around this by adding a dummy metric and hiding it from the final view in the BI tool. This works, but we should support this natively.

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

From SyncLinear.com | SL-1634

courtneyholcomb commented 7 months ago

@jordan.stein thoughts on prio for this / where it should slot in?

Jstein77 commented 6 months ago

@courtneyholcomb I updated the status to low pri. This is a good one to keep in the backlog.