gouline / dbt-metabase

dbt + Metabase integration
https://pypi.org/project/dbt-metabase/
MIT License
442 stars 63 forks source link

feat: Metric Parsing #66

Closed z3z1ma closed 2 years ago

z3z1ma commented 2 years ago

NOTE: I am moving the original PR #39 to here since we have had enough changes since then that it was easier to integrate freshly. This branch is fully functional for any eager users/early testers and is on the roadmap for the 0.9.0 release.

Description

Bring your metrics into revision control, bring them into your data model, sync it directly with Metabase.

dbt-metabase metrics \
  --dbt_database test \
  --dbt_path tests/fixtures/metric/ \
   --schema public \
   --metabase_host localhost:3000 \
   --metabase_user alex@... \
   --metabase_password "..." \
   --metabase_database unit_testing \
   --metabase_use_http --verbose

Expression syntax: https://www.metabase.com/docs/latest/users-guide/expressions.html

  - name: Number of Customers with Large Orders
    description: Customers who are big spenders should be tracked independently of total,
      any customer who orders over 20 AUD of jaffle is counted
    metric: countif([customer_lifetime_value] > 20)

We will parse: countif([customer_lifetime_value] > 20) into['count-where', ['>', ['field', 41, None], 20]] The parser should be able to handle any type of expression allowing users to use the nice excel like syntax built by metabase team directly alongside your data models. They become centralized, self contained, and gain all the advantages of dbt/jinja.

image

Parsing Examples

Purposefully convoluted examples showing robustness and possibilities (most metrics are simple in theory with preprocessing logic in the model)

Input: Sum(case([site_dispenser_count] + 1 > 1 or [site_dispenser_count] - 1 > 1, [site_dispenser_count] + 1)) Output: ['sum', ['case', [[['or', ['>', ['+', ['field', 1, 'site_dispenser_count'], 1], 1], ['>', ['-', ['field', 1, 'site_dispenser_count'], 1], 1]], ['+', ['field', 1, 'site_dispenser_count'], 1]]]]]

Input: Sum([table.order] + [qty] 2 + 4 + 5 - 4 + 5) Output: ['sum', ['+', ['-', ['+', ['field', 1, 'table.order'], ['', ['field', 1, 'qty'], 2], 4, 5], 4], 5]]

Input: SumIf([site_dispenser_count], [site_city] > "Phoenix" or [site_state] = "Arizona") Output: ['sum-where', ['field', 1, 'site_dispenser_count'], ['or', ['>', ['field', 1, 'site_city'], '"Phoenix"'], ['=', ['field', 1, 'site_state'], '"Arizona"']]]

Input: Distinct(case([site_city] = "Phoenix", [site_panel_count])) / distinct([site_dispenser_count]) Output: ['/', ['distinct', ['case', [[['=', ['field', 1, 'site_city'], '"Phoenix"'], ['field', 1, 'site_panel_count']]]]], ['distinct', ['field', 1, 'site_dispenser_count']]]

Type of change

How Has This Been Tested?

Test Configuration:

Checklist:

References #25

remigabillet commented 2 years ago

@z3z1ma what's the latest on getting this work merged?