gouline / dbt-metabase

dbt + Metabase integration
https://pypi.org/project/dbt-metabase/
MIT License
448 stars 64 forks source link

Metrics Parsing #39

Closed z3z1ma closed 2 years ago

z3z1ma commented 3 years ago

Description

Bring your metrics into revision control, bring them into your data model, sync it directly with Metabase.

dbt-metabase metrics \
  --dbt_database test \
  --dbt_path tests/fixtures/metric/ \
   --schema public \
   --metabase_host localhost:3000 \
   --metabase_user alex@... \
   --metabase_password "..." \
   --metabase_database unit_testing \
   --metabase_use_http --verbose

Expression syntax: https://www.metabase.com/docs/latest/users-guide/expressions.html

  - name: Number of Customers with Large Orders
    description: Customers who are big spenders should be tracked independently of total,
      any customer who orders over 20 AUD of jaffle is counted
    metric: countif([customer_lifetime_value] > 20)

We will parse: countif([customer_lifetime_value] > 20) into['count-where', ['>', ['field', 41, None], 20]] The parser should be able to handle any type of expression allowing users to use the nice excel like syntax built by metabase team directly alongside your data models. They become centralized, self contained, and gain all the advantages of dbt/jinja.

image

Parsing Examples

Purposefully convoluted examples showing robustness and possibilities (most metrics are simple in theory with preprocessing logic in the model)

Input: Sum(case([site_dispenser_count] + 1 > 1 or [site_dispenser_count] - 1 > 1, [site_dispenser_count] + 1)) Output: ['sum', ['case', [[['or', ['>', ['+', ['field', 1, 'site_dispenser_count'], 1], 1], ['>', ['-', ['field', 1, 'site_dispenser_count'], 1], 1]], ['+', ['field', 1, 'site_dispenser_count'], 1]]]]]

Input: Sum([table.order] + [qty] 2 + 4 + 5 - 4 + 5) Output: ['sum', ['+', ['-', ['+', ['field', 1, 'table.order'], ['', ['field', 1, 'qty'], 2], 4, 5], 4], 5]]

Input: SumIf([site_dispenser_count], [site_city] > "Phoenix" or [site_state] = "Arizona") Output: ['sum-where', ['field', 1, 'site_dispenser_count'], ['or', ['>', ['field', 1, 'site_city'], '"Phoenix"'], ['=', ['field', 1, 'site_state'], '"Arizona"']]]

Input: Distinct(case([site_city] = "Phoenix", [site_panel_count])) / distinct([site_dispenser_count]) Output: ['/', ['distinct', ['case', [[['=', ['field', 1, 'site_city'], '"Phoenix"'], ['field', 1, 'site_panel_count']]]]], ['distinct', ['field', 1, 'site_dispenser_count']]]

Type of change

How Has This Been Tested?

Test Configuration:

Checklist:

References #25

remigabillet commented 3 years ago

@z3z1ma it's exciting to see a bit step towards making this happen. In the next couple of days, I'm going to try to run it locally, I'll keep you posted.

remigabillet commented 3 years ago

Where did you find a reference to the metabase expression grammar by the way?

z3z1ma commented 3 years ago

Using the clojure/jsx in their repository [unit tests and data models], good knowledge of infix/prefix/postfix mathematical notations in code format, and knowledge of PEG/pyparsing drove most of the work.

This was highly useful here which has configs for expressions and filters which is what I think your interested in @remigabillet

remigabillet commented 3 years ago

I just played with it locally. It's such a cool addition to the project! Great work @z3z1ma! I was able to easily parse a few metrics (I had to lowercase "NOT" btw). I also adding some code to post payload to the API and it created fine.

As we chatted about on Slack, I think that adding a way to update metrics is the only critical functionality missing.

remigabillet commented 2 years ago

hey @z3z1ma What's the latest on this PR btw? I think this feature is HUGE! I'd love to see it land.

z3z1ma commented 2 years ago

hey @z3z1ma What's the latest on this PR btw? I think this feature is HUGE! I'd love to see it land.

Hey @remigabillet

Yeah totally. I think I just need to rebase on the latest stable master branch and put in a couple unit tests. It's actually not much left to do at all provided there aren't too many conflicts. The metric parser is in its own module so I expect it to be easy. Let me run this down so we get a branch in the main repo we can use until goulline is ready to merge it to an RC. I'll handle this, this week. It is indeed a superpowered feature.

remigabillet commented 2 years ago

Sounds great @z3z1ma. Ping me when it's ready for review.

z3z1ma commented 2 years ago

closing this in favor of #66 which is the same PR but from a fresh branch