dbt-labs / dbt-labs-experimental-features

dbt support for database features which are not yet supported natively in dbt-core
Apache License 2.0

Macro-based approach for lambda views #6

Closed. jtcohen6 closed this 4 years ago

jtcohen6 commented 4 years ago

Alternative to #5

Macro-based approach

Pros

lambda_filter renders a static timestamp at runtime into both the historical and new versions of the data, rather than relying on a dynamic query result or on current_timestamp evaluated at query time. This static timestamp serves as a stable bookmark that only the next dbt run can push forward, which protects us from data gaps if a job fails. (@clrcrl explained this well here.)

The timestamp defaults to run_started_at. @amychen1776 suggested that an optional var makes sense for CLI override; I've named it (badly) lambda_split.
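To make the "static bookmark" idea concrete, here is a minimal Python sketch that models the behaviour described above. It is not the Jinja macro from this PR. The function signature, the "historical" / "new" labels, the column name event_time, and the assumption that the historical side keeps rows before the split while the new side keeps rows at or after it are all illustrative.

```python
from datetime import datetime


def lambda_filter(column_name, side, run_started_at, lambda_split=None):
    """Render a WHERE clause containing a static timestamp literal.

    Hypothetical model of the macro's behaviour:
    - side="historical" keeps rows strictly before the split,
    - side="new" keeps rows at or after the split,
    - lambda_split stands in for the optional CLI var and, when given,
      overrides the default of run_started_at.
    """
    split = lambda_split if lambda_split is not None else run_started_at
    # The timestamp is written into the SQL text itself, so it does not
    # move when the query is executed later.
    literal = split.strftime("%Y-%m-%d %H:%M:%S")
    if side == "historical":
        return f"where {column_name} < '{literal}'"
    if side == "new":
        return f"where {column_name} >= '{literal}'"
    raise ValueError(f"unknown side: {side!r}")


# Both sides of the lambda view receive the same literal for a given run.
run_started_at = datetime(2020, 7, 21, 14, 0, 0)
print(lambda_filter("event_time", "historical", run_started_at))
print(lambda_filter("event_time", "new", run_started_at))
```

Because both clauses share one literal, the two sides partition the rows with no gap and no overlap, and the boundary moves only when a later run renders a new literal. Passing lambda_split plays the role of the CLI override.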

Cons

This moves the model SQL into a macro, which we didn't like for packages. Are we willing to go along with it for this use case? As long as we establish really good conventions for organizing the macros/ directory, I'm personally leaning in this direction.

The alternative is copying the model SQL into both the historical and the lambda models (as in #5). This saves a file and makes the code more immediately visible, but at the cost of duplicating logic and requiring analysts to update / cross-check it in multiple places.

Outputs

[Screenshots: Screen Shot 2020-07-21 at 2.15.35 PM, Screen Shot 2020-07-21 at 2.05.54 PM]

Alt-alt approach: custom materialization??

I've included a mockup in models/thought_experiment_only. As the name suggests, this is only a thought experiment.

Pros:

Cons:

Challenges:

clrcrl commented 4 years ago

@jtcohen6, I wanted to craft more narrative around this, so I spent some time today re-organizing history over on #5

jtcohen6 commented 4 years ago

Closing in favor of #5 (which includes a lot of the code here)