Alternative to #5

Macro-based approach

Pros
Take care of all boilerplate SQL (dynamic filtering, union logic)
Write model SQL only once
lambda_filter prints a static timestamp at runtime into both the historical and new versions of the data, rather than relying on a dynamic query result or a current_timestamp relative to query time. This static timestamp serves as a stable bookmark that can only be pushed forward by the next dbt run, saving us from potential data gaps in the event of job failures. (@clrcrl explained this well here.)
The default value is run_started_at. @amychen1776 suggested that an optional var makes sense for a CLI override; I've named it (badly) lambda_split. (A sketch of one possible macro follows this list.)
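
For concreteness, here is a minimal sketch of what such a macro could look like. Only the lambda_filter name, the run_started_at default, and the lambda_split var come from the notes above; the column_name and new_data parameters, and the macro body itself, are assumptions for illustration rather than the actual implementation in this repo:

```sql
-- macros/lambda_filter.sql (hypothetical sketch, not the actual macro in this repo)
{% macro lambda_filter(column_name, new_data=false) %}

    {#- Static split point: an optional CLI override via
        --vars '{lambda_split: "2020-06-01"}', falling back to the timestamp
        of the current invocation. Because the value is rendered into the
        compiled SQL, it only moves forward on the next dbt run. -#}
    {% set split_timestamp = var('lambda_split', run_started_at) %}

    {% if new_data %}
        -- the "new" side of the lambda view: rows at or after the split point
        where {{ column_name }} >= '{{ split_timestamp }}'
    {% else %}
        -- the historical side: rows strictly before the split point
        where {{ column_name }} < '{{ split_timestamp }}'
    {% endif %}

{% endmacro %}
```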
Cons
Requires three separate files: a historical model file, a lambda model file, and a macro file to encode the model SQL (see the sketch after this list)
Model SQL is stored in macros/ instead of models/. Model files just call macros.
We didn't like this for packages. Are we willing to go along with it for this use case? As long as we establish really good conventions for organizing the macros/ directory, I'm personally leaning in this direction.
The alternative is copying the model SQL into both the historical and the lambda models (as in #5). This saves a file and makes the code more immediately visible, but at the cost of duplicating logic and requiring analysts to update / cross-check it in multiple places.
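
A hedged sketch of how the three files could fit together. The model names (pageviews_historical, pageviews_lambda), the pageviews_sql macro, and the source reference are invented for illustration; the files in this repo may be organized differently, and the union boilerplate could itself be wrapped in a macro:

```sql
-- macros/pageviews_sql.sql (hypothetical): the model SQL, written once
{% macro pageviews_sql(new_data=false) %}
    select
        page_view_id,
        user_id,
        viewed_at
    from {{ source('web', 'events') }}
    {{ lambda_filter('viewed_at', new_data=new_data) }}
{% endmacro %}

-- models/pageviews_historical.sql (hypothetical): historical model file, just calls the macro.
-- Shown as a table for simplicity; an incremental version would also need an
-- is_incremental() bound inside the filter macro to avoid re-inserting old rows.
{{ config(materialized='table') }}
{{ pageviews_sql(new_data=false) }}

-- models/pageviews_lambda.sql (hypothetical): lambda model file, unions historical + new rows
{{ config(materialized='view') }}
select * from {{ ref('pageviews_historical') }}
union all
{{ pageviews_sql(new_data=true) }}
```

The point of the layout is that the select logic lives in exactly one place (the macro), while the two model files stay a couple of lines each.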
Outputs
Alt-alt approach: custom materialization??
I've included a mockup in models/thought_experiment_only. As the name suggests, this is only a thought experiment.
Pros:
We can both keep model SQL within the model file and write that SQL only once
Cons:
It obfuscates a lot of logic into the materialization layer
Challenges:
How to pass config values down to the historical model? Namely materialization (table or incremental), schema, and alias (one possible shape is sketched below)
How to "call" one materialization from another? We don't want to copy+paste all the logic from every adapter's incremental materialization into a new lambda_view materialization