tnightengale / dbt-activity-schema

A dbt-Core package for generating models from an activity stream.
GNU General Public License v3.0
38 stars 5 forks source link

Add Aggregation Registry #28

Open bcodell opened 1 year ago

bcodell commented 1 year ago

Description

Rather than calling aggregation functions from the top level of the package (e.g. dbt_activity_schema.max), a cleaner interface would be to create something akin to a registry from which to call them (e.g. dbt_activity_schema.activities().max).

Implementation

{% do return({
    'min': dbt_activity_schema._min,
    'max': dbt_activity_schema._max,
    'count': dbt_activity_schema._count    
}) %}

Open Questions

tnightengale commented 1 year ago

I am not opposed to this, but what do we get for adding the complexity of this abstraction? As far as I can tell, there no added safety, speed, etc.

bcodell commented 1 year ago

Ergonomics when defining multiple appended activities in datasets. The code pattern might look as follows:

{%- set aggregations = dbt_activity_schema.aggregations() -%}

{{ dbt_activity_schema.dataset(
    dbt_activity_schema.activity(...),
    [
        dbt_activity_schema.activity(..., aggregations.max),
        dbt_activity_schema.activity(..., aggregations.min),
        dbt_activity_schema.activity(..., aggregations.count),
    ]
)}}

If this abstraction proves useful, then there's an argument to do something similar for relationships.

bcodell commented 1 year ago

Was thinking about this more. The biggest benefit is in developer speed by enabling auto-complete for this in dev environments so that developers don't need to have the full set of available aggregations memorized when defining appended activities.

Alternatively, another approach could be via naming conventions. Could prefix each aggregation macro with a standard prefix (e.g. agg_ to achieve this.