Description
These are technically outside the Activity Schema spec, but it would be nice if the package shipped with additional aggregation functions to use with the aggregate relationships. Proposed aggregations:
average
median
listagg (available in Narrator)
listagg distinct
count distinct
boolean sum - convert boolean features to integers and sum them
not null - returns true if the feature has at least one non-null value, else false. Useful for sparse features.
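To pin down what each proposed aggregation is expected to return, here is a minimal reference sketch in plain Python. It is not the package's implementation (that would be dbt macros); the function names are hypothetical, and the handling of nulls and empty groups is an assumption modeled on typical SQL behavior, which can differ between warehouses.

```python
import statistics

def _non_null(values):
    # SQL aggregates generally skip NULLs; None stands in for NULL here.
    return [v for v in values if v is not None]

def average(values):
    vals = _non_null(values)
    return statistics.mean(vals) if vals else None

def median(values):
    vals = _non_null(values)
    return statistics.median(vals) if vals else None

def listagg(values, sep=","):
    # Concatenate every non-null value, duplicates included.
    return sep.join(str(v) for v in _non_null(values))

def listagg_distinct(values, sep=","):
    # Keep the first occurrence of each value. Ordering is an assumption:
    # SQL engines make no ordering promise without an explicit ORDER BY.
    return sep.join(dict.fromkeys(str(v) for v in _non_null(values)))

def count_distinct(values):
    return len(set(_non_null(values)))

def boolean_sum(values):
    # Convert booleans to integers and sum them; None if there is nothing to sum.
    vals = _non_null(values)
    return sum(int(v) for v in vals) if vals else None

def not_null(values):
    # True if the feature has at least one non-null value, else False.
    return any(v is not None for v in values)
```

For a sparse feature such as `[None, None, 0]`, `not_null` returns `True` because `0` is a value, while `[None, None]` returns `False`.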
Dependencies
25 - dbt project needs to know data types for features
28 - new aggregations need to be registered in the Aggregation Registry
Implementation
Add a macro for each aggregation with the naming convention _aggfunc_name.sql (e.g. _average.sql)
Implement each using the caller() implementation pattern (see example)
Register the aggregation in the Aggregation Registry
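As a rough illustration of the steps above, the sketch below maps each aggregation name to the macro file it would live in and to the SQL fragment the macro might wrap around a feature column. Everything here is an assumption for discussion: the `AGGREGATION_SQL` dict is a hypothetical stand-in for the Aggregation Registry, the real macros would be Jinja using `caller()` rather than Python string formatting, and the SQL function names (`median`, `listagg`) vary by warehouse.

```python
# Hypothetical stand-in for the Aggregation Registry: name -> SQL template.
# {col} marks where the feature column expression would be substituted.
AGGREGATION_SQL = {
    "average": "avg({col})",
    "median": "median({col})",                      # not available in every warehouse
    "listagg": "listagg({col}, ',')",               # some warehouses use string_agg
    "listagg_distinct": "listagg(distinct {col}, ',')",
    "count_distinct": "count(distinct {col})",
    "boolean_sum": "sum(cast({col} as integer))",   # booleans to integers, then sum
    "not_null": "count({col}) > 0",                 # count() ignores NULLs
}

def macro_filename(name: str) -> str:
    # Naming convention from this issue: _aggfunc_name.sql
    return f"_{name}.sql"

def render(name: str, column: str) -> str:
    # Look up the registered template and substitute the column expression.
    if name not in AGGREGATION_SQL:
        raise KeyError(f"aggregation '{name}' is not registered")
    return AGGREGATION_SQL[name].format(col=column)
```

For example, `macro_filename("average")` gives `_average.sql`, and `render("not_null", "feature_col")` gives `count(feature_col) > 0`, where `feature_col` is a placeholder column name.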
Checklist for each of the aggregations to implement:
[ ] average
[ ] median
[ ] listagg
[ ] listagg distinct
[ ] count distinct
[ ] boolean sum
[ ] not null
Open Questions
Are these reasonable to implement, even though they aren't included in the Activity Schema spec?