Closed awoehrl closed 8 months ago
Hey @awoehrl, I'm working on this now. If we're going to extend this functionality to support table clustering, do you know how to cluster tables on column expressions (e.g. cluster by coalesce(customer_id, anonymous_customer_id))
in BigQuery? My research so far suggests that it isn't possible, in which case I'd need to materialize such a column in each activity model. But I'm not a BigQuery expert, so let me know if you're aware of any workarounds here. Thanks!
Similar to dataset joins, the activity columns should be calculated based on both id columns. The two window functions can be extended to partition by both.
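To make the idea concrete, here is a minimal sketch of one possible reading of "partition by both": partitioning each window function by coalesce(customer_id, anonymous_customer_id). It runs against an in-memory SQLite database purely so it can be executed anywhere. The table name activity_stream, the output column names activity_occurrence and activity_repeated_at, and the sample rows are assumptions for illustration, not the package's actual code.

```python
import sqlite3

# Hypothetical activity stream; table and column names are illustrative only.
ROWS = [
    # (activity_id, customer_id, anonymous_customer_id, ts)
    ("a1", None, "anon-1", "2023-01-01"),
    ("a2", "cust-1", None, "2023-01-02"),
    ("a3", None, "anon-1", "2023-01-03"),
    ("a4", "cust-1", "anon-9", "2023-01-04"),
]

# Both window functions share the same partition key: the first non-null id.
QUERY = """
select
    activity_id,
    row_number() over (
        partition by coalesce(customer_id, anonymous_customer_id)
        order by ts
    ) as activity_occurrence,
    lead(ts) over (
        partition by coalesce(customer_id, anonymous_customer_id)
        order by ts
    ) as activity_repeated_at
from activity_stream
order by activity_id
"""


def compute_activity_columns(rows):
    """Load rows into an in-memory table and return the windowed columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table activity_stream ("
        "activity_id text, customer_id text, anonymous_customer_id text, ts text)"
    )
    conn.executemany("insert into activity_stream values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in compute_activity_columns(ROWS):
        print(row)
```

Note that with this key, a row carrying both ids (a4 above) is grouped under its customer_id, so whether that matches the intended behavior depends on how the two ids relate in the real data.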
Things to consider:
How does this impact performance? Are there alternatives, e.g. using one merged id column instead of two throughout the activities?
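A rough sketch of the merged-id alternative: the coalesced value is materialized once as a plain column, and the window function then partitions by that column instead of an expression. A plain column of this kind is also what the comment above suggests would be needed for clustering in BigQuery. The names activity_stream, activity_stream_merged and merged_customer_id are hypothetical, and SQLite is used here only so the example runs locally; it says nothing about BigQuery performance.

```python
import sqlite3


def occurrences_with_merged_id(rows):
    """Materialize a merged id column, then window over it.

    rows: iterable of (activity_id, customer_id, anonymous_customer_id, ts).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table activity_stream ("
        "activity_id text, customer_id text, anonymous_customer_id text, ts text)"
    )
    conn.executemany("insert into activity_stream values (?, ?, ?, ?)", rows)

    # Compute the merged id once and store it as a regular column.
    conn.execute(
        """
        create table activity_stream_merged as
        select
            activity_id,
            coalesce(customer_id, anonymous_customer_id) as merged_customer_id,
            ts
        from activity_stream
        """
    )

    # Downstream queries only ever reference the single merged column.
    result = conn.execute(
        """
        select
            activity_id,
            merged_customer_id,
            row_number() over (
                partition by merged_customer_id
                order by ts
            ) as activity_occurrence
        from activity_stream_merged
        order by activity_id
        """
    ).fetchall()
    conn.close()
    return result
```

The trade-off to weigh is an extra stored column per activity model against repeating the coalesce expression in every window function.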