Velir / dbt-ga4

dbt Package for modeling raw data exported by Google Analytics 4. BigQuery support, only.
MIT License
289 stars 128 forks source link

Support custom item parameters in `item_params` #284

Open dgitis opened 7 months ago

dgitis commented 7 months ago

I wanted to start a discussion around supporting custom item parameters in item_params. In particular, I'd like to discuss where we unnest those parameters.

With the changes to the base_select macros in https://github.com/Velir/dbt-ga4/pull/283/commits/7ff99d79d3e393baf287eb83de87ae4396cc74eb, it seems quite elegant to unnest item_params in the base_select_renamed macro as we are already unpacking and repacking the items record there.

However, this macro is called in the base_ga4__events model. I dislike changing the base model because it should be the biggest table and the full refresh required when adding item_params is literally making the process of adding a new item parameter as expensive as possible.

The alternative is unnesting in the stg_ga4__event_items model. This makes more sense to me as adding a new item parameter would only require a refresh of this model and any downstream models.

I wanted to give others a chance to add their opinions before jumping in with a PR to address this.

adamribaudo-velir commented 7 months ago

Just want to confirm 2 things:

  1. you're talking about unnesting the item parameters out to the level of the item, not the event right? So if I have an item parameter of variant_id, it would be accessible under items.variant_id? I can't wrap my head around the SQL for that, but interested to see it.
  2. users would need to specify the item parameter names in the project yml, yes?

We don't apply any custom business rules from the project yml until stg_ga4__events so I wouldn't do it in the base select macro.

I'm also up for creating a separate items model that has all the unnested item data and a join key. Seems cleaner than 1 above.

dgitis commented 7 months ago

We already have a separate items model, it's stg_ga4__event_items which is a table with one row per item. I guess then you agree to use that model and, the more that I think about it, the more I agree.

I'd also like to rename that model to stg_ga4__ecommerce_items to make the name more distinct from the event models. Since the model is a view, it won't really affect the model directly, but it will break downstream models. I'm not sure if changing the name is worth the effot.

adamribaudo-velir commented 7 months ago

Oh right! Forgot about that one.

I see what you mean about the model name. Slight preference to leave it alone to not break any user dependencies. If we see a lot of stg_ga4_ecommerce* models coming in the future, I could be convinced we should update it.