Velir / dbt-ga4

dbt Package for modeling raw data exported by Google Analytics 4. BigQuery support, only.
MIT License
289 stars 128 forks source link

Feature Request: Additional Conversion Metrics for GA4 Events in stg_ga4_conversions_daily Tables #299

Closed ageofneil closed 4 months ago

ageofneil commented 4 months ago

I'm currently looking at the stg_ga4_conversions_daily tables, which only provide a "count" metric for conversion events. It would be helpful if these tables included additional fields for default GA4 events.

Specifically, purchase events, the inclusion of the following columns would greatly help with downstream reporting needs:

transaction_id: To uniquely identify each transaction. value: To capture the total value of the transaction. tax: To record the tax amount associated with the transaction. shipping: To detail the shipping costs. currency: To specify the currency used in the transaction. coupon: To note any coupon codes applied.

These are all fields in the standard ga4 purchas event (https://developers.google.com/analytics/devguides/collection/ga4/set-up-ecommerce).

Is this something that would be useful, or have I've overlooked something and they are included in another model?

dgitis commented 4 months ago

It is expected that you will create your own fct and dim tables on top of this package. What the stg_ga4__conversions_daily table does is add conversion metrics to the various sessions tables.

The data that you are asking for should either be put in a fct_ga4__event_purchase table (if there's a 1:1 relationship to the data) or in a dim_ga4__items table (if there's a 1:many relationship) that can be joined to your purchase table.

Basically, you should be creating a fct_ga4__event_* table for any event that you want to report on that contains the important data for that event and all of the keys that you need to join it to the various dim tables.

I'm actually putting together some training materials that explains how to use this package for analysts who are new to dbt and data modelling.

Feel free to comment here if you need any further explanation but close this issue when you're comfortable with how you should solve this for yourself.

ageofneil commented 4 months ago

Thanks for the feedback! That makes sense and I'll proceed like that.

However, I would like the table to be partitioned. Could I follow the _daily format used for the sessions table and add a stg_ga4__event_purchase_daily and fct_ga4__event_purchase_daily?

dgitis commented 4 months ago

Here's a purchase model that's partitioned. It's similar to the sessions partitioning but the date column is different and there aren't as many tables to pull from.

Feel free to adjust which columns you add. The privacy_info columns, for example, are going to become more useful soon as Google forces Consent Mode on everyone. Device and geo data would also be good additions. The boilerplate shouldn't change, though.

{% set partitions_to_replace = ['current_date'] %}
{% for i in range(var('static_incremental_days')) %}
    {% set partitions_to_replace = partitions_to_replace.append('date_sub(current_date, interval ' + (i+1)|string + ' day)') %}
{% endfor %}
{{
    config(
        materialized = 'incremental',
        incremental_strategy = 'insert_overwrite',
        partition_by={
            "field": "event_date_dt",
            "data_type": "date",
        },
        partitions = partitions_to_replace,
    )
}}
select
    transaction_id,
    event_date_dt,
    event_timestamp,
    user_id,
    session_key,
    event_key,
    purchase_revenue,
    shipping_value,
    tax_value,
    coupon,
    affiliation,
    page_location,
    page_referrer,
    original_page_location,
    original_page_referrer,
from {{ ref('stg_ga4__event_purchase_deduplicated') }}
{% if is_incremental() %}
    where event_date_dt in ({{ partitions_to_replace | join(',') }})
{% endif %}
ageofneil commented 4 months ago

Much appreciated! thank you.