Open adamribaudo-velir opened 2 weeks ago
@dgitis I ran this and noticed differences between our stg_ga4__sessions_traffic_sources_last_non_direct_daily model and Google's last_click fields. When I looked at the raw events, it looked like our model was more accurate, which was weird.
I think it will only confuse people to have two definitions of 'session last click attribution' in the package. I suppose we should just include these fields and remove stg_ga4__sessions_traffic_sources_last_non_direct_daily, but I thought I'd check with you first.
The advantage of our method over Google's is that we keep the attribution fields coupled, whereas GA4 decouples them.
So, if you see source / medium / campaign from sessions in this order, from earliest to latest:
facebook / paid_social / my_fb_campaign
google / organic
direct
The last non-direct source / medium / campaign in GA4 would be google / organic / my_fb_campaign, while the package would return google / organic / null.
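To make the difference concrete, here is a rough Python sketch (not the package's SQL, and not Google's documented algorithm) of the two behaviors as described in this comment. The Session type, the is_direct check, and both function names are hypothetical and exist only for illustration; the "decoupled" version assumes GA4 resolves each field independently to its last non-null value.

```python
from typing import NamedTuple, Optional, Sequence


class Session(NamedTuple):
    # Hypothetical, simplified session record for illustration only.
    source: Optional[str]
    medium: Optional[str]
    campaign: Optional[str]


def is_direct(session: Session) -> bool:
    # Assumption: a session counts as direct when its source is missing or "direct".
    return session.source in (None, "direct", "(direct)")


def last_non_direct_coupled(sessions: Sequence[Session]) -> Optional[Session]:
    # Package-style behavior: take all three fields from the same session,
    # namely the latest session that is not direct.
    for session in reversed(sessions):
        if not is_direct(session):
            return session
    return None


def last_non_direct_decoupled(sessions: Sequence[Session]) -> Session:
    # GA4-style behavior as described in this thread: resolve each field
    # independently to its latest non-null value among non-direct sessions.
    non_direct = [s for s in sessions if not is_direct(s)]

    def last_value(field: str) -> Optional[str]:
        for session in reversed(non_direct):
            value = getattr(session, field)
            if value is not None:
                return value
        return None

    return Session(last_value("source"), last_value("medium"), last_value("campaign"))


# Sessions ordered from earliest to latest, matching the example above.
sessions = [
    Session("facebook", "paid_social", "my_fb_campaign"),
    Session("google", "organic", None),
    Session("direct", None, None),
]

coupled = last_non_direct_coupled(sessions)
decoupled = last_non_direct_decoupled(sessions)
print(coupled)
print(decoupled)
```

In the coupled result the campaign stays null because google / organic had no campaign; in the decoupled result the campaign from the earlier facebook session is carried over.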
As with my comment on the other PR, should we maybe make this configurable?
The advantages of using Google's definitions are as follows:
While the advantages of our definitions are as follows:
I'm thinking that we create a use_google_attribution_fields variable.
At this stage, the new variable only enables these fields in the base model, but in the future we could modify our various attribution models to detect this variable and return vastly different SQL depending on the configuration.
Description & motivation
Google recently released a new set of fields, session_traffic_source_last_click, documented as:
These fields have been added to the base package model.
Checklist
dbt test and python -m pytest . to validate existing tests