Velir / dbt-ga4

dbt Package for modeling raw data exported by Google Analytics 4. BigQuery support, only.
MIT License
289 stars 128 forks

Differences between GA4 Reports and BigQuery reports - # of Engaged Sessions #329

Open ShaiDiamant opened 2 weeks ago

ShaiDiamant commented 2 weeks ago

According to GA4, "Engaged Sessions":

The number of sessions that lasted longer than 10 seconds, or had a key event, or had 2 or more screen or page views.

I am not sure of the reason, but I think the "session_engaged" event param does not cover everything in this definition. I also noticed another param, "engaged_session_event", that might explain the difference.

In any case, I see about 50% fewer engaged sessions when reporting from BigQuery than in the GA4 report on the exact same data. The rest of the metrics are identical: number of users, number of sessions, even the average session engagement time.

I tried counting the sessions that had either a max "session_engaged" of 1, or 2 or more page views, and the number stayed the same, so that doesn't seem to be the issue.

I'm still trying to locate the exact reason for the difference and will update if I find any more information.
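For reference, the check described above can be sketched roughly as follows. This is only an illustration on toy rows, not the package's logic: the field names, the `KEY_EVENTS` set, and the use of summed `engagement_time_msec` as a stand-in for "lasted longer than 10 seconds" are all assumptions.

```python
from collections import defaultdict

# Toy rows standing in for flattened export events (hypothetical field names).
events = [
    {"session_key": "a", "event_name": "page_view", "session_engaged": "1", "engagement_time_msec": 4000},
    {"session_key": "a", "event_name": "page_view", "session_engaged": "1", "engagement_time_msec": 8000},
    {"session_key": "b", "event_name": "page_view", "session_engaged": None, "engagement_time_msec": 2000},
    {"session_key": "b", "event_name": "page_view", "session_engaged": None, "engagement_time_msec": 1000},
    {"session_key": "c", "event_name": "page_view", "session_engaged": "0", "engagement_time_msec": 500},
    {"session_key": "d", "event_name": "purchase", "session_engaged": None, "engagement_time_msec": 0},
]
KEY_EVENTS = {"purchase"}  # assumption: which events count as key events


def engaged_sessions(rows, key_events=KEY_EVENTS):
    """Return (sessions engaged by flag, sessions engaged by the quoted definition)."""
    stats = defaultdict(lambda: {"flag": False, "views": 0, "msec": 0, "key": False})
    for r in rows:
        s = stats[r["session_key"]]
        # Compare as a string so both "1" and 1 are treated as engaged.
        s["flag"] = s["flag"] or str(r.get("session_engaged")) == "1"
        s["views"] += r["event_name"] in ("page_view", "screen_view")
        s["msec"] += r.get("engagement_time_msec") or 0
        s["key"] = s["key"] or r["event_name"] in key_events
    by_flag = {k for k, s in stats.items() if s["flag"]}
    by_definition = {
        k for k, s in stats.items()
        if s["msec"] > 10_000 or s["key"] or s["views"] >= 2
    }
    return by_flag, by_definition


by_flag, by_definition = engaged_sessions(events)
print(sorted(by_flag))        # sessions where the flag was ever 1
print(sorted(by_definition))  # sessions matching any part of the definition
```

Comparing the two sets session by session (rather than only the totals) should show which part of the definition, if any, the flag is missing.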

dgitis commented 2 weeks ago

Are you filtering any events from your export?

In particular, the user_engagement event is often filtered but that event often carries important engagement parameters.
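As a rough illustration of why that matters, here is a toy sketch comparing the engaged-session count with and without `user_engagement` events. The tuple layout and the assumption that most sessions carry the `session_engaged` flag only on `user_engagement` are hypothetical, chosen just to show the effect.

```python
# Toy events as (session_id, event_name, session_engaged); layout is hypothetical.
events = [
    ("s1", "page_view", None),
    ("s1", "user_engagement", "1"),
    ("s2", "page_view", None),
    ("s2", "user_engagement", "1"),
    ("s3", "page_view", "1"),
]


def count_engaged(rows):
    """Count distinct sessions that have at least one event flagged as engaged."""
    return len({sid for sid, _name, engaged in rows if engaged == "1"})


unfiltered = count_engaged(events)
# Simulate an export filter that drops user_engagement events.
filtered = count_engaged([e for e in events if e[1] != "user_engagement"])
print(unfiltered, filtered)
```

In this toy data the session count is unchanged by the filter, but the engaged-session count drops, which is the pattern described in the report.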

Otherwise, I would look at differences in our definitions of sessions.

Are you using multiple streams with identifiers, specifically the cid, being passed between streams? That could do it because, to prevent session ID collisions in multi-site setups, we split session identifiers by stream. If you are deliberately passing the cid between streams, then adding the stream to the session_key would break these sessions.

Still, I would expect the number of sessions to be off in this case.
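To make the stream-splitting effect concrete, here is a minimal sketch. It uses plain tuples as keys rather than the package's actual key construction, and the field names and values are made up for illustration.

```python
# Toy events; the first two share a client ID and session ID across two streams.
events = [
    {"stream_id": "web-1", "client_id": "111.222", "ga_session_id": 1700000000},
    {"stream_id": "web-2", "client_id": "111.222", "ga_session_id": 1700000000},
    {"stream_id": "web-1", "client_id": "333.444", "ga_session_id": 1700000050},
]


def session_count(rows, include_stream):
    """Count distinct sessions, optionally making the stream part of the key."""
    keys = set()
    for r in rows:
        key = (r["client_id"], r["ga_session_id"])
        if include_stream:
            key = (r["stream_id"],) + key
        keys.add(key)
    return len(keys)


without_stream = session_count(events, include_stream=False)
with_stream = session_count(events, include_stream=True)
print(without_stream, with_stream)
```

With the stream in the key, the cross-stream session is counted twice, so the total session count changes, not only the engaged-session count.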

Unless you are filtering events before they go to BQ, I can't think of any reason for the discrepancy.

ShaiDiamant commented 2 weeks ago

Hmm, no, we don't filter events before they go to BQ, and we only use one stream with one property. I'm still trying to locate the difference, but no success so far.