GSA / data.gov

Main repository for the data.gov service
https://data.gov

GA not recording `org` and `publisher` correctly #4743

Closed tdlowden closed 4 months ago

tdlowden commented 6 months ago

After implementing CSS Selectors to collect organization and publisher (when present), GA shows that the implementation worked, but it records those variables for only some pageviews, not all.

How to reproduce

  1. Log into datagovGA4
  2. Create a report to show datagov_dataset_organization
  3. Filter by a specific dataset URL
  4. Observe the mix of organization values and (not set) for the same URL

Expected behavior

100% of dataset pageviews will attribute a datagov_dataset_organization and a datagov_dataset_publisher. Each dataset page will contain one value per variable, and not a value and also (not set) for some pageviews.

Actual behavior

~60% of pageviews record an org and publisher, and ~40% record (not set)

tdlowden commented 6 months ago

New goal: Have the organization and publisher drawn from CKAN directly to populate a dataLayer array on page load, as usa.gov does.

This would apply on all pages related to a dataset. For example, the pages for a single dataset should all have the same array, with organization set to State of Washington and publisher set to data.wa.gov.

This tutorial should help: https://www.analyticsmania.com/post/ultimate-google-tag-manager-data-layer-tutorial/

From there, I can use GTM dataLayer variables to collect the data and send on pageviews and file downloads to GA, to associate those events with the org.
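A minimal sketch of what that page-load push could look like, for discussion only. The helper name `buildDatasetEntry`, the input field names (`organization.title`, `publisher`), and the use of the GA dimension names as dataLayer keys are all assumptions, not what the theme actually emits. In the real page the values would be rendered server-side from CKAN.

```javascript
// Hypothetical helper: maps a CKAN-style dataset object to a dataLayer entry.
// The input shape (organization.title, publisher) is an assumption.
function buildDatasetEntry(dataset) {
  return {
    datagov_dataset_organization: dataset.organization?.title,
    datagov_dataset_publisher: dataset.publisher,
  };
}

// Use window in a browser; fall back to globalThis so the sketch also runs in Node.
const root = typeof window !== "undefined" ? window : globalThis;

// Create the queue if nothing has created it yet, then push the dataset values.
root.dataLayer = root.dataLayer || [];
root.dataLayer.push(
  buildDatasetEntry({
    organization: { title: "State of Washington" },
    publisher: "data.wa.gov",
  })
);
```

With something like this in place, GTM dataLayer variables keyed on those two names could read the values for pageview and file-download events.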

tdlowden commented 5 months ago

@robert-bryson any updates here?

gujral-rei commented 4 months ago

@robert-bryson , please update the ticket.

robert-bryson commented 4 months ago

The current approach (the JS Google Tag Manager script) is the one recommended by Tag Manager, but it is obviously not working for us. There is a way to generate it via an official server-side tagging option. The docs describe a very different scenario from the one presented above, so I am working on achieving the above with the official functionality if possible.

robert-bryson commented 4 months ago

Draft PR at https://github.com/GSA/ckanext-datagovtheme/pull/204.

tdlowden commented 4 months ago

FYI:

https://github.com/GSA/data.gov/issues/4783#issuecomment-2214240705

Thanks to @jbrown-xentity for a quick fix PR

tdlowden commented 4 months ago

@robert-bryson added

    window.dataLayer = window.dataLayer || [];
    dataLayer.push({

which resolved the GTM issue. Moving to blocked until pentesting is done so we can push to prod
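For anyone reading later, here is a rough illustration of what that guard line does. This is one plausible reading of why it helped, not a confirmed root cause: if a push runs before anything has created `dataLayer`, calling `.push` on it throws, and the values never reach GTM. The `ensureDataLayer` helper and the plain objects standing in for `window` are hypothetical.

```javascript
// Hypothetical helper wrapping the guard: reuse an existing queue, else create one.
function ensureDataLayer(scope) {
  scope.dataLayer = scope.dataLayer || [];
  return scope.dataLayer;
}

// Case 1: nothing has created dataLayer yet (plain object stands in for window).
const freshPage = {};
ensureDataLayer(freshPage).push({
  datagov_dataset_organization: "State of Washington",
});

// Case 2: an earlier script already queued an entry; the guard must not wipe it.
const pageWithQueue = { dataLayer: [{ event: "earlier_push" }] };
ensureDataLayer(pageWithQueue).push({
  datagov_dataset_publisher: "data.wa.gov",
});
```

Either way the push lands in an array, and entries queued earlier are preserved.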

tdlowden commented 4 months ago

Pushed to prod. Subject to data populating, will QA Wednesday

tdlowden commented 4 months ago

As of 7/17, we now have 98% accuracy. Not perfect, but the remainder is well within a margin of error for bots, so I feel comfortable making the data public.
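For reference, the accuracy figure is just the share of pageviews that carry a real value rather than (not set). A small sketch of that arithmetic follows. The row shape and the counts are made up for illustration and are not the numbers from the GA report.

```javascript
// Hypothetical rows: illustrative counts only, not figures from the GA report.
const rows = [
  { organization: "State of Washington", pageviews: 49 },
  { organization: "(not set)", pageviews: 1 },
];

// Share of pageviews whose organization is anything other than "(not set)".
function attributedShare(reportRows) {
  const total = reportRows.reduce((sum, r) => sum + r.pageviews, 0);
  const attributed = reportRows
    .filter((r) => r.organization !== "(not set)")
    .reduce((sum, r) => sum + r.pageviews, 0);
  return total === 0 ? 0 : attributed / total;
}

const share = attributedShare(rows); // 49 of 50 pageviews attributed
```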
