osulp / Scholars-Archive

ScholarsArchive@OSU, institutional repository for Oregon State University
https://ir.library.oregonstate.edu/

Preserve Google Analytics 3 legacy data for SA #2599

Closed carakey closed 1 week ago

carakey commented 1 month ago

Descriptive summary

The Google Analytics 3 property for SA@OSU will be shut down on July 1, 2024. See Google Support. We want to preserve usage statistics for SA content so we can eventually reconnect those numbers with the works for comprehensive usage data over time.

carakey commented 1 month ago

We're currently unable to archive the data because of UIT restrictions on Google accounts. @KennaW has an outstanding service desk ticket requesting EITHER approval for the Google Sheets add-on for Analytics or access to SA data for a non-university Google account.

carakey commented 3 weeks ago

Properties:

SA@OSU Hyrax / All Web Site Data / ID 148834124. Dates: July 1, 2017 - June 30, 2023

Scholars Archive / Scholars Archive Production / ID 71605283. Dates: July 1, 2013 - Dec 31, 2017

We're interested in views per page for ALL the work and fileset pages. It doesn't need to be daily data; cumulative/lifetime totals are fine.

In the UI I can view this data at Behavior >> Site Content >> All Pages. However, it seems I can only export the top 10 rows of per-page data, no matter which view is selected in the UI. The rest of the exported data is visits per day.

carakey commented 1 week ago

@KennaW I believe the data for paths with "concern" from June 24 (in 148834124 for 2017-07-01 to 2023-06-30) is usable and I have grabbed a CSV export.

We still need data for paths with “downloads” for the same property (148834124) and same dates (2017-07-01 to 2023-06-30).

As above, we also still need data for the earlier SA instance, Property ID 71605283; dates: July 1, 2013 - Dec 31, 2017. I believe we're looking for things with “handle” in the path.

carakey commented 1 week ago

CSVs are saved to SA Google Drive here: https://drive.google.com/drive/u/0/folders/1vCG7L_jucq1h3jOyecC7xLL3MvogXC2X

Hyrax data had to be split up due to size. One set is hits for paths containing "concern"; one is for paths containing "downloads" with a ?locale= parameter; and the last is for paths containing "downloads" without a ?locale= parameter.
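
For whenever we get to reconnecting these numbers with works, here is a rough sketch of how the three CSV sets could be folded back into one lifetime total per work or fileset. It rests on several assumptions that should be checked against the actual files: that the exports have columns named `Page` and `Pageviews`, that any comment or summary lines around the table have been stripped first, that paths look like `/concern/<type>/<id>` and `/downloads/<id>`, and that the last path segment is the ID. The function names and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit, parse_qs


def classify(path):
    """Return which of the three export sets a page path belongs to, or None."""
    parts = urlsplit(path)
    if "/concern/" in parts.path:
        return "concern"
    if "/downloads/" in parts.path:
        has_locale = "locale" in parse_qs(parts.query, keep_blank_values=True)
        return "downloads_locale" if has_locale else "downloads_no_locale"
    return None


def object_id(path):
    """Last path segment without the query string (assumed to be the work/fileset ID)."""
    return urlsplit(path).path.rstrip("/").rsplit("/", 1)[-1]


def total_views(csv_text, page_col="Page", views_col="Pageviews"):
    """Sum pageviews per (kind, ID), folding ?locale= variants together."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        bucket = classify(row[page_col])
        if bucket is None:
            continue  # not a work, fileset, or download path
        kind = "concern" if bucket == "concern" else "downloads"
        views = int(row[views_col].replace(",", ""))  # exports may write "1,200"
        totals[(kind, object_id(row[page_col]))] += views
    return totals


# Made-up rows, only to show the shape of the input.
sample = """Page,Pageviews
/concern/articles/abc123?locale=en,"1,200"
/concern/articles/abc123,300
/downloads/xyz789?locale=en,40
/downloads/xyz789,2
/about,5
"""
totals = total_views(sample)
```

Calling `total_views` on each file and adding the resulting counters together would give one combined tally, since `Counter` objects support `+`. The "handle" paths from the older property would need their own rule, which is not attempted here.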