Closed jillpe closed 10 months ago
Is having it report to GA sufficient, or are you wanting the analytics summary to display in Hyrax? Only getting it to report to GA is the easier lift (about a 5). If you are wanting to be able to look at the analytics summary in Hyrax that will be a bigger lift (multiple sprints). I know we'd talked about setting aside additional analytics works for a separate sprint, so the bigger lift may be what we set aside
We are ok with getting just the GA working for now; My understanding is that we are waiting for in-app work in Hyrax to be completed
to clarify -- We have google analytics for the whole repo currently. We would additionally like our tenants to be able to create and manage their own per-tenant GA
@aahurford To clarify there is presently a G-3TWN24N1XN
on the proprietor (e.g. https://hykucommons.org) and all of the tenants. It sounds like you would like each tenant to be able to add another analytics code that is specific to their organization (e.g. Goshen College has an analytics ID that they use for their website and want to use for their hykucommons.org tenant).
Am I understanding that correctly?
Also, if the tenant adds their GA code, it does not clobber/replace the proprietor GA code for that tenant.
@jeremyf Yep! It's my understanding that British Library has something similar implemented. The way we are doing it currently makes it difficult to retrieve stats at the tenant level.
cc @laritakr not sure if this will help but I captured this screenshot during yesterday's meeting. It may be relevant to this work?. I asked Rob why this project had its own hyrax branch and if we could/should switch them back:
tl;dr The Hyrax branch is a bit of a red herring.
Given that the request is for a per tenant Google Analytics And Hyrax has no concept of tenants Then this is not a problem that different branches/versions of Hyrax would solve.
In discussion with @laritakr, there's https://github.com/scientist-softserv/palni-palci/pull/479 to consider. It appears to implement per tenant google analytics but at the cost of proprietor level analytics.
However, it does inform how we might go about doing this.
ATTN @aahurford, please read the following to see if it makes sense.
There are two functions for Google Analytics in Hyrax
For Issue #910 we are looking to have functionality where:
In our investigation, we’ve discovered that there is presently a per-tenant configuration of Google Analytics (but no longer a proprietor level configuration). That was brought about by PR#479 addressing Issue #408.
It appears that the Google Analytics ID of the tenants (as well as the google analytics auth keys) are copied to each of the tenants; that is each tenant has the same Google Analytics ID.
The proposed solution is:
The above proposal would then ensure that when GA4-fetch integration is completed, all tenants can fetch data from google (using the proprietor’s provide auth keys). And both proprietor and tenant would have their respective information shown in the Google Analytics. This would alleviate the need for each tenant to provide an OAUth configuration for fetching data.
When a tenant chose not to specify a google analytics ID, then they would not be able to go to https://analytics.google.com/analytics/web/ to see information about their tenant. However, once the fetch integration worked, they would piggyback retrieving information from Google to show in the Hyrax native analytics environment.
Note: This assumes that tenants are not able to opt out of the proprietor sending analytics to Google.
I think this plan makes sense, but I wanted to clarify that we are currently gathering data for all tenants in the main account (GA-4 285666830) -- the problem is that there is no easy way to pull data for a single tenant.
We previously explored the option of adding sub accounts under the main one to achieve this but abandoned that plan due to lack of documentation for how exactly to do it. We are now pursuing this direction after hearing from Jenny at British Library that they have something similar set up.
So, it sounds like we are on the same page, but I want to make sure we are pursuing the least complicated path at achieving both overall and tenant level analytics, and however will be of best use to the community.
Best case scenario is that tenant admins can access their stats on their own -- but not entirely necessary if that makes the work terribly more complex. I also want to make sure that British Library's setup is reviewed to avoid reinventing the wheel.
@aahurford I'm asking @j-basford for some clarification. What I'm seeing is that they have a per tenant analytics configuration; they also send (through a different mechanism) data to Iris (at a proprietor level, but likely with tenant level segmentation).
Which would mean that:
This would mean then deviate from future Hyrax Google Analytics 4 work; because we'd lose the ability for the proprietor to fetch analytics of all tenants (requiring each tenant to configure OAuth keys to fetch that information into Hyrax).
It is possible that Iris would be suitable, and with the COUNTER metrics work for Pitt, however we introduce a mix of analytics down the road that will require notable refactoring; Hyrax assumes that it's sending analytics to the same place that it will later fetch from.
So, I'm still thinking through this and will bring it to the team for further discussion/consideration.
From @j-basford:
Hi Jeremy, we have a GA tag that we can look at for everyone's tenant, which I think is in-built into our page setup. then we can also go in the superadmin view and add a tag that each tenant supplies, and they have to look at that themselves.
The BL uses a company called Crafted to do all our GA monitoring and they create us nice dashboards to monitor; they simply added the repositories to the list. I (and Graham and Nora) can only see the repository figures, but the BL web team can see them too, plus all the other BL websites that they monitor. My understanding is that our partner tenants have a similar set up and their web teams give them the tag to use and they monitor it in-house.
Happy to show you and Amanda what it looks like in our GA dashboard if that helps. We do also use IRUS, Graham is much more au fait with those figures than I am as he tested those and is requesting some tweaks on it with the IRUS-UK team right now
@aahurford to summarize, the pathway forward is to have two analytics codes on each page. One proprietor and one tenant. However, there are still issues to resolve (see Discussion)
This resolves sending the data to Google. The next issue is getting the data out of Google; in particular with a focus on tenant level segmentation. There are two places and two "users" that the data:
The places are:
The users are:
With the above, I'm wondering if you can add hostName
to Google Analytics dashboard. I found that while working on the Sushi counter metrics (here's the related code documentation regarding hostName
dimension).
The Querying Google Analytics is a known issue for Hyrax; and would be part of Hyrax 5 (with a possible backport to Hyrax 4). Put another way, this is further down the horizon.
The Iris pathway that BL took, is a different consideration as well. Namely, it hooks into another service and sends that data. And we wouldn't be able to back fill that.
Since the proprietor's analytics is used on each tenant; that means the tenant does not have access (at this time) to Google Analytics for their tenant (unless you share the properietor analytics information; and provide a dimenion query with hostName
).
Put another way, due to the configuration reality, the best way to ensure we have historic data prior to any changes (which we should make) is to continue with sending properietor analytics; but move that information back to the proprietor level then allow each tenant to add their own analytics. This would mean the historical data is available, but would only be visible to the tenant when one of three things becomes true:
hostName
dimension can be added.Thank you for your thoughtful examination of this, Jeremy!
TODO: @jeremyf to do a writeup of tasks to accomplish this
This does not pass QA. The tenant only showed their own ID, and not the proprietor's ID.
cc @ndroark ready for your re-review
Tested on: Staging
Given I'm on a proprietor only page
When I inspect the source
Then I will see a google analytics section for the proprietor's analytics id
Given I'm on a tenant page
And the tenant has configured their analytics ID
When I inspect the source
Then I will see a google analytics section for the proprietor's analytics id
And I will see the google analytics section for the tenant's analytics id
Given I'm on a tenant page
And the tenant has NOT configured their analytics ID
When I inspect the source
Then I will see a google analytics section for the proprietor's analytics id
And I will not see the google analytics section for the tenant's analytics id
Given I'm on a tenant page
And the tenant has configured their analytics ID
And the tenant's analytics ID is the same as the proprietors
When I inspect the source
Then I will see only one google analytics section for that singular analytics id
Summary
(Updated and revised 2023-12-08)
Goal: For each page to send data to Google Analytics; either the proprietor only or the proprietor and tenant (see Use Cases).
We want:
Use Cases
Use Cases/Acceptance Criteria
```gherkin Given I'm on a proprietor only page When I inspect the source Then I will see a google analytics section for the proprietor's analytics id ``` ```gherkin Given I'm on a tenant page And the tenant has configured their analytics ID When I inspect the source Then I will see a google analytics section for the proprietor's analytics id And I will see the google analytics section for the tenant's analytics id ``` ```gherkin Given I'm on a tenant page And the tenant has NOT configured their analytics ID When I inspect the source Then I will see a google analytics section for the proprietor's analytics id And I will not see the google analytics section for the tenant's analytics id ``` ```gherkin Given I'm on a tenant page And the tenant has configured their analytics ID And the tenant's analytics ID is the same as the proprietors When I inspect the source Then I will see only one google analytics section for that singular analytics id ``` *The above handles not needing to worry too much about the data migration to rollback changes to the `AccountSettings`.*Tasks
A cautionary note: at present, the tenant's have Analytics ID along with secrets. It is highly likely that the Analytics ID and secrets are duplicated (and are in fact the Proprietor's information).
Breakdown
We want
Hyrax::Analytics.config.analytics_id
to always be the proprietor analytics id. In doing this, we'll ensure that future work regarding integration with Google Analytics version 4 will be the least disruptive. Further this better aligns with Hyrax assumptions.At the proprietor-level, we need to be able to provide:
[x] Proprietor configuration, as defined in baseline Hyku
analytics_id
:: viaENV['GOOGLE_ANALYTICS_ID']
app_name
:: viaENV['GOOGLE_OAUTH_APP_NAME']
app_version
:: viaENV['GOOGLE_OAUTH_APP_VERSION']
privkey_path
:: viaENV['GOOGLE_OAUTH_PRIVATE_KEY_PATH']
privkey_secret
:: viaENV['GOOGLE_OAUTH_PRIVATE_KEY_SECRET']
client_email
:: viaENV['GOOGLE_OAUTH_CLIENT_EMAIL']
[x] Ensure that Pals production deploy has the above environment variables for the proprietor
At the tenant-level we should only expose
google_analytics_id
.google_analytics_id
)google_oauth_app_name
google_oauth_app_version
google_oauth_client_email
google_oauth_private_key_path
google_oauth_private_key_secret
google_oauth_private_key_value
We will almost certainly want to undo most of the work from PR #479; in particular the switching logic for amending the
Hyrax.config
andHyrax::Analytics.config
:[x] Review removing AccountSettings#reload_analytics
[x] Remove the
Hyrax.config
and call to#reload_analytics
portion of AccountSettings#reload_library_config[x] Override ./app/views/shared/_ga4.html in Pals Hyku to render one or more analytics IDs.
Hyrax::Analytics.config.analytics_id
and pushing, if applicableSite.google_analytics_id
. Then making that array unique and finally iterating over that array to render one section per ID.[ ] Review the list of Pals's tenant's Google Analytics IDs; a list of all
google_analytics_id
and for eachgoogle_analytics_id
the associatedcnames
.Acceptance Criteria
Testing Instructions
See above "Use Case/Acceptance Criteria" for testing instructions.
User Documentation
The proprietor analytics are configured on the server via Environment variables. To change these requires coordination with SoftServ.
To send data to Google Analytics requires a Google Analytics ID.
To fetch data from Google Analytics requires creating a service account with Google and adding the service account keys to the server.
See Hyku README’s Google Section for details regarding the service accounts.
With credentials set, Hyku will send analytics data to Google. And when the community completes Hyrax Google 4 Analytics, it will be able to pull Google analytics data into the Hyku dashboard for all tenants.
The tenant analytics are set via a tenant administrator in the Configuration > Settings > Account form. Set the “Google Analytics” field to the tenant provided Google Analytics ID (e.g. G-10CHARLONG).