alphagov / search-api

Search API for GOV.UK
https://docs.publishing.service.gov.uk/apps/search-api.html
MIT License

TESTING BRANCH - Migrate GA4 logic to search-api #2964

Open georges1996 opened 1 month ago

georges1996 commented 1 month ago

Ticket: https://trello.com/c/SHc04ixR/110-search-analytics-pipeline-ua-to-ga4

The final output is the 'popularity' fields: https://www.gov.uk/api/search.json?fields=popularity_b,popularity
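For anyone wanting to inspect these fields, here is a minimal sketch in Ruby. It assumes the response body is a JSON object with a `results` array whose entries carry the requested fields; that shape, the helper names `extract_popularity` and `fetch_popularity`, and the sample values are illustrative and not taken from this PR.

```ruby
require "json"
require "net/http"
require "uri"

SEARCH_URL = "https://www.gov.uk/api/search.json?fields=popularity_b,popularity"

# Pulls the two popularity fields out of a search response body.
# Assumption: the body is a JSON object with a "results" array.
def extract_popularity(body)
  JSON.parse(body).fetch("results", []).map do |result|
    {
      popularity: result["popularity"],
      popularity_b: result["popularity_b"],
    }
  end
end

# Fetching is kept separate so the parsing can be exercised offline.
def fetch_popularity(url = SEARCH_URL)
  extract_popularity(Net::HTTP.get(URI(url)))
end

# Offline usage with a made-up response body (values are hypothetical):
sample = '{"results":[{"popularity":0.5,"popularity_b":12}]}'
extract_popularity(sample).each { |row| puts row.inspect }
```

Calling `fetch_popularity` before and after a `page_traffic:load` run is one way to spot-check that the values have changed as expected.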

As part of the migration to GA4, the task was to understand the existing Python application (https://github.com/alphagov/search-analytics) and its output, then to update the Ruby application (https://github.com/alphagov/search-analytics-ga4/) so that it produces output identical, or very similar, to that of the Python app.
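One way to check "identical/very similar" is to compare the two outputs path by path. The sketch below is not the comparison used for this PR; it assumes each app's output can be loaded into a hash of page path to view count, and the helper name `popularity_mismatches` and the 1% default tolerance are hypothetical.

```ruby
# Returns the paths whose counts differ by more than `tolerance`
# (relative to the larger count), plus any path present in only one output.
# Assumption: both arguments are hashes of page path => numeric view count.
def popularity_mismatches(python_output, ruby_output, tolerance: 0.01)
  (python_output.keys | ruby_output.keys).select do |path|
    a = python_output[path]
    b = ruby_output[path]
    next true if a.nil? || b.nil? # path missing from one side

    largest = [a.abs, b.abs].max
    largest.positive? && (a - b).abs / largest.to_f > tolerance
  end
end

# Hypothetical sample data, for illustration only.
python_output = { "/a" => 100, "/b" => 50, "/c" => 10 }
ruby_output   = { "/a" => 100, "/b" => 60, "/d" => 5 }
puts popularity_mismatches(python_output, ruby_output).inspect
```

A relative tolerance is used here because small differences in how the two analytics sources count views are more meaningful on low-traffic pages than on high-traffic ones; an absolute threshold would be an equally reasonable choice.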

Once the output was correct, the next step was to migrate the Search Analytics GA4 code from the Ruby repo into the search-api repo. The rationale is that the task is easier to run as part of search-api: it can be run on a pod in K8s, whereas running search-analytics from GitHub Actions was disastrous for catching failures. A single codebase is also easier to maintain than a separate repo that teams have to look at. This decision also brings down costs, as we no longer need to maintain another codebase, and we no longer need an S3 bucket to dump the data into for search-api to pick up. It can all be done as part of the rake page_traffic:load task.

Rollback Plan

In the unlikely event that this fails in production, we have a way to roll back: we revert this PR, which returns search-api to using the data in the S3 bucket. We will then have to rerun the cron jobs in Argo to reset the popularity values.