Closed richgreen-moj closed 7 months ago
mocked up a badge dashboard and showed the team at yesterday's stand-up (received some good feedback) https://user-guide.modernisation-platform.service.justice.gov.uk/user-guide/workflow-status.html
have also been reaching out to other teams to see if / how they are monitoring workflows (most just use a Slack message on failure).
had a quick look at some 3rd party tools (Datadog, thundra-foresight) and a few more; need to dig in a little deeper to assess cost and value for money.
Added badges to repo readme to test visibility improvement 🤔 https://user-guide.modernisation-platform.service.justice.gov.uk/user-guide/workflow-status.html
During the spike, I delved into the Dashboard/Status Page as demonstrated here. I presented this to the team and received positive feedback. Additionally, I implemented the concept of status badges on the repository homepage (example here). To further refine these features, I've created two GitHub tickets:
Issue #6675 - Implementation of status badges on repositories.
Issue #6676 - Development of a comprehensive dashboard overview status page.
In my exploration, I also assessed several third-party solutions. However, the majority were costly, offered poor value for money, or were outdated without ongoing support. Moreover, these external tools offered limited integration capabilities, which would require additional user interface interactions. Such an approach could distract our engineers and users from the primary codebase, diminishing efficiency.
Considering these challenges, and in light of GitHub's Actions Usage Metrics feature (in public beta as of March 28, 2024), investing time in these third-party solutions may not be the optimal strategy at this juncture. However, revisiting GitHub's integrated metrics at a later stage should be considered.
User Story
As an MP Engineer
I want to have better visibility of the status of my GitHub Actions
So that I can spot errors and avoid conflicts more easily
Value / Purpose
We have certain workflows which take a long time to complete, e.g. Terraform: Scheduled Baseline.
During a busy working day we may have multiple PRs we wish to merge, but we want to avoid hitting API errors etc. by not having too many GitHub Actions running concurrently. We do get failure notifications, but these are more in the moment and can get lost in Slack.
I feel like we could benefit from a bird's-eye view of the status of our GitHub Actions to aid us in spotting failing actions and helping to deconflict when merging in PRs.
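As a rough illustration of the overview described above, the sketch below summarises recent workflow runs returned by GitHub's "list workflow runs for a repository" REST endpoint. It is only a starting point, not the dashboard the team built: the function names `fetch_runs` and `summarise`, the `per_page` value, and the choice to treat `queued` and `in_progress` as "active" are all assumptions made for this example.

```python
import json
import urllib.request

# REST endpoint that lists recent workflow runs for a repository.
RUNS_URL = "https://api.github.com/repos/{owner}/{repo}/actions/runs?per_page=50"


def fetch_runs(owner, repo, token=None):
    """Fetch recent workflow runs (hypothetical helper; needs network access)."""
    req = urllib.request.Request(RUNS_URL.format(owner=owner, repo=repo))
    req.add_header("Accept", "application/vnd.github+json")
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def summarise(payload):
    """Split runs into those still active and those that completed with a failure."""
    runs = payload.get("workflow_runs", [])
    active = [r["name"] for r in runs if r["status"] in ("queued", "in_progress")]
    failed = [
        r["name"]
        for r in runs
        if r["status"] == "completed" and r.get("conclusion") == "failure"
    ]
    return {"active": active, "failed": failed}
```

Calling `summarise(fetch_runs(owner, repo))` before merging a PR would show which workflows are still running and which recently failed, without relying on Slack notifications.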
I'm not sure how best we could do this but some ideas are:
something to consider - https://docs.github.com/en/actions/monitoring-and-troubleshooting-workflows/adding-a-workflow-status-badge
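To make the badge idea concrete, here is a minimal sketch that builds the README markdown for a workflow status badge, following the URL pattern described in the GitHub documentation linked above. The helper name `badge_markdown` and the owner, repository and workflow file names in the usage note are hypothetical placeholders, not values from this repository.

```python
from urllib.parse import quote


def badge_markdown(owner, repo, workflow_file, branch=None):
    """Return markdown for a status badge that links to the workflow's runs page."""
    workflow_url = (
        f"https://github.com/{owner}/{repo}/actions/workflows/{quote(workflow_file)}"
    )
    badge_url = f"{workflow_url}/badge.svg"
    if branch:
        # Slashes in branch names must be percent-encoded, so no characters are "safe".
        badge_url += f"?branch={quote(branch, safe='')}"
    return f"[![{workflow_file}]({badge_url})]({workflow_url})"
```

For example, `badge_markdown("example-org", "example-repo", "scheduled-baseline.yml")` produces a line that can be pasted into a README, and one such line per workflow gives a simple status table.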
Useful Contacts
@richgreen-moj
Additional Information
No response
Proposal / Unknowns
No response
Definition of Done