agile-josiah closed this 1 month ago
Drafting the following documentation:
1. Service Blueprint: On-Call Use Case - documents the on-call support process by outlining internal and external (partner team) actions and steps, the technology used, and the information shared (TBD: could also collect our ideas for SLOs/SLAs). Ticket #2612
Collaboration: working session after daily sync (date TBD)
2. Comms Strategy - generate copy for the rollout announcement. Ticket #2586
Collaboration: async editing and commenting
Note: This documentation is a first iteration and is meant to be edited and updated as changes are made, especially as we develop an incident response plan and disaster recovery plan.
We decided to try out our new incident response plan for a sprint and to prioritize the communication piece next sprint.
User Story
As a VRO engineer, I want on-call responsibilities to be documented, so that partner teams that require support and on-call VRO engineers have a reference for how to get and give support.
Work with Bianca on the comms aspect of this, including how we roll out the comms, as well as the cadence for getting partner team feedback.
Acceptance Criteria
Not included in this work
Fully defining the on-call process. This should be an MVP that is iterated on as unknown or new support issues become relevant (e.g., communicating with LHDI or triaging a pod in the k8s cluster).
OUT OF SCOPE: Incident response process, anything to do with monitoring and alerting, disaster recovery plan. All of these are important but not in scope for this.
Notes about work
A spike could help us define SLOs/SLAs, and research from @bianca-rivera could help us learn what our partner teams need and what VRO engineers are willing to support, so that we can meet somewhere in the middle.
We fully expect that this documentation will need to be fleshed out as we develop an incident response plan, a disaster recovery plan, etc.
Tech Spec reference