Status: Open. Opened by sj-williams 9 months ago.
Further diagnosis of the R Shiny app issue, provided by Glenn Christmas: https://github.com/ministryofjustice/data-platform-support/issues/429
For reference, the follow-up monitoring work is tracked in the legacy ticket: https://github.com/ministryofjustice/cloud-platform/issues/4538
Background
Find out in which scenarios a pod can drive up CPU usage on its underlying node.
The RShiny problems were tracked down to a liveness probe regularly hitting an endpoint that opened a new session on every request; those sessions were never closed. Can we recreate this scenario with an app?
What we want to recreate here is a node's CPU usage becoming critical and breaking workloads on the node, followed by k8s services (such as metrics server and calico) failing.
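One possible starting point for the recreation is a small stand-in app whose probe endpoint leaks a session on every request. The sketch below is an assumption-heavy illustration, not the RShiny behaviour itself: it uses Python's standard library rather than R, and the names `open_session`, `session_worker`, `LeakyHandler`, and `make_server`, the tick interval, and the per-tick workload are all hypothetical. Each GET starts a background thread that does a small amount of periodic work and is never stopped, so CPU usage should grow with the number of probe hits.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-in for per-session state; a real Shiny session differs.
sessions = []
sessions_lock = threading.Lock()


def session_worker(stop_event, tick_seconds):
    """Simulate the periodic work of an open session until asked to stop."""
    while not stop_event.is_set():
        sum(i * i for i in range(10_000))  # small burst of CPU per tick
        stop_event.wait(tick_seconds)


def open_session(tick_seconds=0.1):
    """Open a session that the app itself never closes (the suspected bug)."""
    stop_event = threading.Event()
    worker = threading.Thread(
        target=session_worker, args=(stop_event, tick_seconds), daemon=True
    )
    worker.start()
    with sessions_lock:
        sessions.append((worker, stop_event))
        return len(sessions)


class LeakyHandler(BaseHTTPRequestHandler):
    """Every GET, including a liveness probe, leaks one session."""

    def do_GET(self):
        count = open_session()
        body = f"open sessions: {count}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def make_server(port):
    """Build the server; the caller decides when to call serve_forever()."""
    return ThreadingHTTPServer(("0.0.0.0", port), LeakyHandler)
```

To try it, call `make_server(8080).serve_forever()` (the port is an arbitrary choice) and point a liveness probe at `/`. Without CPU limits on the pod, the accumulated sessions would be expected to eat into node CPU over time; whether that is enough to affect metrics server or calico is exactly what the experiment would need to show.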
DOD
Link to notes for Rshiny app issues: https://docs.google.com/document/d/1qAxCYFzDQta00l4v3IZ1CUyjOECuWAXDh3oAqNF0UtA/edit