2i2c-org / infrastructure

Infrastructure for configuring and deploying our community JupyterHubs.
https://infrastructure.2i2c.org
BSD 3-Clause "New" or "Revised" License

Evaluate executable runbooks for SRE use #2319

Open pnasrat opened 1 year ago

pnasrat commented 1 year ago

Context

Runbooks are documented procedures for troubleshooting and/or operational tasks such as support tasks.

Currently some of these procedures are documented in the infrastructure guide, but they rely on copying and pasting commands or writing new deployer commands. As new monitoring and alerting are added, initial runbooks may evolve as we get familiar with classes of problems.

https://hackernoon.com/simplify-devops-with-jupyter-notebook-c700fb6b503c

Potential benefits

Gives SRE a consistent environment to run production debugging from within a cluster (e.g. if someone's work laptop is unavailable, they could potentially fix an issue from a personal laptop with just a web browser, avoiding setting up deployer, etc.)

Builds sets of runnable playbooks that can be converted into automation if needed
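To make "runnable playbooks" a bit more concrete, here is a rough sketch of what a single runbook could look like as a notebook cell. This is only an illustration, not an existing 2i2c or deployer API: the `Step` and `Runbook` classes, the `pod_check` runbook, and the `example-hub` namespace are all made up for this example, and it assumes `kubectl` is installed and already authenticated in the notebook environment.

```python
import shlex
import subprocess
from dataclasses import dataclass, field


@dataclass
class Step:
    """One runbook step: a human-readable description plus the command to run."""
    description: str
    command: list[str]


@dataclass
class Runbook:
    """Hypothetical container for an ordered list of steps."""
    title: str
    steps: list[Step] = field(default_factory=list)

    def run(self, dry_run: bool = True) -> list[str]:
        """Run each step in order and return one output string per step.

        With dry_run=True (the default) commands are only rendered, not
        executed, so the runbook can be reviewed before touching production.
        """
        outputs = []
        for number, step in enumerate(self.steps, start=1):
            print(f"[{number}/{len(self.steps)}] {step.description}")
            if dry_run:
                outputs.append("DRY RUN: " + shlex.join(step.command))
                continue
            # check=True raises CalledProcessError if the command fails,
            # so a broken step stops the runbook instead of being skipped.
            result = subprocess.run(
                step.command, capture_output=True, text=True, check=True
            )
            outputs.append(result.stdout.strip())
        return outputs


# Hypothetical runbook; "example-hub" is a placeholder namespace.
pod_check = Runbook(
    title="Check hub pods",
    steps=[
        Step(
            "List pods in the hub namespace",
            ["kubectl", "get", "pods", "--namespace", "example-hub"],
        ),
    ],
)

print(pod_check.run(dry_run=True))
```

Because each step is plain data, the same list could later be fed into automation rather than run by hand, which is the conversion path mentioned above.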

Notes

This could be either in a central 2i2c-org cluster and access remote credentials using deployer, or potentially per hub.

Proposal

This is a feasibility investigation and evaluation by team story to see if such

Out of scope: runbook creation

Some implementations of executable runbooks are:

Nurtch/rubix as used by GitLab

Related: https://damianavila.github.io/blog/posts/binder-%2B-nikola-%2B-jupyter-%2B-github-blogging-resourceless.html

While Google Cloud Shell, AWS CloudShell, and Azure Cloud Shell exist, creating our own on top of JupyterHub encourages using our own infrastructure to debug.

Limitations:

Still need to be able to debug from SRE workstations in cases where an outage doesn't allow a hub to be running.

Updates and actions

No response

damianavila commented 1 year ago

I certainly love this idea (I am biased, I know 😉) and I would like to hear what others in the team think about it, so summoning the whole @2i2c-org/engineering to provide initial feedback on the idea that @pnasrat brought!