jupyterhub / mybinder.org-deploy

Deployment config files for mybinder.org
https://mybinder-sre.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License

Prometheus incident notes #1908

Open minrk opened 3 years ago

minrk commented 3 years ago

I don't have time to write a full incident report with grant deadlines rapidly approaching, but putting down some notes here first:

minrk commented 3 years ago

Update: the volume resize failed in a way that looks like it's not going to recover, presumably due to an OVH cluster configuration error out of our control (cc @mael-le-gal):

$ kubectl describe pvc ovh-prometheus-server
  Warning  ExternalExpanding   27m                volume_expand                              Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC.
  Warning  VolumeResizeFailed  25m (x9 over 27m)  external-resizer cinder.csi.openstack.org  resize volume ovh-managed-kubernetes-wnucc3-pvc-3743fa2e-6b64-4286-91de-5294284f0952 failed: rpc error: code = Internal desc = Could not resize volume "dcad7bcc-f918-4a48-8d6d-610e0fc4f485" to size 50: Expected HTTP response code [202] when accessing [POST https://volume.compute.gra5.cloud.ovh.net/v3/2bc16af8026e45c6a34cd6c9c4c1703a/volumes/dcad7bcc-f918-4a48-8d6d-610e0fc4f485/action], but got 406 instead
{"computeFault": {"message": "Version 3.42 is not supported by the API. Minimum is 3.0 and maximum is 3.15.", "code": 406}}
  Normal  Resizing                  21m (x10 over 27m)  external-resizer cinder.csi.openstack.org  External resizer is resizing volume ovh-managed-kubernetes-wnucc3-pvc-3743fa2e-6b64-4286-91de-5294284f0952
  Normal  FileSystemResizeRequired  21m                 external-resizer cinder.csi.openstack.org  Require file system resize of volume on node

So I deleted the PVC (kubectl delete pvc ovh-prometheus-server), restarted the pod, and re-ran the deployment action. This means a loss of Prometheus data on OVH, but that data is ephemeral anyway.
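
For anyone reading the 406 in the events above: the resizer asked for API version 3.42, while the endpoint reported a supported range of 3.0 to 3.15. These versions compare as (major, minor) integer pairs, not as decimals, so 3.42 is above 3.15 and, for example, 3.5 would be below it. A minimal sketch of that check in Python, assuming the fault message always has the shape seen in the log; the helper names `parse_microversion` and `check_microversion_fault` are made up for illustration and are not part of any OpenStack or Kubernetes client:

```python
import re


def parse_microversion(text):
    """Turn 'MAJOR.MINOR' into a tuple of ints, e.g. '3.42' -> (3, 42)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def check_microversion_fault(message):
    """Extract requested/minimum/maximum versions from a fault message.

    Assumes the message format seen in the PVC events above.
    Returns None if the message does not match that format.
    """
    pattern = (
        r"Version (\d+\.\d+) is not supported by the API\. "
        r"Minimum is (\d+\.\d+) and maximum is (\d+\.\d+)\."
    )
    match = re.search(pattern, message)
    if match is None:
        return None
    requested, minimum, maximum = (parse_microversion(v) for v in match.groups())
    return {
        "requested": requested,
        "minimum": minimum,
        "maximum": maximum,
        # Tuple comparison orders by major first, then minor.
        "supported": minimum <= requested <= maximum,
    }


# Message copied from the computeFault in the events above.
fault = "Version 3.42 is not supported by the API. Minimum is 3.0 and maximum is 3.15."
result = check_microversion_fault(fault)
print(result)
```

Since the requested version is outside the range the endpoint accepts, retrying the resize would not be expected to succeed until one side changes, which is consistent with giving up on the resize and recreating the PVC.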