I-GUIDE / CI_Platform

iGUIDE CI Platform Deployment
Apache License 2.0

JupyterHub frequently goes down due to NFS server issues #11

Closed rkalyanapurdue closed 3 months ago

rkalyanapurdue commented 3 months ago
  1. The hub pod for JupyterHub seems to die when it cannot mount the hub-db-dir volume from the NFS server.
  2. The NFS server VM frequently goes down without warning and takes a long time to restart.
  3. The hub becomes functional again once the NFS server is back up and the NFS service is restarted.
  4. To do: debug why the node goes down, and decide whether we need to migrate to a larger or new VM on Jetstream2.
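
While the root cause is being debugged, a small reachability probe can help correlate hub pod failures with NFS outages. The sketch below is not part of the deployment. It assumes the NFS server accepts TCP connections on port 2049 (the standard NFSv4 port), and the function name `nfs_reachable` is hypothetical.

```python
import socket

# Assumption: the NFS server listens on the standard NFSv4 TCP port.
NFS_PORT = 2049


def nfs_reachable(host: str, port: int = NFS_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    This only shows that the port accepts connections. It does not prove
    that the export can actually be mounted.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `nfs_reachable("<nfs-server-address>")` periodically and logging the result would give timestamps for when the server drops off the network. These can then be compared with the hub pod restart times.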
rkalyanapurdue commented 3 months ago

Ticket submitted to Jetstream2 team. They are taking a look at the VM logs now.

rkalyanapurdue commented 3 months ago

The issue was caused by one of the network interfaces going down on the server behind this VM. Jetstream has routed all traffic to the working interface and will look into fixing the other one.