dandi / dandi-hub

Infrastructure and code for the dandihub
https://hub.dandiarchive.org

Responses with 503 on a regular basis #133

Open yarikoptic opened 8 months ago

yarikoptic commented 8 months ago

Since being added to our upptime testing yesterday, the hub has returned 503 eight times. See https://github.com/dandi/upptime/issues?q=is%3Aissue+Hub

Do we have logs stored somewhere that we could investigate?

Edit: there is not much information about this in the CI logs (which we now collect continuously with con/tinuous), e.g.:

2024/03/26/github/cron/20240326T080755/13ffeee/Uptime CI/105/Check status/3_Check endpoint status.txt:2024-03-26T08:08:11.2021058Z Checking https://hub.dandiarchive.org
2024/03/26/github/cron/20240326T080755/13ffeee/Uptime CI/105/Check status/3_Check endpoint status.txt-2024-03-26T08:08:11.2033172Z Current status hub up 2024-03-25T14:48:58.302Z
2024/03/26/github/cron/20240326T080755/13ffeee/Uptime CI/105/Check status/3_Check endpoint status.txt-2024-03-26T08:08:11.5208532Z Result from test 503 0.315574
2024/03/26/github/cron/20240326T080755/13ffeee/Uptime CI/105/Check status/3_Check endpoint status.txt-2024-03-26T08:08:11.5805887Z Result from test 503 0.058448
2024/03/26/github/cron/20240326T080755/13ffeee/Uptime CI/105/Check status/3_Check endpoint status.txt-2024-03-26T08:08:11.6411204Z Result from test 503 0.05931
2024/03/26/github/cron/20240326T080755/13ffeee/Uptime CI/105/Check status/3_Check endpoint status.txt-2024-03-26T08:08:11.8301897Z [master 060d500] 🟥 Hub is down (503 in 316 ms) [skip ci] [upptime]

asmacdo commented 8 months ago

The first thing to check is whether the hub deployment is functioning. Are there errors in the Event log?

kubectl describe deployment.apps/hub -n jupyterhub

If the hub deployment is working, the next step would be to check the logs of the hub pod:

kubectl logs deployment.apps/hub -n jupyterhub
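
The two checks above could be bundled into one pass along these lines. This is only a sketch: it assumes kubectl is already configured for the cluster and that the namespace and deployment names are the ones used above. The function name hub_diagnostics is made up for illustration, and the events query and the --since/--previous log options are additions that may or may not turn up anything useful for the 503s.

```shell
#!/bin/sh
# Sketch only: collect hub diagnostics in one pass.
# NAMESPACE and DEPLOYMENT default to the names used in the commands above.
NAMESPACE="${NAMESPACE:-jupyterhub}"
DEPLOYMENT="${DEPLOYMENT:-deployment.apps/hub}"

# hub_diagnostics is a hypothetical helper name, not part of any tool.
hub_diagnostics() {
    # Deployment state, including its Events section
    kubectl describe "$DEPLOYMENT" -n "$NAMESPACE"
    # Namespace events sorted by timestamp (assumption: restarts or
    # failed probes around the 503 times would be listed here)
    kubectl get events -n "$NAMESPACE" --sort-by=.lastTimestamp
    # Logs of the current hub container, limited to the last hour
    kubectl logs "$DEPLOYMENT" -n "$NAMESPACE" --since=1h
    # Logs of the previous container instance; this fails harmlessly
    # if the pod has not restarted
    kubectl logs "$DEPLOYMENT" -n "$NAMESPACE" --previous || true
}
```

After sourcing the snippet, running `hub_diagnostics > hub-diag.txt 2>&1` shortly after a 503 is reported would keep the output around for comparison with the upptime timestamps.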