Closed by foolip 6 years ago
Looking now.
I noticed because of https://bit.ly/ecosystem-infra-status which @mdittmer created. Yay monitoring!
Yay monitoring. Also, getting a different error now. SSHing in to take a look.
Load is very high. Restarted some services; I'll keep an eye on it for the next 15 minutes to make sure things come back up.
Things are back. Going to continue to monitor this machine until our meeting in about 20 minutes.
Postgres is having some issues on this machine, and downtime is intermittent. Keeping this open until I can fully resolve it.
Noticed the site was down again. I've increased the RAM available on this VM from 512 MB to 2 GB and upgraded PostgreSQL to the latest Ubuntu packages. Will continue to keep an eye on it.
There was 10 minutes of downtime today as well: https://bit.ly/ecosystem-infra-status
Things are looking good. We're monitoring it from Uptrends, and https://github.com/w3c/wpt-pullresults/pull/40 is the PR to allow the site to use an external DB (RDS in this case).
Is there a public dashboard for that Uptrends monitoring? It would be interesting to see whether it disagrees in any way with https://bit.ly/ecosystem-infra-status, which I'd kind of expect since StatusCake has a 15-minute ping interval.
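To put a rough number on why a 15-minute ping interval might disagree with a faster monitor, here is a minimal sketch. It assumes pings are evenly spaced and that an outage starts at a random point relative to the ping schedule; the function name `detection_probability` is made up for illustration and is not part of any monitoring tool's API.

```python
def detection_probability(outage_minutes: float, ping_interval_minutes: float) -> float:
    """Chance that at least one ping lands inside a single outage.

    Assumption: pings are evenly spaced and the outage start is uniformly
    random relative to the ping schedule, so the chance is the outage
    length divided by the interval, capped at 1.
    """
    if ping_interval_minutes <= 0:
        raise ValueError("ping interval must be positive")
    if outage_minutes <= 0:
        return 0.0
    return min(1.0, outage_minutes / ping_interval_minutes)


if __name__ == "__main__":
    # The 10-minute outage from this thread against a 15-minute interval,
    # then against a hypothetical 1-minute interval for comparison.
    for interval in (15, 1):
        chance = detection_probability(10, interval)
        print(f"interval={interval} min: chance of noticing = {chance:.2f}")
```

Under those assumptions, a 10-minute outage is caught by a 15-minute check only about two times in three, while any monitor pinging more often than every 10 minutes would always see it, so some disagreement between the two dashboards would not be surprising.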
504 Gateway Time-out. Oops?