Closed: mwengren closed this issue 5 months ago
Based on a few spot checks, Catalog has been up and down since I posted this issue. @benjwadams mentioned it was traffic related.
What can we do to make this service more stable? Do we need a larger instance size? Tuning of the CKAN init file to enable more threads, etc?
https://github.com/ioos/catalog-docker-base/commit/3fbeede0550ad0e40515355865da5ebe93f15025 helps with this somewhat, but I will want to update the build, as noted in https://github.com/ioos/catalog-docker-base/issues/49, to see if that helps further.
Stability issues do not appear to be caused by exhaustion of database connections, so that can be ruled out.
Existing uptime monitoring alerts for data.ioos.us are still going to the Glider DAC team; @benjwadams to change the config so they go to the Catalog team/@mwengren instead.
Monitoring is implemented using PRTG. Outage alerts for the https://data.ioos.us service should be sent to micah.wengren@noaa.gov and ben.adams@tetratech.com.
Now that we have the IOOS Catalog on CKAN 2.9 performing better, let's make sure that we have an uptime monitoring/alerting service set up to ensure it's as available as possible for users.
Adding this issue because Catalog currently appears to be down with a 502 Bad Gateway error, and I've heard of a few other instances lately where users have tried to access it but have gotten the same error.
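For reference, a manual spot check for the 502 error described above could look roughly like the sketch below. This is only an illustration using the Python standard library, not the PRTG setup or any existing IOOS tooling; the function names `classify_status` and `check_url` and the status labels are made up for this example.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code to a coarse health label (labels are illustrative)."""
    if 200 <= status < 400:
        return "up"
    if status in (502, 503, 504):
        # Proxy is reachable but the backend app is not answering.
        return "gateway_error"
    return "down"


def check_url(url, timeout=10, opener=urllib.request.urlopen):
    """Fetch `url` once and return a health label.

    `opener` is injectable so the check can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx responses such as 502 Bad Gateway.
        return classify_status(exc.code)
    except OSError:
        # Covers URLError (DNS failure, refused connection) and socket timeouts.
        return "unreachable"
```

Calling `check_url("https://data.ioos.us")` from a cron job or similar and alerting on anything other than "up" would approximate what a hosted monitor does, though without retries or alert routing.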