ioos / ckanext-ioos-theme

IOOS Catalog as a CKAN extension
GNU Affero General Public License v3.0

Uptime monitoring for https://data.ioos.us #246

Closed mwengren closed 5 months ago

mwengren commented 1 year ago

Now that we have the IOOS Catalog on CKAN 2.9 performing better, let's make sure that we have an uptime monitoring/alerting service set up to ensure it's as available as possible for users.

Adding this issue because Catalog currently appears to be down with a 502 Bad Gateway error and I've heard of a few other instances lately where users have tried to access it but have gotten the same error.
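For illustration, the kind of availability check described here could be sketched in Python as below. This is only a minimal sketch using the standard library, not the monitoring that was eventually set up; the function names `classify_status` and `check_url`, the label strings, and the 10-second timeout are all assumptions made for the example.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code to a coarse availability label (labels are hypothetical)."""
    if 200 <= status < 400:
        return "up"
    if status in (502, 503, 504):
        # Gateway-level errors: the front-end proxy answered but the backend did not.
        return "backend_unavailable"
    return "error"


def check_url(url, timeout=10, opener=urllib.request.urlopen):
    """Fetch url once and return (label, status), with status None if unreachable.

    `opener` is injectable so the function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the exception.
        status = exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar transport errors.
        return ("unreachable", None)
    return (classify_status(status), status)
```

Calling `check_url("https://data.ioos.us")` on a schedule and alerting on anything other than `"up"` would catch the 502 case reported above, although a hosted monitoring service covers the same ground with less upkeep.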

mwengren commented 1 year ago

Based on a few spot checks, Catalog has been up and down since I posted this issue. @benjwadams mentioned it was traffic related.

What can we do to make this service more stable? Do we need a larger instance size? Would tuning the CKAN init file to enable more threads, etc., help?

benjwadams commented 11 months ago

https://github.com/ioos/catalog-docker-base/commit/3fbeede0550ad0e40515355865da5ebe93f15025 helps with this somewhat, but I also want to update the build, as noted in https://github.com/ioos/catalog-docker-base/issues/49, to see if that improves things further.

Stability issues do not appear to be caused by exhaustion of database connections, so that can be ruled out.

mwengren commented 7 months ago

Existing uptime monitoring alerts for data.ioos.us are still going to the Glider DAC team. @benjwadams to change the config so they go to the Catalog team/@mwengren instead.

mwengren commented 5 months ago

Monitoring is implemented using PRTG. Outage alerts for the https://data.ioos.us service should be sent to micah.wengren@noaa.gov and ben.adams@tetratech.com.