DFE-Digital / technical-guidance

Principles, standards and guidance for digital delivery teams
https://technical-guidance.education.gov.uk/

Guidance around external monitoring #17

Open petenorth opened 5 years ago

petenorth commented 5 years ago

The gov.uk service manual

https://www.gov.uk/service-manual/technology/monitoring-the-status-of-your-service

states that sites should have both internal and external monitoring.

External monitoring is the monitoring you should set up outside of your service which keeps checking your systems even if your infrastructure goes down.

Status pages have been implemented for existing services using ‘status as a service’ offerings, the most popular within DfE being

petenorth commented 5 years ago

@himal-mandalia Looking at StatusCake's price list, it seems odd that status checking isn't covered by an organisation-wide subscription.

A single project that needs team support (mandatory?) could pay $12.49/month to check a single endpoint URL, or the organisation could pay $41.66/month to cover 300 service endpoints.
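To make the comparison concrete, here is a rough sketch of the arithmetic using only the two prices quoted above. It assumes each project monitors exactly one endpoint on its own plan, which may not hold in practice; the function names are purely illustrative.

```python
import math

# Figures quoted in the comment above (USD per month).
SINGLE_ENDPOINT_PLAN_USD = 12.49   # one endpoint URL
ORG_PLAN_USD = 41.66               # up to 300 service endpoints
ORG_PLAN_ENDPOINTS = 300


def monthly_cost_separate(projects: int) -> float:
    """Cost if every project buys its own single-endpoint plan (assumption: one endpoint each)."""
    return round(projects * SINGLE_ENDPOINT_PLAN_USD, 2)


def break_even_projects() -> int:
    """Smallest number of separate projects whose combined cost exceeds the organisation plan."""
    return math.floor(ORG_PLAN_USD / SINGLE_ENDPOINT_PLAN_USD) + 1


def org_cost_per_endpoint() -> float:
    """Per-endpoint cost of the organisation plan if all 300 endpoints were in use."""
    return round(ORG_PLAN_USD / ORG_PLAN_ENDPOINTS, 2)


if __name__ == "__main__":
    print("Break-even number of projects:", break_even_projects())
    print("Cost of that many separate plans:", monthly_cost_separate(break_even_projects()))
    print("Organisation plan per endpoint at full use:", org_cost_per_endpoint())
```

On these numbers the organisation plan becomes the cheaper option once a fourth project would otherwise be buying its own plan.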

himal-mandalia commented 5 years ago

@petenorth I've added a ticket to the dev meeting board:

https://trello.com/c/lciEUi3n/19-external-monitoring

We should discuss, reach a consensus and see about getting a subscription in place - it might just be for Teacher Services rather than department-wide but that's good enough for now.

petenorth commented 5 years ago

@himal-mandalia I missed the last biweekly meeting. Has this moved along at all?

himal-mandalia commented 5 years ago

@petenorth I missed it too - worth asking @misaka. Probably best in Slack (#teacher-services-devs).

petenorth commented 5 years ago

@himal-mandalia Am going down an Application Insights route. This gives us multi-region availability monitoring of a URL at a low price, so in theory if a region where our infrastructure/services are located goes down then we should get alerted.

It is a much cheaper option.

Also, it can be created via a template, so it plays nicely with CI/CD.

https://docs.microsoft.com/en-us/azure/azure-monitor/app/monitor-web-app-availability#create-a-url-ping-test
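
For anyone unfamiliar with the idea, the logic behind a multi-region URL check can be sketched roughly as below. This is not how Application Insights is implemented or configured; it only illustrates the principle of probing one URL from several locations and alerting when enough of them fail. The location names, the `min_failed_locations` threshold and the function names are all hypothetical, and in a real setup each probe would run from infrastructure outside the service being monitored.

```python
from typing import Callable, Dict
from urllib.request import urlopen


def http_probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers URLError/HTTPError (4xx/5xx), connection failures and timeouts.
        return False


def run_checks(url: str, probes: Dict[str, Callable[[str], bool]]) -> Dict[str, bool]:
    """Run one probe per location and record whether each one succeeded."""
    return {location: probe(url) for location, probe in probes.items()}


def should_alert(results: Dict[str, bool], min_failed_locations: int = 2) -> bool:
    """Alert only when several locations fail, to avoid paging on a single flaky probe."""
    failed = [location for location, ok in results.items() if not ok]
    return len(failed) >= min_failed_locations


if __name__ == "__main__":
    # Hypothetical locations; here they all run locally, purely for illustration.
    probes = {"location-a": http_probe, "location-b": http_probe, "location-c": http_probe}
    results = run_checks("https://www.gov.uk/", probes)
    print(results, "alert:", should_alert(results))
```

Requiring failures from more than one location is a design choice to reduce false alarms; the threshold would need tuning per service.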

himal-mandalia commented 5 years ago

@petenorth nice. Looks like the path of least resistance right now. I'll socialise via the developers Slack channel, then we should document this.