HHS / simpler-grants-gov

https://simpler.grants.gov

Search error logging #2375

Open mxk0 opened 4 weeks ago

mxk0 commented 4 weeks ago

Summary

To meet our quad 1 commitments for search, we need charting that allows us to prove:

The monthly average error rate remains below 5%, meaning fewer than five 5xx errors per 100 API requests and fewer than five 5xx errors per 100 page requests in production.
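As a rough illustration of the arithmetic behind that target, here is a minimal sketch. It assumes the monthly rate is computed from monthly totals (total 5xx responses divided by total requests) rather than as an average of daily rates; the issue does not say which is intended. The function names are hypothetical.

```python
def error_rate(total_requests: int, errors_5xx: int) -> float:
    """Return the share of requests that ended in a 5xx response (0.0 to 1.0)."""
    if total_requests <= 0:
        # Assumption: a period with no traffic counts as a 0% error rate.
        return 0.0
    return errors_5xx / total_requests


def meets_target(total_requests: int, errors_5xx: int, threshold: float = 0.05) -> bool:
    """True when the error rate is strictly below the threshold (5% by default)."""
    return error_rate(total_requests, errors_5xx) < threshold
```

The same check would be run twice, once with API request counts and once with page request counts, since the target applies to each separately.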

Acceptance criteria

coilysiren commented 3 weeks ago

Image

mxk0 commented 3 weeks ago

We have the logging we need for the API. We can view these logs in CloudWatch today, and can switch to a dedicated monitoring tool (e.g. New Relic) later.

Still need to check what we have for the frontend.

coilysiren commented 6 days ago

https://docs.aws.amazon.com/opensearch-service/latest/developerguide/monitoring-pipeline-logs.html

mxk0 commented 5 days ago

@mdragon do we have alerts like this in place already? (Copied from the acceptance criteria above.)

Ticket to set alerts for error rates >5% (both API and frontend)

mdragon commented 3 days ago

Confirming we have the same level of detail for the frontend: both the API and the frontend sit behind ELBs, which natively report 5xx error metrics at both the target-server and ELB level.

Image
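For charting, one possible way to turn those load balancer metrics into a percentage is CloudWatch metric math. The sketch below only builds the query list; it assumes the services use Application Load Balancers (namespace `AWS/ApplicationELB`, metrics `RequestCount`, `HTTPCode_Target_5XX_Count`, `HTTPCode_ELB_5XX_Count`), which the thread does not confirm. The function name and the load balancer dimension value are hypothetical placeholders.

```python
from typing import Any


def build_error_rate_queries(load_balancer: str, period_seconds: int = 86400) -> list[dict[str, Any]]:
    """Build CloudWatch metric-math queries for a 5xx error-rate percentage.

    `load_balancer` is the value of the LoadBalancer dimension
    (hypothetical example: "app/my-alb/0123456789abcdef").
    """

    def metric(query_id: str, name: str) -> dict[str, Any]:
        # Raw per-period sums; hidden from the result, used only by the expression.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/ApplicationELB",  # assumption: ALB, not Classic ELB
                    "MetricName": name,
                    "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
                },
                "Period": period_seconds,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return [
        metric("requests", "RequestCount"),
        metric("target5xx", "HTTPCode_Target_5XX_Count"),
        metric("elb5xx", "HTTPCode_ELB_5XX_Count"),
        {
            "Id": "errorrate",
            # Combining target-level and ELB-level 5xx counts is a design choice here.
            "Expression": "100 * (target5xx + elb5xx) / requests",
            "Label": "5xx error rate (%)",
            "ReturnData": True,
        },
    ]
```

The resulting list could be passed as `MetricDataQueries` to boto3's `cloudwatch.get_metric_data(...)` along with `StartTime` and `EndTime`, once per load balancer (API and frontend). Whether ELB-level 5xx responses should count toward the 5% target, and whether they are fully reflected in `RequestCount`, would need to be checked against the AWS documentation before relying on this number.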