Open cceckman opened 3 months ago
I'm used to doing log-based alerting keyed on severity level, which I think GCP supports, so I'm setting that up in #95.
The next step is to start sending errors to myself by clicking through all that config :)
Wait, GitHub, too fast! We still have to configure the alerts!
I finally followed the guide for configuring a log-based alert, and now we have this alert policy for any log line with severity >= ERROR.
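For reference, a rough sketch of what "severity >= ERROR" covers. The numeric values are Cloud Logging's `LogSeverity` levels; the `matches_alert` helper and the `ALERT_FILTER` constant are made up for illustration and are not part of the actual policy config:

```python
# Cloud Logging's LogSeverity levels, in increasing order of severity.
SEVERITY = {
    "DEFAULT": 0,
    "DEBUG": 100,
    "INFO": 200,
    "NOTICE": 300,
    "WARNING": 400,
    "ERROR": 500,
    "CRITICAL": 600,
    "ALERT": 700,
    "EMERGENCY": 800,
}

# The equivalent expression in the Cloud Logging query language.
ALERT_FILTER = "severity>=ERROR"


def matches_alert(severity: str) -> bool:
    """Hypothetical helper: would a log line at this severity trip the alert?"""
    return SEVERITY[severity] >= SEVERITY["ERROR"]


if __name__ == "__main__":
    for name in SEVERITY:
        print(name, matches_alert(name))
```

So WARNING and below stay quiet, while ERROR, CRITICAL, ALERT, and EMERGENCY all notify.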
This should support multiple notification channels, so anyone with GCP access can add a new one and append it to the list for this alert.
The next step is to arrange for "things that shouldn't fail" to emit ERROR-level log messages. I'm preparing a PR that happens to arrange that for all the cron job handlers, so that should be solved soon :slightly_smiling_face:
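A minimal sketch of the idea, not the code from the PR: wrap each handler so that any unexpected exception produces one ERROR-level log line before propagating. It assumes the service runs somewhere that parses JSON lines on stderr and reads the `severity` field (Cloud Run does this, for example); the names `log`, `must_not_fail`, and `end_of_batch` are hypothetical.

```python
import json
import sys
import traceback
from functools import wraps


def log(severity, message, **fields):
    """Write one structured log line as JSON.

    Assumption: the runtime's logging agent picks up the `severity` field.
    """
    entry = {"severity": severity, "message": message, **fields}
    print(json.dumps(entry), file=sys.stderr, flush=True)


def must_not_fail(handler):
    """Decorator for handlers whose failure should page a maintainer."""

    @wraps(handler)
    def wrapper(*args, **kwargs):
        try:
            return handler(*args, **kwargs)
        except Exception as exc:
            # Emit at ERROR so the severity-based alert fires, then re-raise
            # so the caller still sees the failure.
            log(
                "ERROR",
                f"{handler.__name__} failed: {exc}",
                stack_trace=traceback.format_exc(),
            )
            raise

    return wrapper


@must_not_fail
def end_of_batch():
    # Hypothetical cron job handler; the real processing would go here.
    pass
```

Re-raising keeps the handler's existing failure behavior (e.g. a non-2xx response to the scheduler) while making sure the alert sees the error.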
(As noted in this issue: )
On significant / "should never happen" errors, it would be nice to alert the maintainers; e.g. "failed end-of-batch processing", or "couldn't send a message". Manually checking logs is not the best observability experience. :)