rust-lang / crates-io-heroku-metrics

Heroku metrics collector for crates.io
Apache License 2.0

Summary of errors in logs that are not yet monitored #3

Open jtgeibel opened 3 years ago

jtgeibel commented 3 years ago

Here is a summary of error="" entries in our logs that we may want to monitor more closely in our metrics. We may want to follow Heroku's approach and assign code values to these error cases. We should also ensure that they all carry an at=error prefix so that they can be easily ingested from the logs.

Additionally, we may want to add an at=warn prefix that could be used to flag slow requests and other operationally interesting events that aren't strictly errors.
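
To make the ingestion side concrete, here is a rough sketch of how a collector could pick up at=error and at=warn lines together with their code values. It assumes the lines are logfmt-style key=value pairs, as Heroku's router lines are. The function names (parse_logfmt, classify) and the Level enum are hypothetical and not taken from this repository, and the parser is simplified (no escape handling inside quoted values).

```rust
use std::collections::HashMap;

/// Severity derived from the `at=` field of a log line.
#[derive(Debug, PartialEq)]
enum Level {
    Error,
    Warn,
    Other,
}

/// Parse a logfmt-style line (`key=value key2="quoted value"`) into a map.
/// Simplified sketch: escaped quotes inside values are not handled.
fn parse_logfmt(line: &str) -> HashMap<String, String> {
    let mut fields = HashMap::new();
    let mut rest = line.trim_start();
    while !rest.is_empty() {
        // The key runs up to the first '=' or whitespace character.
        let key_end = rest
            .find(|c: char| c == '=' || c.is_whitespace())
            .unwrap_or(rest.len());
        let key = &rest[..key_end];
        rest = &rest[key_end..];

        let value;
        if let Some(after_eq) = rest.strip_prefix('=') {
            if let Some(quoted) = after_eq.strip_prefix('"') {
                // Quoted value: read up to the closing quote (or end of line).
                let end = quoted.find('"').unwrap_or(quoted.len());
                value = &quoted[..end];
                rest = quoted.get(end + 1..).unwrap_or("");
            } else {
                // Bare value: read up to the next whitespace.
                let end = after_eq.find(char::is_whitespace).unwrap_or(after_eq.len());
                value = &after_eq[..end];
                rest = &after_eq[end..];
            }
        } else {
            // A key with no '=' is stored with an empty value.
            value = "";
        }

        if !key.is_empty() {
            fields.insert(key.to_string(), value.to_string());
        }
        rest = rest.trim_start();
    }
    fields
}

/// Map the `at=` field to a severity level.
fn classify(fields: &HashMap<String, String>) -> Level {
    match fields.get("at").map(String::as_str) {
        Some("error") => Level::Error,
        Some("warn") => Level::Warn,
        _ => Level::Other,
    }
}
```

For a router-style line such as `at=error code=H12 desc="Request timeout"`, parse_logfmt yields the fields at, code, and desc, and classify returns Level::Error. A counter keyed on the code field could then feed the metrics, so platform-level and application-level errors would share one ingestion path.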

Turbo87 commented 3 years ago

While logging certainly makes sense, I'm wondering if it would be better to use Sentry more for these things 🤔

jtgeibel commented 3 years ago

I think we should do both where possible. My original motivation for investigating was to make sure we capture Heroku platform-level error codes, where the request may not make it to the backend, or where the backend completes successfully but for some reason the user still sees an error. Then, by adopting the existing prefix, we can ensure that errors from all levels end up together in at least one place.