tpitale / mail_room

Forward mail from gmail IMAP to a callback URL or job worker, simply.
MIT License

Add a Prometheus metrics endpoint #99

Open jtslear opened 5 years ago

jtslear commented 5 years ago

Currently it's difficult to view the performance of the mail_room gem. If the gem emitted metrics, we could start measuring performance and other related data in order to alert engineers when things go wrong. An endpoint would also be useful for monitoring the health of the service in general, whether through a Docker daemon or a Kubernetes health check.

Also consider including metrics for the mailbox that mail_room is configured to watch. This would remove the need to stand up a secondary system such as https://github.com/camptocamp/imap-mailbox-exporter to monitor the mailbox.

tpitale commented 5 years ago

What do you mean by “performance”? What metrics would you ask be included?

Output of stats to a log file was recently added to introspect certain behaviors. This may provide the information you’re looking for.

tpitale commented 5 years ago

https://github.com/tpitale/mail_room/blob/master/README.md#logging

jtslear commented 5 years ago

By performance I mean the functioning of mail_room in general. Here is an initial proposal:

Labels for each of these would probably include some identifier so that we know which inbox was being processed when mail_room is monitoring multiple inboxes.

The count of processed emails should also have labels to indicate which delivery mechanism was used, in case more than one is configured.

Using counts in the proposal above lets us query them and chart the rate at which actions complete.

What matters most to us is having metrics for the actions that mail_room takes. I propose Prometheus because we already use it heavily in our infrastructure.
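To make the idea of labelled counts concrete, here is a minimal sketch in plain Ruby of a counter keyed by label set and rendered in the Prometheus text exposition format. Nothing here exists in mail_room: the `LabeledCounter` class, the metric name `mail_room_processed_emails_total`, and the `mailbox` and `delivery` label names are all assumptions chosen for illustration.

```ruby
# Hypothetical sketch, not part of mail_room: a thread-safe counter whose
# series are distinguished by labels (for example mailbox and delivery method).
class LabeledCounter
  def initialize(name, help)
    @name = name
    @help = help
    @values = Hash.new(0)
    @lock = Mutex.new
  end

  # Add one to the series identified by this label set.
  # Labels are sorted so that key order does not create separate series.
  def increment(labels = {})
    @lock.synchronize { @values[labels.sort.to_h] += 1 }
  end

  # Current value for a label set (0 if never incremented).
  def get(labels = {})
    @lock.synchronize { @values[labels.sort.to_h] }
  end

  # Render all series as Prometheus text exposition lines.
  # Assumes label values contain no quotes, backslashes, or newlines.
  def to_text
    lines = ["# HELP #{@name} #{@help}", "# TYPE #{@name} counter"]
    @lock.synchronize do
      @values.each do |labels, value|
        rendered = labels.map { |key, val| %(#{key}="#{val}") }.join(",")
        lines << "#{@name}{#{rendered}} #{value}"
      end
    end
    lines.join("\n") + "\n"
  end
end

# Illustrative usage with made-up mailbox and delivery names.
processed = LabeledCounter.new(
  "mail_room_processed_emails_total",
  "Emails handed to a delivery method"
)
processed.increment(mailbox: "support", delivery: "sidekiq")
processed.increment(mailbox: "billing", delivery: "postback")
```

Because the values only ever increase, a rate over any time window could be derived at query time rather than being computed inside the process.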

tpitale commented 5 years ago

For the purpose of alerting engineers, you could consume the JSON logs that were added. That should give you a decent amount of information, and perhaps let you aggregate them to work out the rate of processing.
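As a rough sketch of that aggregation, the following Ruby reads JSON lines and counts entries per mailbox and action. The field names `context`, `name`, and `action` are assumptions about the shape of the log entries, and the sample lines are invented; check them against the real log output before relying on this.

```ruby
require "json"

# Hypothetical sketch: count JSON log entries per [mailbox, action] pair.
# The "context", "name", and "action" keys are assumed, not confirmed.
def count_actions(lines)
  counts = Hash.new(0)
  lines.each do |line|
    begin
      entry = JSON.parse(line)
    rescue JSON::ParserError
      next # skip lines that are not valid JSON
    end
    next unless entry.is_a?(Hash)

    context = entry["context"]
    mailbox = (context.is_a?(Hash) && context["name"]) || "unknown"
    action = entry["action"] || "unknown"
    counts[[mailbox, action]] += 1
  end
  counts
end

# Average events per minute over a window of the given length in seconds.
def rate_per_minute(count, window_seconds)
  count * 60.0 / window_seconds
end

# Invented sample input standing in for a real log file.
sample = [
  '{"context":{"name":"support"},"action":"delivered"}',
  '{"context":{"name":"support"},"action":"delivered"}',
  '{"context":{"name":"billing"},"action":"idle"}'
]
totals = count_actions(sample)
```

In practice the lines could come from `File.foreach` on the log file, with the counts divided by the covered time window to approximate a processing rate.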

tpitale commented 5 years ago

I'm willing to consider adding the ability for mail_room to emit events (à la https://github.com/beam-telemetry/telemetry) internally. I would then be able to accept a PR to add the interface to something like Prometheus.

I don't know that there will be events for all of the things you listed; I'll have to look more closely.
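One way internal events of this kind could look is sketched below: handlers attach to a named event, and the emitting code passes measurements and metadata to each of them. This is loosely modelled on the attach/execute pattern of the linked telemetry library, but the `Events` module, its method names, and the event name are hypothetical and are not the API of mail_room or of any existing gem.

```ruby
# Hypothetical sketch of an in-process event bus; not an existing API.
module Events
  @handlers = Hash.new { |hash, key| hash[key] = [] }
  @lock = Mutex.new

  class << self
    # Register a block to be called whenever `event` is emitted.
    def attach(event, &handler)
      raise ArgumentError, "handler block required" unless handler
      @lock.synchronize { @handlers[event] << handler }
      handler
    end

    # Call every handler attached to `event`. A failing handler is reported
    # on stderr and does not stop the remaining handlers or the caller.
    def emit(event, measurements = {}, metadata = {})
      handlers = @lock.synchronize { @handlers.fetch(event, []).dup }
      handlers.each do |handler|
        begin
          handler.call(event, measurements, metadata)
        rescue StandardError => e
          warn "event handler failed for #{event.inspect}: #{e.message}"
        end
      end
      nil
    end
  end
end

# Illustrative usage: a metrics adapter subscribes without the emitting
# code knowing anything about Prometheus.
delivered = Hash.new(0)
Events.attach([:mail_room, :message, :delivered]) do |_event, measurements, metadata|
  delivered[metadata[:mailbox]] += measurements.fetch(:count, 1)
end
Events.emit([:mail_room, :message, :delivered], { count: 1 }, { mailbox: "support" })
```

The appeal of this split is that the core only emits events, while a Prometheus exporter or any other backend lives in a separate handler that can be contributed independently.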

tpitale commented 4 years ago

Related. I created this: https://github.com/tpitale/telemetry-ruby