aptible / supercronic

Cron for containers
MIT License

Initialize prometheus counters with 0 #83

Open azhelev opened 3 years ago

azhelev commented 3 years ago

Hello,

I'm trying to create an alert based on supercronic's Prometheus metrics that notifies me when a job execution fails. If there is exactly one failed execution of a cron job, the alert doesn't fire. The reason seems to be that the supercronic_failed_executions counter isn't initialized when the process starts, so the metric isn't available in Prometheus until at least one failure has occurred. When the first failure happens, I cannot detect a change because there is nothing to compare against: the metric has a single value. With at least 2 failures the alert fires, but I would really like to know about the first failure too. Initializing all counters with 0 would solve this. See also the first part of this blog post: https://blog.doit-intl.com/making-peace-with-prometheus-rate-43a3ea75c4cf
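To illustrate the reasoning above, here is a minimal Python sketch of how a "current value minus value 10 minutes ago" expression behaves when the series only appears on the first failure. It is a simplified model, not Prometheus itself: the 5-minute scrape interval, the sample lists, and the helper name diff_with_offset are all assumptions made for illustration, and staleness handling is ignored.

```python
from typing import List, Optional

OFFSET_STEPS = 2  # assumption: 10m offset at a hypothetical 5m scrape interval


def diff_with_offset(samples: List[Optional[int]], i: int) -> Optional[int]:
    """Model `m - m offset 10m` at scrape index i; None means "no result"."""
    if i < OFFSET_STEPS:
        return None
    now, past = samples[i], samples[i - OFFSET_STEPS]
    # A binary operator only yields a result when both sides have a sample.
    if now is None or past is None:
        return None
    return now - past


# One failure happens just before scrape index 3 (hypothetical data).
uninitialized = [None, None, None, 1, 1, 1, 1]  # series appears on first failure
initialized = [0, 0, 0, 1, 1, 1, 1]             # counter exported as 0 from startup

without_init = [diff_with_offset(uninitialized, i) for i in range(len(uninitialized))]
with_init = [diff_with_offset(initialized, i) for i in range(len(initialized))]

print(without_init)  # the difference is never positive: the failure is invisible
print(with_init)     # the difference is 1 for two scrapes after the failure
```

In this model the uninitialized series never produces a positive difference, because by the time an older sample exists to compare against, it already carries the same value. With a 0 sample present from startup, the first failure shows up as a difference of 1.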

My alert rule is supercronic_failed_executions - supercronic_failed_executions offset 10m > 1. Something like supercronic_failed_executions > 1 would work, but it isn't useful because such an alert keeps firing until the process is restarted, even if there are successful executions afterwards.
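The difference between the two rules can be sketched in Python as follows. This is a simplified model under stated assumptions (a hypothetical 5-minute scrape interval, made-up sample values, and illustrative helper names), not a reproduction of how Prometheus evaluates alerts.

```python
from typing import List

THRESHOLD = 1     # the "> 1" from both rules
OFFSET_STEPS = 2  # assumption: 10m offset at a hypothetical 5m scrape interval


def raw_threshold_firing(samples: List[int]) -> List[bool]:
    """Model `m > 1`, evaluated at every scrape."""
    return [value > THRESHOLD for value in samples]


def offset_diff_firing(samples: List[int]) -> List[bool]:
    """Model `m - m offset 10m > 1`, evaluated at every scrape."""
    return [
        i >= OFFSET_STEPS and samples[i] - samples[i - OFFSET_STEPS] > THRESHOLD
        for i in range(len(samples))
    ]


# Hypothetical counter: two failures before scrape index 2, only successes after.
samples = [0, 0, 2, 2, 2, 2, 2]

raw = raw_threshold_firing(samples)
windowed = offset_diff_firing(samples)

print(raw)       # stays True for as long as the process keeps running
print(windowed)  # True only while the failures fall inside the 10m window
```

Because a counter never decreases while the process is alive, the raw comparison cannot clear on its own, whereas the offset-based rule clears once the failures are older than the window.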

azhelev commented 3 years ago

Forgot to mention that I'm using supercronic 0.1.11.