fstab / grok_exporter

Export Prometheus metrics from arbitrary unstructured log data.
Apache License 2.0

Always report counter metric (initialize to zero) #186

Closed tgriffitts-ns closed 1 year ago

tgriffitts-ns commented 1 year ago

Hey guys. Thank you so much for a basic, lightweight log-to-metric facility. I think this is exactly what we need for reporting error-count metrics from syslog.

We simply wish to add alert expressions that fire when errors show up in syslog.

Here's my issue. We have a grok_exporter counter (log_syslog_conntrack_error_total) configured to count syslog lines that match an error string, which mostly works fine for our purposes, along with a PromQL alert expression, e.g.: increase(log_syslog_conntrack_error_total[3h]) >= 1

The intention is to alert if a conntrack error has been seen in the last 3 hours.

The problem I've found is that grok_exporter only begins to report a metric once a log line has matched it for the first time. Thus I have no metric at all while there are no errors. Once an error shows up, I get the expected:

log_syslog_conntrack_error_total 1

But this doesn't work for our use case. When the metric first shows up, increase(log_syslog_conntrack_error_total[3h]) is still 0 because no 0 sample was reported beforehand, i.e., the first error does not produce an 'increase'. When the counter changes from 1 -> 2, everything works just fine, but not from null -> 1.
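To make the null -> 1 case concrete, here is a minimal sketch in Python. It is a deliberately simplified stand-in for PromQL's increase(), not Prometheus's actual implementation (which also extrapolates to the window boundaries); the function name simple_increase and the sample lists are hypothetical.

```python
from typing import Optional, Sequence


def simple_increase(samples: Sequence[float]) -> Optional[float]:
    """Simplified model of increase(): sum the deltas between consecutive
    samples in the window. No extrapolation is done (assumption: this is
    NOT how Prometheus computes the final value, only the core idea)."""
    if len(samples) < 2:
        # With fewer than two samples in the window there is nothing to compare.
        return None
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        # A drop in value is treated as a counter reset: count from zero again.
        total += cur - prev if cur >= prev else cur
        prev = cur
    return total


# Series does not exist until the first error; every scrape after that sees 1.
late_start = [1, 1, 1]
# Series is exported as 0 from startup, then the first error arrives.
zero_init = [0, 0, 1, 1]
# Series already exists at 1 and a second error arrives.
second_error = [1, 1, 2]

print(simple_increase(late_start))    # first error is invisible
print(simple_increase(zero_init))     # first error is counted
print(simple_increase(second_error))  # second error is counted
```

Only the series that started at 0 lets the first error register as an increase, which is why an `>= 1` alert never fires in the late-start case.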

Is there a grok_exporter metric configuration flag I can set on my counter to tell grok_exporter to always report the metric, even if no match has yet been seen? e.g.,

log_syslog_conntrack_error_total 0

Or do you have a suggestion to better solve my problem with grok_exporter? Thank you again for making your work available to us and for any advice!

tgriffitts-ns commented 1 year ago

Digging through the source code, I have discovered that metrics with labels are treated differently from metrics without labels.

Metrics without labels each have a single counter, which is initialized to 0 and always reported. This is the behavior I requested above.

Metrics with labels have a vector of counters, one for each unique label combination. Each time a log line matches the metric expression, the label expression is evaluated and the metric vector is queried to see whether a counter already exists for that label combination. If not, one is created and initialized to 0. Then the counter for that label combination is incremented by the value derived from the log line (or by 1 if no value expression is declared for the counter metric). This vector of counters makes sense: how could you know all possible future label combinations, initialize them to 0, and report them immediately? You can't, and thus no metric is reported until a log line matches and the labels are derived from that log line. Fine.
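The difference described above can be sketched with a small Python model. This is not grok_exporter's code (which is written in Go); the class names PlainCounter and CounterVec, the label name path, and the output format are assumptions used only to illustrate the behavior.

```python
from typing import Dict, List, Sequence, Tuple


class PlainCounter:
    """Unlabeled metric: one counter that exists (at 0) from the start."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        self.value += amount

    def expose(self) -> List[str]:
        return [f"{self.name} {self.value:g}"]


class CounterVec:
    """Labeled metric: one counter per label combination, created lazily."""

    def __init__(self, name: str, label_names: Sequence[str]) -> None:
        self.name = name
        self.label_names = tuple(label_names)
        self.children: Dict[Tuple[str, ...], float] = {}

    def inc(self, label_values: Sequence[str], amount: float = 1.0) -> None:
        key = tuple(label_values)
        # First time this label combination is seen: create it at 0.
        self.children.setdefault(key, 0.0)
        self.children[key] += amount

    def expose(self) -> List[str]:
        lines = []
        for key, value in self.children.items():
            labels = ",".join(f'{n}="{v}"' for n, v in zip(self.label_names, key))
            lines.append(f"{self.name}{{{labels}}} {value:g}")
        return lines


plain = PlainCounter("log_syslog_conntrack_error_total")
vec = CounterVec("log_syslog_conntrack_error_total", ["path"])

print(plain.expose())  # reported as 0 before any match
print(vec.expose())    # nothing to report before any match

vec.inc(["/var/log/syslog"])  # a matching line arrives; labels now known
print(vec.expose())
```

Before any log line matches, the unlabeled counter already exposes a 0 sample, while the labeled vector exposes nothing because no label combination is known yet.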

I had a default configuration for grok_exporter which added the log file path as a label to all of our metrics. This caused them all to have the "labeled metric" behavior. Since my metrics do not derive their labels from the matching log line, I expected the behavior of metrics without labels, because my labels never change. But the metric still has a label, and that label is generated dynamically (from the path of the tailed log file in which the metric expression matched). Fine. I don't expect grok_exporter to work out whether a label expression will ever produce different values. In fact, it could produce different values if we had more than one input path and the expression matched lines from 2 different input paths. Anyway, all is good. I have removed the unnecessary label.

Case closed. My apologies for not initially understanding.

Thanks for the great tool!