fluent / fluent-plugin-prometheus

A fluent plugin that collects metrics and exposes for Prometheus.
Apache License 2.0
257 stars 80 forks source link

`output_status_retry_wait` means the elapsed time from the first retry #217

Open daipom opened 1 year ago

daipom commented 1 year ago

The document says output_status_retry_wait means current retry_wait computed from last retry time and next retry time.

https://github.com/fluent/fluent-plugin-prometheus/blob/41fa2df366ceef7b46de154859c852926cba48a9/README.md#L122-L123

However, it actually means the elapsed time from the first retry. We need to fix the value or the document.

https://github.com/fluent/fluent-plugin-prometheus/blob/41fa2df366ceef7b46de154859c852926cba48a9/lib/fluent/plugin/in_prometheus_output_monitor.rb#L194-L207

It is calculated as next_time - start. start is the time of the first retry, so this value means the elapsed time from the first retry.

raulgupto commented 1 year ago

thanks for your effort @daipom. This fix/clarification is important imo as many dashboard metrics depend on prometheus and can cause noise if someone interprets this in a wrong way.

raulgupto commented 1 year ago

thanks for your effort @daipom. This fix/clarification is important imo as many dashboard metrics depend on prometheus and can cause noise if someone interprets this in a wrong way.