metrics-rs / metrics

A metrics ecosystem for Rust.
MIT License
1.1k stars 151 forks source link

prometheus exporter: idle metrics not re-exported when becoming active again #372

Open dezyh opened 1 year ago

dezyh commented 1 year ago

Issue

During periods of inactivity, after the idle_timeout period, metrics are correctly no longer exported. However, if these same metrics become active again, they are not re-exported.

This is problematic for my use case since my server provides long-lived streams with periods of activity lasting 15 minutes and then long periods of inactivity (>90 minutes). Additionally, even without supporting long-lived streams, clients can connect anytime before data is available, which would trigger the same idle timeout -> not re-exported issue.

Example

Currently I'm constructing and installing a prometheus push gateway exporter (with VictoriaMetrics instead of Prometheus):

PrometheusBuilder::new()
    .with_push_gateway(addr, Duration::from_secs(1))
    .expect("invalid push gateway endpoint")
    .idle_timeout(
        MetricKindMask::COUNTER | MetricKindMask::HISTOGRAM | MetricKindMask::GAUGE,
        Some(Duration::from_secs(30 * 60)), // 30 minutes
    )
    .install()
    .expect("failed to install prometheus exporter");

Timeline

Crate Versions

I'm currently using the following (which are a tiny bit out of date if that matters):

metrics = "0.20.1"
metrics-exporter-prometheus = { version = "0.11.0", features = ["push-gateway"] }
metrics-util = "0.14.0"
tobz commented 1 year ago

@dezyh Hmmm, interesting.

This sounds like it's potentially related to #314.

Does your code only use the emission macros directly (i.e. only ever using counter!(...), histogram!(...), etc) or do you register metrics and interact with the handle types (Counter/Gauge/Histogram) themselves?

dezyh commented 1 year ago

I only register metrics and interact with the handle types. Let me do some testing and build some reproduction cases and I'll let you know if I find anything interesting.

tobz commented 1 year ago

If you're using the handle types directly, then the linked issue is definitively the cause.

dezyh commented 9 months ago

I just saw that https://github.com/metrics-rs/metrics/pull/394 was merged which would mean there's no work-around for this issue anyone.

Just to clarify from your last message, here is how we we're currently using metrics. I believe this is "using the handle types directly".

use metrics::register_counter;

let messages_received = register_counter!("messages_received", "stream_id" => id);

messages_received.increment(1);
tobz commented 9 months ago

Just to clarify: your workaround is still possible. Instead of register_counter!, you'd now just call counter!.

On Sun, Nov 26, 2023, 10:38 PM Ben Mitchell @.***> wrote:

I just saw that #394 https://github.com/metrics-rs/metrics/pull/394 was merged which would mean there's no work-around for this issue anyone.

Just to clarify from your last message, here is how we were using the handle types before.

let sm = StreamMetrics { messages_received: register_counter!("messages_received", "stream" => id);};

sm.messages_received.increment(1);

— Reply to this email directly, view it on GitHub https://github.com/metrics-rs/metrics/issues/372#issuecomment-1827077366, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABWLF6ASSYRKCVJNV3SES3YGQDLZAVCNFSM6AAAAAAYOSBLAOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMRXGA3TOMZWGY . You are receiving this because you commented.Message ID: @.***>

dezyh commented 9 months ago

The not re-exported issue occurs when using the handle types (register_counter!). The solution was therefore to use the emission macros (counter!) instead.

I believe this is no longer possible as #394 because the current emission macro (counter!) is being removed and replaced with the macro which returns the handle type (register_counter! -> counter!).

I might be confused on the recent changes or about what is a handler type vs emission macro, so maybe best to wait until the next release that includes #394 and I'll try it out then and let you know.

_(My previous message was just clarifying that register_counter! was the handler type mentioned previously. Sorry if there was any confusion)_

tobz commented 9 months ago

I understand the confusion now.

Under the hood, a metric always has to be registered before it can be used, whether you're holding on to it long-term (such as keeping it as a field in a struct) or just immediately performing the operation (increment, etc) and then moving on.

When calling register_*!, you get back the return value from Recorder::register_counter, so in that case, the owned Counter handle type representing your counter. When calling counter!, we still register the counter... we just use the returned handle type as a temporary to perform the operation, and then it's dropped.

Now, circling back to my comment prior, the change in #394 just means that let counter = counter!("name"); is the new way to do what you previously would have called let counter = register_counter!("name"); to do. For emission, you call the operation method inline, so instead of counter!("name", 42);, you would just do counter!("name").increment(42);.