caas-team / sparrow

A monitoring tool to gather infrastructure network information
Apache License 2.0

Bug: Removed targets remain as prometheus metrics #77

Closed by niklastreml 2 weeks ago

niklastreml commented 8 months ago

Is there an existing issue for this?

Current Behavior

Sparrow starts with two targets:

https://google.com
https://example.com

This creates two Prometheus metrics for those targets:

sparrow_latency_duration_seconds{status="200",target="https://google.com"} 0.014013145
sparrow_latency_duration_seconds{status="200",target="https://example.com"} 0.003617327
...

If the config is updated and one target is removed, the last Prometheus metric recorded for the deleted target is still exposed as if it were current. New targets:

https://example.com

Prometheus metrics:

sparrow_latency_duration_seconds{status="200",target="https://google.com"} 0.014013145 <- This remains and is not cleaned up
sparrow_latency_duration_seconds{status="200",target="https://example.com"} 0.007915321

Expected Behavior

Let's discuss how (or whether) we should handle this case.

Maybe it's possible to remove metrics from the Prometheus registry.

Steps To Reproduce

  1. Deploy sparrow with two targets for any check, using the HTTP loader
  2. Let it run for a check cycle
  3. Remove one target from the config
  4. HTTP GET /metrics

Relevant logs and/or screenshots, environment information, etc.

The screenshot shows what happened after google.com was removed from the config, compared to a target that is still active.

image

Since the sparrow metrics API still exposes the value for google.com, Prometheus stores it on every scrape.

Who can address the issue?

@puffitos @y-eight @lvlcn-t Let's discuss how we handle this next week

Anything else?

No response

lvlcn-t commented 7 months ago

This can be done with a simple (*MetricVec).Delete call in the SetConfig method of each check.
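A rough sketch of that idea, using only the standard library so it stays self-contained. Everything here is hypothetical and not sparrow's actual code: the `check` type, its `SetConfig` signature, and the `deleteSeries` hook are stand-ins. In a real check the hook would presumably wrap a call on the metric vector, for example `DeletePartialMatch(prometheus.Labels{"target": t})` from client_golang, which removes every series for a target regardless of its `status` label; a plain `Delete` needs the complete label set.

```go
package main

import "sort"

// removedTargets returns the targets present in oldTargets but missing from
// newTargets, de-duplicated and sorted for deterministic output.
func removedTargets(oldTargets, newTargets []string) []string {
	keep := make(map[string]struct{}, len(newTargets))
	for _, t := range newTargets {
		keep[t] = struct{}{}
	}

	seen := make(map[string]struct{})
	var removed []string
	for _, t := range oldTargets {
		if _, ok := keep[t]; ok {
			continue // still configured
		}
		if _, dup := seen[t]; dup {
			continue // already recorded
		}
		seen[t] = struct{}{}
		removed = append(removed, t)
	}
	sort.Strings(removed)
	return removed
}

// check is a hypothetical stand-in for a sparrow check.
type check struct {
	targets []string
	// deleteSeries is a hypothetical hook that drops all metric series for
	// one target, e.g. by calling DeletePartialMatch on each metric vector.
	deleteSeries func(target string)
}

// SetConfig drops the series of targets that disappeared from the config,
// then stores a copy of the new target list.
func (c *check) SetConfig(newTargets []string) {
	for _, t := range removedTargets(c.targets, newTargets) {
		c.deleteSeries(t)
	}
	c.targets = append([]string(nil), newTargets...)
}
```

Computing the difference before overwriting the stored targets is the important part: once the old list is gone, the check no longer knows which label values it has to clean up.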