PromPHP / prometheus_client_php

Prometheus instrumentation library for PHP applications
https://prometheus.io/docs/concepts/metric_types/
Apache License 2.0
415 stars 91 forks source link

How can I delete / remove / un-register a specific series? #146

Open consolibyte opened 7 months ago

consolibyte commented 7 months ago

We are using this lib to push a metric called task_up that has a number of labels:

# HELP chargeover_platform_task_up Whether or not this scheduled task is working correctly (1 for yes, 0 for no)
# TYPE chargeover_platform_task_up gauge
platform_task_up{task="task1",device_id="1"} 1
platform_task_up{task="task1",device_id="2"} 0
platform_task_up{task="task1",device_id="3"} 1

This is working really great, and has already proven successful in letting us closely monitor the status of these tasks.

The problem we're running into, is that occasionally a device may fail, and be replaced, and this is expected. It's going to happen.

The issue though is that e.g. if {device_id="2"} fails and is removed and replaced by {device_id="4"}, the {device_id="2"} task metric is forever "stuck" at 0, and continues to trigger alerting. That time series is essentially "cached" in Redis forever.

In an ideal world, we could remove that series from the metrics, so that series doesn't get shown on the next scrape, gets considered stale, and no alert is fired.

Is there any planned support for this? (If not, I could create a pull request if you might be able to provide any guidance on the data structure in Redis so I can find and remove the metric?).

LKaemmerling commented 7 months ago

Hey @consolibyte,

you could use the wipeStorage method, which would remove all entries. At the moment there is no support for deleting only a specific metric.

pluk77 commented 7 months ago

I am in need of this same feature as well (specifically for Redis). Once I found a solution I will report back. The Storage\Redis::wipeStorage method seems to loop through the keys using a wildcard, so my first port of call will be to investigate if I can re-use the code with a specific key instead of the wildcard.

pluk77 commented 7 months ago

I have managed to find a solution to wipe a specific metric that worked for me. Feel free to turn it into a PR if it works for you too (other storage besides Redis still has to be done).

I copied the wipeStorage() (in Storage/RedisNG and Storage/Redis) and modified it slightly:

    public function wipeKey(string $type, string $key)
    {
        $this->ensureOpenConnection();

        $searchPattern = "";

        $globalPrefix = $this->redis->getOption(\Redis::OPT_PREFIX);
        // @phpstan-ignore-next-line false positive, phpstan thinks getOptions returns int
        if (is_string($globalPrefix)) {
            $searchPattern .= $globalPrefix;
        }

        $searchPattern .= self::$prefix;
        $searchPattern .= ":" . $type . ":" . $key;

        $this->redis->eval(
            <<<LUA
redis.replicate_commands()
local cursor = "0"
repeat
    local results = redis.call('SCAN', cursor, 'MATCH', ARGV[1])
    cursor = results[1]
    for _, key in ipairs(results[2]) do
        redis.call('DEL', key)
    end
until cursor == "0"
LUA
            ,
            [$searchPattern],
            0
        );
    }

and I added this method to CollectorRegistry

    public function wipeKey(string $type, string $name): void
    {
        $this->storageAdapter->wipeKey($type, $name);
    }

In your code, you can delete a gauge before populating it with fresh data:

$gauge = $this->collectorRegistry->getOrRegisterGauge(............);
$this->collectorRegistry->wipeKey($gauge->getType(), $gauge->getName());
$gauge->set(.........);
pluk77 commented 7 months ago

Other option would be to add a method to each Collector, which would call the emptyKey() method on Storage.

Something like this:


class Gauge extends Collector
{
   public function empty(): void
    {
        $this->storageAdapter->emptyKey(self::TYPE, $this->name);
    }
}
LKaemmerling commented 7 months ago

@pluk77 feel free to open up a MR with your changes :) I would prefer to keep it on the Storage "only".

pluk77 commented 5 months ago

I have submitted #150 but it only implements the function in Redis. It would be great if someone else could implement the others.

nmoskovkin commented 1 week ago

pluk77 Will this code work also for other types, for example histogram, or summary?

pluk77 commented 1 week ago

As far as I know it does.