sinkingpoint / prometheus-gravel-gateway

A Prometheus Aggregation Gateway for FAAS applications
GNU Lesser General Public License v3.0

Use per-job aggregators #28

Closed gagbo closed 1 year ago

gagbo commented 1 year ago

Hello,

Currently, there is a single metrics aggregator for the whole gateway process. That means that if two different jobs push metrics to the same gateway and both want to use a metric with the same name (let's say "http_requests"), there's a problem:

When that happens, whichever of Job 1 or Job 2 pushes last is never able to push its metrics, because its label set differs from the one already registered for the single metric family http_requests.
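
To make the conflict concrete (illustrative pushes, not actual gateway output), Job 1 pushes

http_requests{code="500"} 2

while Job 2 pushes

http_requests{status_code="500"} 1

and whichever arrives second gets rejected, because code and status_code are different label names for the same metric family.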

It would be nice if aggregation only errored on "non-matching labels" when the conflicting series belong to the same job.

This way the gateway could receive series from multiple jobs, and report a time series for each job when Prometheus scrapes it later:

http_requests{job="one", code="500"} 2
http_requests{job="two", status_code="500"} 1

I think this could be implemented with a HashMap of Aggregators keyed by job name, instead of a single one. And I suppose that answering a scrape would then come down to merging all the existing aggregators together. It's also somewhat related to #8, since we could have a less destructive endpoint if we could wipe only a single job's aggregator.
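
A very rough sketch of that shape, just to show the HashMap-keyed-by-job routing and the merge-on-scrape idea. The Aggregator here is a placeholder that only stores raw exposition bodies; the gateway's real Aggregator type, its parsing/merging, and how the job label actually gets attached are all elided:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Placeholder for the gateway's real Aggregator: it just keeps raw exposition
// bodies so the per-job routing is visible. The real aggregator parses the
// series and sums/merges them.
#[derive(Default)]
struct Aggregator {
    bodies: Vec<String>,
}

impl Aggregator {
    fn push(&mut self, body: &str) {
        self.bodies.push(body.to_owned());
    }

    fn render(&self) -> String {
        self.bodies.join("\n")
    }
}

// One aggregator per job, keyed by job name.
#[derive(Default)]
struct PerJobAggregators {
    jobs: Mutex<HashMap<String, Aggregator>>,
}

impl PerJobAggregators {
    // A push only needs to be label-consistent with earlier pushes to the
    // *same* job's aggregator, so two jobs can disagree on label names.
    fn push(&self, job: &str, body: &str) {
        let mut jobs = self.jobs.lock().unwrap();
        jobs.entry(job.to_owned()).or_default().push(body);
    }

    // Answering a scrape merges the rendered output of every job's aggregator.
    fn scrape(&self) -> String {
        let jobs = self.jobs.lock().unwrap();
        jobs.values()
            .map(|a| a.render())
            .collect::<Vec<_>>()
            .join("\n")
    }
}

fn main() {
    let gateway = PerJobAggregators::default();
    // The job label is written inline in the pushed bodies here only so the
    // merged scrape output matches the example above.
    gateway.push("one", r#"http_requests{job="one", code="500"} 2"#);
    gateway.push("two", r#"http_requests{job="two", status_code="500"} 1"#);
    println!("{}", gateway.scrape());
}
```

This shape would also make the per-job wipe from #8 cheap: it's just removing one entry from the map.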

sinkingpoint commented 1 year ago

Generally the Prometheus model here is to namespace your metrics so that no two jobs share a metric name, e.g. you wouldn't have http_requests from two different services, but service1_http_requests_total and service2_http_requests_total instead. That saves a lot of headaches on the Prometheus side as well, because you don't have to worry about alerts being poisoned by a new service treading on the metrics of an old one.
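
For illustration, with the Rust prometheus client crate each service just registers a counter under its own namespaced name (both are shown in one process here purely to keep the sketch self-contained; in practice they'd live in separate services):

```rust
use prometheus::{register_counter, Encoder, TextEncoder};

fn main() {
    // Each service owns a counter under its own namespaced name, so the
    // metric families never collide at the gateway or in Prometheus.
    let service1_requests = register_counter!(
        "service1_http_requests_total",
        "HTTP requests served by service1"
    )
    .unwrap();
    let service2_requests = register_counter!(
        "service2_http_requests_total",
        "HTTP requests served by service2"
    )
    .unwrap();

    service1_requests.inc();
    service2_requests.inc();

    // Dump the default registry in the text exposition format to show the
    // two distinct families.
    let mut buf = Vec::new();
    TextEncoder::new()
        .encode(&prometheus::gather(), &mut buf)
        .unwrap();
    println!("{}", String::from_utf8(buf).unwrap());
}
```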

That being said, a per-job delete endpoint does make sense in #8.