Vonng / pg_exporter

Advanced PostgreSQL & Pgbouncer Metrics Exporter for Prometheus
https://pigsty.io
Apache License 2.0

Selective scrape via query parameters #42

Open ringerc opened 1 month ago

ringerc commented 1 month ago

I'm looking at adopting pg_exporter as a potential replacement for the CloudNative-PG built-in scraper I currently use.

CNPG doesn't support selective scraping either, but I was considering implementing it there and wonder whether I should do so for pg_exporter instead.

Do you have any advice on how I might go about adding support for scraping selected subsets of metrics, specified by tag-set, with query parameters?

I want to scrape different subsets at different frequencies without having to deploy multiple instances of the scraper. This can't be done with command-line or env-var level tags.

E.g. I want to scrape /metrics?tags=expensive_metrics,wide_cardinality_metrics every 5min, and /metrics?tags=cheap_narrow_metrics every 30s.

The scraper's caching feature makes this a less pressing need than it is in other scrapers, but since it already has tagging I'm wondering if it might be fairly easy to add. I can't immediately see how to access the query params from/within promhttp in order to apply them selectively though. Do you know if this might be possible, and have any ideas about how? If so, I can see if I can cook up a patch if my initial PoC with this scraper works out.

(See related https://github.com/Vonng/pg_exporter/issues/41, #43 )

ringerc commented 1 month ago

Looks like sql_exporter has a variant of this in https://github.com/burningalchemist/sql_exporter/pull/433 (https://github.com/burningalchemist/sql_exporter/issues/178) with the jobs[] parameter.

Query params are accessible as e.g. req.URL.Query()["jobs[]"] on the http.Request seen by a handler wrapping promhttp, so filtering by tags looks entirely feasible.
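For concreteness, here's a minimal sketch (not pg_exporter code) of a wrapper around the stock promhttp handler that reads a hypothetical tags parameter before delegating:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	inner := promhttp.Handler() // the usual instrumented handler for the default registry
	http.Handle("/metrics", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// "tags" is a hypothetical parameter name; the wrapper sees the query
		// string before any scraping happens.
		tags := r.URL.Query()["tags"] // e.g. /metrics?tags=cheap_narrow_metrics
		_ = tags                      // TODO: restrict the scrape to collectors matching these tags
		inner.ServeHTTP(w, r)
	}))
	http.ListenAndServe(":9630", nil)
}
```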

Vonng commented 1 month ago

Currently, I'm not sure how to add Param support to the /metrics endpoint.

However, I think it would be fine to use a different endpoint, such as /inspect. We could implement it similarly to the health-check functions, providing an inspection interface over the internal data structures of the Exporter/Server/Collector. That way we could retrieve a subset of the metrics.
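A rough sketch of what that could look like, using a hypothetical collectorInfo type in place of the exporter's real Exporter/Server/Collector structures:

```go
package main

import (
	"encoding/json"
	"net/http"
)

// collectorInfo is a hypothetical stand-in for the exporter's internal
// collector metadata (name, tags, and so on).
type collectorInfo struct {
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// registerInspect exposes the collector metadata as JSON on /inspect,
// alongside the existing /metrics endpoint.
func registerInspect(collectors []collectorInfo) {
	http.HandleFunc("/inspect", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(collectors)
	})
}

func main() {
	registerInspect([]collectorInfo{{Name: "example_collector", Tags: []string{"cheap"}}})
	http.ListenAndServe(":9630", nil)
}
```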

If you're interested in adding any implementation, I'd be more than happy to collaborate.

ringerc commented 4 weeks ago

Thanks. I aim to do just that.

ringerc commented 4 weeks ago

Looks like managing multiple registries is the preferred way to handle grouping: https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#Registry. Registries can also be used for things like parallel scraping from multiple goroutines, if required.
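For example, a sketch with one registry per group, and plain counters standing in for pg_exporter's real collectors; each group simply gets its own path here, with the ?groups= variant discussed below:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// One Registry per scrape group.
	cheapReg := prometheus.NewRegistry()
	expensiveReg := prometheus.NewRegistry()

	// Plain counters stand in for the real collectors of each group.
	cheapReg.MustRegister(prometheus.NewCounter(
		prometheus.CounterOpts{Name: "cheap_demo_total", Help: "stand-in for a cheap metric"}))
	expensiveReg.MustRegister(prometheus.NewCounter(
		prometheus.CounterOpts{Name: "expensive_demo_total", Help: "stand-in for an expensive metric"}))

	// Each registry gets its own handler, so each group can be scraped on its own schedule.
	http.Handle("/metrics/cheap", promhttp.HandlerFor(cheapReg, promhttp.HandlerOpts{}))
	http.Handle("/metrics/expensive", promhttp.HandlerFor(expensiveReg, promhttp.HandlerOpts{}))
	http.ListenAndServe(":9630", nil)
}
```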

To select the group to return, a custom http.Handler for /metrics can be registered; it can delegate to a handler created with promhttp.HandlerFor for the chosen Registry, wrapped with promhttp.InstrumentMetricHandler.

The downside is that the registries should probably be pre-created, so filtering by arbitrary tags wouldn't work so well. I'd suggest adding group:somename tags instead: any metric with a group: tag would not register on the default Registry, but would go into its own registry for that group. During scrape, ?groups= could be specified to scrape the non-default registries.
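A sketch of that dispatch, assuming the per-group registries are pre-created at startup; the groups parameter and the helper names are hypothetical:

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// buildGroupHandlers pre-creates one handler per registry: promhttp.HandlerFor
// for the registry, wrapped with promhttp.InstrumentMetricHandler.
func buildGroupHandlers(groups map[string]*prometheus.Registry) map[string]http.Handler {
	handlers := make(map[string]http.Handler)
	for name, reg := range groups {
		handlers[name] = promhttp.InstrumentMetricHandler(
			reg, promhttp.HandlerFor(reg, promhttp.HandlerOpts{}))
	}
	return handlers
}

// groupDispatcher serves the default handler unless ?groups=<name> names a
// pre-built group handler.
func groupDispatcher(def http.Handler, handlers map[string]http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if g := r.URL.Query().Get("groups"); g != "" {
			if h, ok := handlers[g]; ok {
				h.ServeHTTP(w, r)
				return
			}
			http.Error(w, "unknown group: "+g, http.StatusNotFound)
			return
		}
		def.ServeHTTP(w, r)
	})
}

func main() {
	groups := map[string]*prometheus.Registry{"expensive": prometheus.NewRegistry()}
	http.Handle("/metrics", groupDispatcher(promhttp.Handler(), buildGroupHandlers(groups)))
	http.ListenAndServe(":9630", nil)
}
```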

Another option is to use Gatherers to group registries dynamically per request: https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#Gatherers. That would make more sense for querying via arbitrary sets of metric tags, but Gatherers performs weaker consistency checking than a single Registry, so using pre-built registries seems safer.
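A sketch of the Gatherers variant, assuming one registry per tag and a hypothetical tags parameter; prometheus.Gatherers merges the outputs, but its checking is weaker than registering everything on a single Registry:

```go
package main

import (
	"net/http"
	"strings"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// tagsHandler combines the registries of the requested tags into a single
// Gatherers value and serves that combination for this request only.
func tagsHandler(byTag map[string]*prometheus.Registry) http.Handler {
	def := promhttp.Handler() // default registry when no ?tags= is given
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		q := r.URL.Query().Get("tags")
		if q == "" {
			def.ServeHTTP(w, r)
			return
		}
		var g prometheus.Gatherers
		for _, t := range strings.Split(q, ",") {
			if reg, ok := byTag[t]; ok {
				g = append(g, reg)
			}
		}
		promhttp.HandlerFor(g, promhttp.HandlerOpts{}).ServeHTTP(w, r)
	})
}

func main() {
	byTag := map[string]*prometheus.Registry{
		"cheap":     prometheus.NewRegistry(),
		"expensive": prometheus.NewRegistry(),
	}
	http.Handle("/metrics", tagsHandler(byTag))
	http.ListenAndServe(":9630", nil)
}
```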

Edit: I see that pg_exporter uses a single prometheus.Collector implementation of type Exporter, which iterates over each Server, and each Server in turn iterates over its Collectors to run their Queries. Since the entire scrape is done in one imperative pass, there's no easy way to use multiple registries, promhttp.HandlerFor, etc. It might be preferable for pg_exporter to delegate the scrape through the Prometheus APIs in future, but for now that approach won't work.

I'll look at a wrapper http handler instead.

The Prometheus client API is not great here: it doesn't pass context.Context into Collect, for example, so it's hard to pass call-specific information down in a race-free manner that tolerates concurrent scrapes. But it looks like pg_exporter already uses locking to restrict concurrent scrapes, so perhaps putting the filter in Exporter state is OK for now.
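That workaround might look roughly like this; the Exporter fields and the tags parameter are hypothetical, and the sketch relies on the mutex serialising scrapes:

```go
package main

import (
	"net/http"
	"sync"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Exporter is a hypothetical stand-in for pg_exporter's Exporter type; only
// the pieces relevant to the workaround are shown.
type Exporter struct {
	mu        sync.Mutex
	tagFilter []string // read during Collect() while the scrape lock is held
}

// metricsHandler stashes the request's filter on the Exporter for the duration
// of the scrape, since promhttp never exposes the http.Request to Collect().
func (e *Exporter) metricsHandler(inner http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		e.mu.Lock() // serialise scrapes so the filter cannot race between requests
		e.tagFilter = r.URL.Query()["tags"]
		defer func() {
			e.tagFilter = nil
			e.mu.Unlock()
		}()
		// In the real exporter, inner -> Registry.Gather -> Exporter.Collect runs
		// on this goroutine and can read e.tagFilter without re-locking.
		inner.ServeHTTP(w, r)
	})
}

func main() {
	e := &Exporter{}
	// In the real exporter, e would also be registered with the registry behind inner.
	http.Handle("/metrics", e.metricsHandler(promhttp.Handler()))
	http.ListenAndServe(":9630", nil)
}
```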

ringerc commented 4 weeks ago

Raised https://github.com/prometheus/client_golang/issues/1538 against the Prometheus client. If client_golang propagated context.Context from the http.Request through to Collect(...), this would all be much cleaner and easier.

As it is, it looks like a custom http.Handler for /metrics is needed to wrap the promhttp.Handler, and either:

It's so ugly that I feel like I must be missing some obvious way this can be done correctly. There doesn't seem to be any sane, concurrency-safe way to pass request context through promhttp to the Collector handling the request.

But here's the working-if-ugly PoC branch https://github.com/ringerc/pg_exporter_patches/tree/pr-selective-scrape

ringerc commented 3 weeks ago

I haven't forgotten this, and will try to clean up and upstream my patch when I get the chance. Just under the pump for some internal work.

Vonng commented 2 weeks ago

;) I'll make an alpha release, v0.7.0-a1; once it's merged and tested, 0.7.0 will be released. I've already invited you as a collaborator, so feel free to commit & merge to master. 😊

ringerc commented 2 weeks ago

Thanks. As it stands my working branch is way too dirty to merge. I'll try to get back to cleaning it up and finishing it off.