Closed: ringerc closed this 5 months ago
I think predicate query is an excellent idea! It could bring a lot of flexibility.
Here are some of my rough thoughts: add a `predicate` field to the Collector definition. If this field is present, the corresponding predicate query will be executed and the branch logic applied. Personally, I feel that maintaining a separate set of reusable predicate queries might be redundant, because most default Collectors should use the existing mechanism.

Of course, these are just my rough ideas. I'm very open to further discussion, and I would greatly welcome any implementations or contributions. 😊
I've written a proof of concept PR for this: https://github.com/Vonng/pg_exporter/pull/47
I'm looking at adopting `pg_exporter` as a potential replacement for the CloudNative-PG built-in scraper I currently use. It looks pretty impressive, other than the lack of test coverage and test facilities.

One of the few things it doesn't do that the CNPG scraper does is permit the use of a generic boolean expression query to decide "should I run this metric query or not" (added by https://github.com/cloudnative-pg/cloudnative-pg/pull/4503). `pg_exporter` has tags, but the tags must be specified from the outside by whatever provisions the service, and they apply to all DBs being scraped.

I'm wondering if you'd be open to the submission of a patch to implement that, assuming that my initial PoC with this scraper finds it otherwise a good fit for my needs.
I need to express things like "collect this metric set if the foo extension is installed and is at least version 2.1", "collect this metric only when running on (some specific vendor postgres flavour)" or "only try to collect this if this specific table exists and has this specific column".
This isn't possible in plain Postgres SQL dialect due to its lack of bare procedural blocks and the inability of a `DO` block to return a result set. Hacks with refcursors can be used to get a result from a `DO` block, but only with multiple statements, which this scraper doesn't support (and probably shouldn't, as it opens a whole can o' worms).

I propose an extension to the syntax like:
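For illustration, a guarded collector might look something like this (the `predicate_queries` key and its shape here are my guesses at a possible syntax, not necessarily what the PoC PR implements):

```yaml
pg_my_metric:
  name: pg_my_metric
  desc: example metric guarded by a predicate
  # proposed new field: run these queries first; skip the collector
  # unless every one of them returns a single boolean true
  predicate_queries:
    - name: extension_present
      predicate_query: |
        SELECT count(*) > 0 FROM pg_catalog.pg_extension
        WHERE extname = 'my_extension'
  query: SELECT now() AS example_value
```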
The predicate queries would be run in the order specified. If any returns zero rows or a single-column boolean false, metric collection is skipped. If it returns true, evaluation proceeds to the next predicate (if any), or to collecting the metric. A multi-column or multi-row result logs an error the first time it occurs, then disables the metric for the lifetime of the process.
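The skip/proceed/error decision above is simple enough to sketch as a pure function (a hypothetical helper, not pg_exporter's actual code; `rows` stands in for an already-scanned predicate result set):

```go
package main

import "fmt"

// evaluatePredicate applies the proposed rules to a predicate query result.
// Each inner slice is one row's column values. Returns proceed=true when
// collection should continue to the next predicate or the metric itself,
// or a non-nil error for malformed results (which, per the proposal, would
// disable the metric for the lifetime of the process).
func evaluatePredicate(rows [][]interface{}) (proceed bool, err error) {
	if len(rows) == 0 {
		return false, nil // zero rows: skip collection
	}
	if len(rows) > 1 {
		return false, fmt.Errorf("predicate returned %d rows, want at most 1", len(rows))
	}
	row := rows[0]
	if len(row) != 1 {
		return false, fmt.Errorf("predicate returned %d columns, want exactly 1", len(row))
	}
	b, ok := row[0].(bool)
	if !ok {
		return false, fmt.Errorf("predicate returned non-boolean value %v", row[0])
	}
	return b, nil
}

func main() {
	proceed, _ := evaluatePredicate([][]interface{}{{true}})
	fmt.Println(proceed) // true: proceed
	proceed, _ = evaluatePredicate(nil)
	fmt.Println(proceed) // false: zero rows, skip
	_, err := evaluatePredicate([][]interface{}{{true, false}})
	fmt.Println(err != nil) // true: multi-column result is an error
}
```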
The purpose of multiple queries is to permit things like (a bit contrived):
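A sketch of such a chain (all object and key names hypothetical): the first predicate establishes that a vendor-specific table exists at all, so the second predicate can safely name it:

```yaml
predicate_queries:
  - name: vendor_settings_table_exists
    predicate_query: |
      SELECT count(*) > 0
      FROM pg_catalog.pg_class c
      JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
      WHERE n.nspname = 'vendor' AND c.relname = 'settings'
  - name: vendor_feature_enabled
    # only reached if the previous predicate passed, so the table name
    # below is guaranteed to resolve at parse/plan time
    predicate_query: |
      SELECT enabled FROM vendor.settings WHERE feature = 'fancy_stats'
```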
where it's first necessary to know that some database object exists before attempting to access it, since Pg requires that names resolve at parse/plan time, so you can't lazy-bind possibly-undefined names in e.g. a `CASE`.

Predicate query results would support the `ttl` key and, if set, use the cache. If the main metric query reports an error, that would automatically invalidate the predicate query cache.

I'm also tempted to add separate positive and negative cache TTLs, since it's common for a metric to be initially not scrape-able due to an extension not yet being created, but once it's scrape-able once it stays that way. It's probably unnecessary though, and exponential backoff might be better if it proves to be required.
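Concretely, a cached predicate could look like this (the nested `ttl` placement is a guess at a possible shape):

```yaml
predicate_queries:
  - name: extension_present
    # cache the predicate result for an hour; invalidated early if
    # the main metric query errors
    ttl: 3600
    predicate_query: |
      SELECT count(*) > 0 FROM pg_catalog.pg_extension
      WHERE extname = 'my_extension'
```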
If I get the chance I may be able to add named query indirection so one predicate query set can be used for multiple scrapers to reduce verbosity. I'm not sure if it'll be worth it, will see.
Thoughts? The in-house implementation of this logic turned out to be very simple to do, and should translate well to this scraper.