Closed wbollock closed 3 weeks ago
cc: @prymitive let me know what you think. happy to contribute this
As mentioned in https://github.com/cloudflare/pint/pull/985#issuecomment-2136889028 I don't believe that just throwing warnings every time someone uses topk
is a valid approach.
Churn in recording rules might be undesired but there's nothing fundamentally wrong with it.
With alerting rules it's different, I agree that if someone has a rule like:
- alert: ...
expr: topk(10, foo)
then it is likely to cause flapping alerts, since topk()
might return different time series on each evaluation. But one can stabilise the results by stripping churning labels, for example by stripping all labels and reporting min or max value: min(topk(10, foo))
.
So this would need a bit more advanced logic to find only queries that "leak" labels from topk() into final results with no constrains and don't use for:
or keep_firing_for:
to avoid churn.
I need a function that gives me the source of labels in a query and this check is a good use case for testing it - so expect it to be added in the next release.
I believe a new check to warn users about the use of
topk
orbottomk
in recording rules would be useful, as those rules will typically churn quite often and create many new time series.This is fairly opinionated and could vary a lot of on the nature of the metric being aggregated so interested in discussion and definitely think it fits as a non-default rule