Open gillg opened 2 years ago
hello,
thanks for your message. I plan to propose optionally backing up Prometheus config with a consistent storage, e.g. etcd, at the next dev summit. We'll see what happens then but it could be a first step to address this issue.
Great! That's true, it could work better for a k8s model where written config files will not persist...
Why not also use a dedicated rules folder in the data dir as a first approach? We know the data will persist in any architecture. And if for any reason that's not the case, you just use regular configs and not "remote rules management".
An additional flag --alerts.remote-management-enabled=true would be useful.
I'd like to avoid mangling files on disk. If we were to support something like this, it would be hard to remove it later.
I think it makes more sense to go directly to a database rather than letting the users access the local filesystem.
Sounds good.
I'd like to know whether there is any roadmap to natively support a POST/PUT/GET/DELETE API for rules in Prometheus. Managing rule files is hard and inconvenient.
@roidelapluie
Proposal
Is your proposal related to a problem?
Since v8, Grafana has offered a very useful alerting UI for end users. Behind the scenes it relies on an internal Alertmanager that cannot receive alerts from outside, on the native Alertmanager APIs to manage a remote instance, and on some Cortex APIs, so that all alerting can be managed from one place.
Using Grafana to evaluate alerting rules is a SPOF and sits late in the alerting stack, so you can also use this API natively: https://cortexmetrics.io/docs/api/#set-rule-group Are there any plans to implement this endpoint?
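For readers unfamiliar with the Cortex endpoint linked above, here is a minimal sketch of what a "set rule group" request looks like, as I understand the linked docs: a POST to /api/v1/rules/{namespace} with a YAML rule group as the body. The host name, namespace and rule content are made-up examples, and the request is only built, not sent. A multi-tenant Cortex setup would also need a tenant header, which is not shown here.

```python
import urllib.request

# Hypothetical ruler address, for illustration only.
BASE_URL = "http://cortex.example:8080"

# Example rule group body (illustrative content, not taken from the issue).
RULE_GROUP_YAML = """\
name: example-group
rules:
  - alert: InstanceDown
    expr: up == 0
    for: 5m
"""


def build_set_rule_group_request(base_url, namespace, body):
    """Build (but do not send) a POST that creates or updates a rule group."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/rules/{namespace}",
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/yaml"},
    )


req = build_set_rule_group_request(BASE_URL, "team-a", RULE_GROUP_YAML)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`; the point is only that the whole rule group travels as one YAML document scoped to a namespace.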
Describe the solution you'd like
The best option would be to handle the POST and DELETE verbs on the existing rules API, /api/v1/rules. We should be able to configure the Prometheus config file to load rules from a folder like /etc/prometheus/remote-rules.d/*.yml.
As a command-line argument we could have --rules.remote-management.store-path=/etc/prometheus/remote-rules.d/ and then one YAML file is created per "namespace" (see the Cortex API doc).
Describe alternatives you've considered
The only homemade alternative is a reverse proxy that handles POST / DELETE on /api/v1/rules, plus a tool in charge of creating files in a /etc/prometheus/remote-rules.d/*.yml folder.
This proposal partially duplicates https://github.com/thanos-io/thanos/issues/5168 but both are useful if you want to dispatch alerts at different levels.
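To make the homemade alternative above more concrete, here is a rough sketch of the "tool in charge of creating files" part: a small store that keeps one YAML file per namespace in the rules folder. The class and method names are hypothetical, the HTTP side (the reverse proxy) is left out, and the body is written as-is without validating that it is a well-formed rules file.

```python
import os
import re
import tempfile
from pathlib import Path

# Only allow simple names so a namespace can never escape the store folder.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


class RuleFileStore:
    """Hypothetical helper: one <namespace>.yml file per namespace."""

    def __init__(self, store_path):
        self.store_path = Path(store_path)
        self.store_path.mkdir(parents=True, exist_ok=True)

    def _path_for(self, namespace):
        if not _SAFE_NAME.match(namespace) or namespace in (".", ".."):
            raise ValueError(f"invalid namespace: {namespace!r}")
        return self.store_path / f"{namespace}.yml"

    def put(self, namespace, rules_yaml):
        """Create or overwrite the file for a namespace (what a POST would do)."""
        target = self._path_for(namespace)
        # Write to a temp file first, then rename, so a reader never sees
        # a half-written rules file. The .tmp suffix keeps it out of *.yml.
        fd, tmp = tempfile.mkstemp(dir=str(self.store_path), suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(rules_yaml)
        os.replace(tmp, target)
        return target

    def delete(self, namespace):
        """Remove the file for a namespace (what a DELETE would do)."""
        target = self._path_for(namespace)
        try:
            target.unlink()
            return True
        except FileNotFoundError:
            return False
```

With something like `RuleFileStore("/etc/prometheus/remote-rules.d").put("team-a", body)`, the file would match the rule_files glob from the proposal. Prometheus would still need a config reload (SIGHUP, or the /-/reload endpoint when the lifecycle API is enabled) before it picks up the change.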