Open rosogon opened 3 years ago
This relates to the problems raised by a multi-tenant Prometheus. https://github.com/cherti/PromAuthProxy is a project that could help with that.
The other alternative is to modify the approach and use one Prometheus per deployment.
I've tested this under Kubernetes for the Edge cases as well, and it worked: the Prometheus config is placed in a Kubernetes ConfigMap, and an injected monitoring sidecar sends a POST to the Prometheus server's config reload endpoint whenever the configuration changes. The process is roughly described here: https://www.weave.works/blog/prometheus-configmaps-continuous-deployment/
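To make the sidecar idea concrete, here is a minimal sketch of such a watcher, not the sidecar actually used in the test. It assumes the ConfigMap is mounted as a file, that Prometheus is reachable at `localhost:9090` from the sidecar, and that Prometheus was started with `--web.enable-lifecycle` so that `POST /-/reload` is accepted. The names `file_digest`, `poll_once`, `watch` and `trigger_reload` are made up for this example.

```python
import hashlib
import time
import urllib.request
from pathlib import Path

# Assumed address: sidecar and Prometheus share the pod network namespace.
RELOAD_URL = "http://localhost:9090/-/reload"


def file_digest(path):
    """Return a SHA-256 hex digest of the file's current content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def trigger_reload(url=RELOAD_URL, timeout=5.0):
    """POST to the reload endpoint and return the HTTP status code."""
    req = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def poll_once(path, last_digest, reload):
    """Call reload() if the file content changed; return the new digest."""
    current = file_digest(path)
    if current != last_digest:
        reload()
    return current


def watch(path, reload=trigger_reload, interval=5.0):
    """Poll the config file forever and reload Prometheus on every change."""
    last = file_digest(path)
    while True:
        time.sleep(interval)
        last = poll_once(path, last, reload)
```

Running `watch("/etc/prometheus/prometheus.yml")` in the sidecar container would then keep Prometheus in sync with the ConfigMap (the path is an assumption). Hashing the content rather than checking the modification time is a deliberate choice here, since Kubernetes updates ConfigMap mounts by swapping symlinks.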
At the moment, the ruleserver offers a REST API to add alerts to the Prometheus server and remove them. Its functioning is described in the readme.
The M18 Rule-based refactorer used static rules to alert on high|load CPU usage. These should be dynamic, in the sense that a new application should add its own rules. According to https://prometheus.io/docs/prometheus/latest/configuration/configuration/, rule_files is a list of file globs, so this could be addressed as:
Still, the rule files need to be generated from the application SLA, but that is to be addressed in another ticket.
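As an illustration of the rule_files glob idea, the sketch below writes one rule file per application into a directory that Prometheus could load with a glob. It is only a sketch under assumptions: the directory layout, the `*.rules.yml` naming, the alert name suffix and the function names are all hypothetical, and the alert expression is passed in by the caller rather than derived from an SLA. It assumes prometheus.yml contains something like `rule_files: ["/etc/prometheus/rules/*.rules.yml"]`.

```python
import json
import os
import re
from pathlib import Path

# Assumption: application names are plain identifiers, so they are safe to
# use both in a file name and in an alert name.
APP_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def render_rule_file(app, expr, duration="5m", severity="warning"):
    """Render a Prometheus rule file with one alerting rule for `app`."""
    return (
        "groups:\n"
        f"  - name: {app}\n"
        "    rules:\n"
        f"      - alert: {app}_sla_violation\n"
        # json.dumps yields a double-quoted string, which YAML also accepts.
        f"        expr: {json.dumps(expr)}\n"
        f"        for: {duration}\n"
        "        labels:\n"
        f"          severity: {severity}\n"
        f"          application: {app}\n"
    )


def write_rule_file(rules_dir, app, expr, **kwargs):
    """Write <rules_dir>/<app>.rules.yml and return its path."""
    if not APP_NAME.match(app):
        raise ValueError(f"invalid application name: {app!r}")
    target = Path(rules_dir) / f"{app}.rules.yml"
    # Write to a temporary name that the glob does not match, then rename,
    # so a reload never sees a half-written file.
    tmp = target.with_name(target.name + ".tmp")
    tmp.write_text(render_rule_file(app, expr, **kwargs))
    os.replace(tmp, target)
    return target
```

For example, `write_rule_file("/etc/prometheus/rules", "shop", "rate(http_errors_total[5m]) > 0.1")` would create `shop.rules.yml`, where `http_errors_total` is a made-up metric. Prometheus only picks up new or changed rule files after a configuration reload, so a reload would still have to follow each write.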