rodionovid opened this issue 1 year ago
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Did you manage to figure it out @rodionovid?
I'm facing the exact same issue here.
@harisrg, actually I found a solution. Thank you for reminding me about this post. I will use my example from above with a single test rule. In order to apply this rule to Prometheus you need to create a YAML file (in my case values.yaml) with the following content:
additionalPrometheusRulesMap:
  rule-name:
    groups:
      - name: Node
        rules:
          - alert: HostOutOfMemory
            expr: (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 50) * on(instance) group_left (nodename) node_uname_info{nodename=~".+"}
            for: 2m
            labels:
              severity: warning
            annotations:
              summary: Host out of memory (instance {{ $labels.instance }})
              description: "Node memory is filling up (< 50% left)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
After that you need to upgrade your chart using this file:
helm upgrade prometheus -f values.yaml prometheus-community/kube-prometheus-stack -n prometheus-community
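To double-check that the rules actually reached the cluster, one option (assuming the chart turns additionalPrometheusRulesMap into a PrometheusRule object in the release namespace, as it normally does) is to list the PrometheusRule resources:

kubectl get prometheusrules -n prometheus-community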
As a result, a single group of rules called "Node" containing a single alert rule called "HostOutOfMemory" will be created. One remark: each time you upgrade the helm chart, other parameters (admin password, SMTP configs, ...) may be overridden by their default values. To avoid that, you need to specify these parameters in values.yaml or in another file that you apply while upgrading your chart. To find out how to specify them, get the full list of overridable parameters for the chart by executing the following command:
helm show values prometheus-community/kube-prometheus-stack > default-values.yaml
It generates a default-values.yaml file that contains all overridable parameters. After that you need to search this file for the settings you want to change. It may take a lot of effort because there are tons of these settings. Also, there is no naming standard - it depends entirely on the chart developers and what they decided to call a setting.
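For example, a values.yaml that carries a couple of such settings together with the extra rules could look roughly like this (the grafana.adminPassword and alertmanager.config keys are just illustrative picks, and the values are made-up placeholders, not anything from this thread):

# Keep settings you have customised, so a later `helm upgrade` does not reset them
grafana:
  adminPassword: "my-grafana-password"        # placeholder

alertmanager:
  config:
    global:
      smtp_smarthost: "smtp.example.com:587"  # placeholder

additionalPrometheusRulesMap:
  rule-name:
    groups:
      - name: Node
        rules: []   # the extra rules from above go here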
That should work for creating a new rule, but how about changing the existing ones?
UPDATE: I see that there is a flag to disable rules on a case-by-case basis. As helm does not merge rules, there are not many options for updating existing rules besides disabling the "stock" one and re-adding it as needed, the way it is noted above, yet using another group name (entries cannot be merged into existing groups, I believe).
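For reference, a minimal sketch of that per-rule disable flag in the chart values (the alert names below are the ones used in the chart's own commented examples; treat the exact layout as an assumption for your chart version):

defaultRules:
  create: true
  ## Disable individual stock alerts by name
  disabled:
    KubeAPIDown: true
    NodeRAIDDegraded: true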
In the latest versions of the chart, I found the customRules section in the configuration.
I assume that to override a rule, it's now easier to simply rewrite the rule and set the severity to "none" in the default rule.
defaultRules:
  customRules:
    NodeHighNumberConntrackEntriesUsed:
      severity: "none"
additionalPrometheusRulesMap:
  # rewrite basic rules
  rewrite-rules:
    groups:
      - name: node-exporter
        rules:
          - alert: NodeHighNumberConntrackEntriesUsed
            annotations:
              description: "{{ $value | humanizePercentage }} of conntrack entries are used."
              runbook_url: "https://runbooks.prometheus-operator.dev/runbooks/node/nodehighnumberconntrackentriesused"
              summary: "Number of conntrack are getting close to the limit."
            expr: (node_nf_conntrack_entries{job="node-exporter"} / node_nf_conntrack_entries_limit) > 0.80
            labels:
              severity: warning
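Applying it should be the same helm upgrade as in the earlier comment, e.g. (assuming the same release and namespace names):

helm upgrade prometheus -f values.yaml prometheus-community/kube-prometheus-stack -n prometheus-community

The idea being that the stock alert is downgraded to severity "none" while the rewritten copy in the rewrite-rules group carries the warning severity.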
Hello. I want to add some extra Prometheus alerting rules using helm. I can add these rules manually via the Prometheus or Grafana UI, but this method doesn't suit me. So, the question is: how can I add/update Prometheus alerting rules in a Kubernetes cluster using local YAML files?
I tried to upgrade the prometheus release using the helm upgrade command. For this purpose I created a local configuration file prometheus.yaml and copied into it the Prometheus configuration from the Prometheus UI (Status/Configuration section in the navigation panel). I also added the path to my local alert_config.yaml to the rule_files section:
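The addition was something along these lines (the path here is only an illustrative placeholder):

rule_files:
  - alert_config.yaml   # placeholder path to the local file with the extra rule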
The file alert_config.yaml is pretty simple and contains a single additional rule:
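For reference, it was essentially the same single HostOutOfMemory test rule shown earlier in this thread:

groups:
  - name: Node
    rules:
      - alert: HostOutOfMemory
        expr: (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 50) * on(instance) group_left (nodename) node_uname_info{nodename=~".+"}
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: Host out of memory (instance {{ $labels.instance }})
          description: "Node memory is filling up (< 50% left)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"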
Then I executed helm upgrade:
helm upgrade -f prometheus.yaml prometheus prometheus-community/kube-prometheus-stack -n prometheus-community
And got successful output:
But after that the new Prometheus alert rule didn't appear. Recreating the pods using the deprecated helm flag --recreate-pods or by manual scaling didn't help either. I will appreciate any help with this question.