Closed slashpai closed 6 months ago
@slashpai: This pull request references MON-3621 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.
/skip
/retest-required
@simonpasquier I addressed the comment.
/retest-required
/retest-required
@simonpasquier I tested the change in cluster-bot and added the findings as comments on https://issues.redhat.com/browse/MON-3256
/cc @Tai-RedHat
/test e2e-agnostic-operator
@simonpasquier can you review again?
@slashpai Hi, when I test this PR with cluster-bot, I can see
% oc -n openshift-user-workload-monitoring get prometheus user-workload -ojsonpath='{.spec.enableFeatures}' |jq
[
  "extra-scrape-metrics"
]
But when I follow your steps and apply the PrometheusRule, it shows:
The "prometheusrules" is invalid: : group "general.rules", rule 2, "ApproachingEnforcedSamplesLimit": annotation "message": template: __alert_ApproachingEnforcedSamplesLimit:1: unexpected "|" in command
Did I use the correct config?
% oc -n ns1 apply -f -<<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: monitoring-stack-alerts
  namespace: ns1
spec:
  groups:
  - name: general.rules
    rules:
    - alert: TargetDown
      annotations:
        message: '{{ printf "%.4g" $value }}% of the {{ $labels.job }}/{{ $labels.service
          }} targets in {{ $labels.namespace }} namespace are down.'
      expr: 100 * (count(up == 0) BY (job, namespace, service) / count(up) BY (job,
        namespace, service)) > 10
      for: 10m
      labels:
        severity: warning
    - alert: ApproachingEnforcedSamplesLimit
      annotations:
        message: '{{ $labels.container }} container of the {{ $labels.pod }} pod in the {{ $labels.namespace }} namespace consumes {{ $value | humanizePercentage }} of the samples limit budget.'
      expr: (scrape_samples_post_metric_relabeling/(scrape_sample_limit > 0)) > 0.9
      for: 10m
      labels:
        severity: warning
EOF
Can you put the contents in a file and try again? I think that when there is a $ in the manifest, the shell may not pass it through correctly.
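For reference, a minimal sketch of the likely cause. It assumes the variable `value` is empty or unset in the shell that runs the heredoc, and it uses `cat` as a stand-in for `oc apply -f -`. With an unquoted `EOF` delimiter the shell expands `$value` before the manifest is read, so the template loses its argument ahead of the `|`. This is consistent with the `unexpected "|" in command` error reported above. Quoting the delimiter keeps the body literal.

```shell
#!/bin/sh
# Assumption: "value" is empty here, which is how an unset variable
# would expand in a typical interactive shell.
value=""

# Unquoted delimiter (EOF): the shell performs parameter expansion on the
# heredoc body, so $value is replaced before cat (or oc) ever sees it.
unquoted=$(cat <<EOF
message: '{{ $value | humanizePercentage }}'
EOF
)

# Quoted delimiter ('EOF'): the body is passed through unchanged.
quoted=$(cat <<'EOF'
message: '{{ $value | humanizePercentage }}'
EOF
)

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Under the same assumption, either `oc -n ns1 apply -f - <<'EOF'` or saving the manifest to a file and applying it with `oc -n ns1 apply -f` followed by the file name should avoid the expansion.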
@slashpai it works now. I will add the QE label after @simonpasquier reviews again.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: simonpasquier, slashpai
/label qe-approved
@slashpai: This pull request references MON-3621 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.
/jira refresh
@slashpai: This pull request references MON-3621 which is a valid jira issue.
@slashpai: all tests passed!
[ART PR BUILD NOTIFIER]
This PR has been included in build cluster-monitoring-operator-container-v4.16.0-202404120544.p0.g7f498b4.assembly.stream.el9 for distgit cluster-monitoring-operator. All builds following this will include this PR.
Update Prometheus user-workload to enable additional scrape metrics, as part of epic MON-3256.