tr3mor opened this issue 3 weeks ago
Hey, yes, it was introduced to have correct metrics independent of the number of pods.
If you need to customize this behavior, I would appreciate your contributions.
Hey, I was able to achieve what I wanted with the current version of the chart, but it required manual changes to the dashboards. Not sure if you want to make these changes the defaults or just keep this as a workaround. I did:
```yaml
serviceMonitor:
  relabelings:
    - action: replace
      sourceLabels:
        - __meta_kubernetes_pod_name
      targetLabel: pod
```
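For context, since the ServiceMonitor template lives in the `monitoring` subchart, the same override would sit under the `monitoring` key when installing the umbrella `policy-reporter` chart (key path assumed from standard Helm subchart value scoping):

```yaml
# Hypothetical values.yaml fragment for the parent policy-reporter chart;
# the monitoring subchart reads this as .Values.serviceMonitor.relabelings.
monitoring:
  serviceMonitor:
    relabelings:
      - action: replace
        sourceLabels:
          - __meta_kubernetes_pod_name
        targetLabel: pod
```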
and replacing

```
sum(policy_report_result{policy=~"$policy", category=~"$category", severity=~"$severity", source=~"$source", kind=~"$kind", exported_namespace=~"$namespace" } > 0) by (status, exported_namespace)
```

with

```
max(sum(policy_report_result{policy=~"$policy", category=~"$category", severity=~"$severity", source=~"$source", kind=~"$kind", exported_namespace=~"$namespace" } > 0) by (status, exported_namespace, pod)) by (status, exported_namespace)
```
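To see why the `max(... by (..., pod)) by (...)` wrapper deduplicates, here is a small sketch (plain Python, not PromQL; sample values are made up) of the two aggregation stages when two replicas export identical counts:

```python
# Sketch of the max-over-pod pattern: both replicas export the same
# counts, so a plain sum() would double every value; summing per pod
# first and then taking max() over the pod dimension keeps one copy.
from collections import defaultdict

samples = [
    # (status, exported_namespace, pod, value) -- hypothetical data
    ("fail", "default", "policy-reporter-0", 3),
    ("fail", "default", "policy-reporter-1", 3),
    ("pass", "kube-system", "policy-reporter-0", 7),
    ("pass", "kube-system", "policy-reporter-1", 7),
]

# Inner aggregation: sum(...) by (status, exported_namespace, pod)
inner = defaultdict(int)
for status, ns, pod, value in samples:
    inner[(status, ns, pod)] += value

# Outer aggregation: max(...) by (status, exported_namespace)
# collapses the pod dimension instead of adding the duplicates up.
outer = {}
for (status, ns, pod), value in inner.items():
    key = (status, ns)
    outer[key] = max(outer.get(key, 0), value)

print(dict(outer))
# {('fail', 'default'): 3, ('pass', 'kube-system'): 7}
```

With a single replica the outer `max` is a no-op, which is why the dashboard change is backward compatible.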
While the Grafana change is fully backward compatible (it works the same if you don't have the pod label) and works with any number of pods, I don't believe the pod label should be kept by default, since that might break a lot of dashboards/alerts people have created. So I see three options here:

1. making the `serviceMonitor.relabelings` override above the chart default;
2. documenting `.serviceMonitor.relabelings` as a supported customization;
3. keeping it as a workaround with manual changes to the Grafana dashboards (if you don't expect people to use it).

Please let me know which option sounds better for you. I can bring a PR if needed =)
Hey, thanks a lot for your effort and the details about possible solutions, much appreciated. I will take a deeper look in the next few days and get back to you.
I think we could introduce a "breaking change" in the 3.x major version and update the metrics and dashboards accordingly to keep the pod label, and also make use of the other Go metrics about the actual policy-reporter pods.
Hello, we are using this exporter in HA mode (2 replicas) and scrape metrics using Prometheus (installed with kube-prometheus-stack). We are seeing the following warnings in the Prometheus logs due to duplicated metrics.
The main reason for this is that the ServiceMonitor drops every label that distinguishes the two targets: https://github.com/kyverno/policy-reporter/blob/35175775bb9c0b2c76acf336349b382eeabc42a9/charts/policy-reporter/charts/monitoring/templates/servicemonitor.yaml#L49-L57

And since the `labeldrop` rules are declared before `.Values.serviceMonitor.relabelings` is applied, the user can't override this behavior. Looking at the code, I guess it was done this way so the dashboards work with any number of pods, but it triggers alerts on the kube-prometheus-stack side: https://github.com/prometheus-community/helm-charts/blob/68ba986b2a6283efd3f743f0cf7859d93b615b64/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/prometheus.yaml#L339

I understand how the current behaviour works for most cases, but I would like the ability to override or disable the `labeldrop`. I can create a PR if needed. Thank you!
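The duplication mechanism can be sketched as follows (plain Python mimicking a `labeldrop` relabeling; the exact dropped label names are illustrative, not copied from the chart):

```python
# Sketch: once the labels that distinguish the two scrape targets are
# dropped, both replicas emit samples with identical label sets, and
# Prometheus logs "duplicate sample for timestamp" warnings.

def drop_labels(series, dropped):
    """Mimic a 'labeldrop' relabeling: remove the listed label names."""
    return {k: v for k, v in series.items() if k not in dropped}

replica_a = {"__name__": "policy_report_result",
             "pod": "policy-reporter-0", "status": "fail"}
replica_b = {"__name__": "policy_report_result",
             "pod": "policy-reporter-1", "status": "fail"}

dropped = {"pod", "instance"}  # illustrative label names
a = drop_labels(replica_a, dropped)
b = drop_labels(replica_b, dropped)

# Identical label sets from two targets -> duplicate series.
assert a == b
print(a)  # {'__name__': 'policy_report_result', 'status': 'fail'}
```

This is also why the override has to run before (or instead of) the `labeldrop`: once the distinguishing label is gone, no later relabeling can restore it.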