kyverno / policy-reporter

Monitoring and Observability Tool for the PolicyReport CRD with an optional UI.
https://kyverno.github.io/policy-reporter/
MIT License

Prometheus: Out-of-order samples #471

Open tr3mor opened 3 weeks ago

tr3mor commented 3 weeks ago

Hello, we are running this exporter in HA mode (2 replicas) and scraping metrics with Prometheus (installed via kube-prometheus-stack). We are seeing the following warnings in the Prometheus logs due to duplicated metrics:

ts=2024-08-16T12:59:56.213Z caller=scrape.go:1741 level=warn component="scrape manager" scrape_pool=serviceMonitor/kyverno-system/kyverno-policy-reporter-monitoring/0 target=http://172.26.2.131:8080/metrics msg="Error on ingesting out-of-order samples" num_dropped=1776

The main reason is that the ServiceMonitor drops any label that would distinguish the two targets: https://github.com/kyverno/policy-reporter/blob/35175775bb9c0b2c76acf336349b382eeabc42a9/charts/policy-reporter/charts/monitoring/templates/servicemonitor.yaml#L49-L57

And since the labeldrop rule is declared before .Values.serviceMonitor.relabelings is applied, users can't override this behavior from their values. Looking at the code, I guess it was done this way so the dashboards work with any number of pods, but it triggers alerts on the kube-prometheus-stack side: https://github.com/prometheus-community/helm-charts/blob/68ba986b2a6283efd3f743f0cf7859d93b615b64/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/prometheus.yaml#L339

I understand how the current behaviour works for most cases, but I would like the ability to override or disable the labeldrop. I can create a PR if needed. Thank you!
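A paraphrased sketch of the pattern (see the linked template for the exact rules; the regex here is illustrative):

    # servicemonitor.yaml (sketch, not the verbatim chart template):
    # the labeldrop is rendered first, so it runs before any user-supplied
    # rules; a later replace can re-add the label, but the drop itself
    # cannot be disabled from values.
    relabelings:
      - action: labeldrop
        regex: pod          # removes the label distinguishing the two replicas
      {{- with .Values.serviceMonitor.relabelings }}
      {{- toYaml . | nindent 2 }}
      {{- end }}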

fjogeleit commented 3 weeks ago

Hey, yes, it was introduced to keep the metrics correct independent of the number of pods.

If you need to customize this behavior I would appreciate your contributions.

tr3mor commented 3 weeks ago

Hey, I was able to achieve what I wanted with the current version of the chart, but it required manual changes to the dashboards. Not sure if you want to make these changes the default or just keep them as a workaround. I did the following:

  1. Kept the pod name as a label to distinguish data from the 2 pods:

    serviceMonitor:
      relabelings:
        - action: replace
          sourceLabels:
            - __meta_kubernetes_pod_name
          targetLabel: pod
  2. Updated the Grafana dashboards to work with any number of pods (group by pod first, then take the max). For example,

replacing

sum(policy_report_result{policy=~"$policy", category=~"$category", severity=~"$severity", source=~"$source", kind=~"$kind", exported_namespace=~"$namespace" } > 0) by (status, exported_namespace)

with

max(sum(policy_report_result{policy=~"$policy", category=~"$category", severity=~"$severity", source=~"$source", kind=~"$kind", exported_namespace=~"$namespace" } > 0) by (status, exported_namespace, pod)) by (status, exported_namespace)
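To illustrate why the max collapses the duplicates (values are illustrative):

    # Two replicas exporting identical results:
    #   policy_report_result{status="fail", exported_namespace="team-a", pod="pr-0"}  5
    #   policy_report_result{status="fail", exported_namespace="team-a", pod="pr-1"}  5
    #
    # inner sum ... by (status, exported_namespace, pod) -> 5 for each pod
    # outer max ... by (status, exported_namespace)      -> 5 (duplicates collapsed)
    # a plain sum across both pods would double-count    -> 10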

While the Grafana change is fully backward compatible (it works the same if you don't have the pod label) and handles any number of pods, I don't believe the pod label should be kept by default, since that might break a lot of dashboards/alerts people have created. So I see three options here:

  1. Update the Grafana dashboards, and introduce a switch to keep the pod label on metrics to avoid metric duplication (see the sketch after this list).
  2. Update the Grafana dashboards, and add a note on how to keep the pod label using serviceMonitor.relabelings.
  3. Change nothing, and document how this can be achieved with serviceMonitor.relabelings and manual changes to the Grafana dashboards (if you don't expect many people to need it).
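For option 1, a rough sketch of what the switch could look like (the keepPodLabel value is hypothetical, not an existing chart option):

    # values.yaml (hypothetical):
    serviceMonitor:
      keepPodLabel: true   # opt out of the default labeldrop

    # servicemonitor.yaml template excerpt (sketched):
    {{- if not .Values.serviceMonitor.keepPodLabel }}
      - action: labeldrop
        regex: pod
    {{- end }}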

Please let me know which option sounds best to you. I can open a PR if needed =)

fjogeleit commented 3 weeks ago

Hey, thanks a lot for your effort and the detailed solutions, I really appreciate it. I will take a deeper look in the next few days and get back to you.

fjogeleit commented 1 week ago

I think we could introduce a "breaking change" in the 3.x major version and update the metrics and dashboards accordingly to keep the pod label, which would also let us make use of the other Go metrics from the actual policy-reporter pods.