kubernetes-monitoring / kubernetes-mixin

A set of Grafana dashboards and Prometheus alerts for Kubernetes.
Apache License 2.0

Warning many-to-many matching not allowed #119

Open rca0 opened 5 years ago

rca0 commented 5 years ago
level=warn ts=2018-11-09T12:24:37.99045291Z caller=manager.go:343 component="rule manager" group=kubernetes msg="Evaluating rule failed" rule="record: namespace_name:kube_pod_container_resource_requests_memory_bytes:sum\nexpr: sum by(namespace, label_name) (sum by(namespace, pod) (kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"})\n  * on(namespace, pod) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n  \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"

I get this warning on Prometheus v2.3.2. I've changed the expressions for kube_pod_container_resource_requests_memory_bytes, kube_pod_container_resource_requests_cpu_cores and node_num_cpu to use `ignoring` instead of `on`. This is the code:

    - record: node:node_num_cpu:sum
      expr: count by (node) (sum by (node, cpu) (node_cpu_seconds_total{job="node-exporter"} * ignoring (namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:))

    - record: "namespace_name:kube_pod_container_resource_requests_memory_bytes:sum"
      expr: sum by (namespace, label_name) (sum(kube_pod_container_resource_requests_memory_bytes{job="kube-state-metrics"}) by (namespace, pod) * ignoring (namespace, pod) group_left(label_name) label_replace(kube_pod_labels{job="kube-state-metrics"}, "pod_name", "$1", "pod", "(.*)"))

    - record: "namespace_name:kube_pod_container_resource_requests_cpu_cores:sum"
      expr: sum by (namespace, label_name) (sum(kube_pod_container_resource_requests_cpu_cores{job="kube-state-metrics"} and on(pod) kube_pod_status_scheduled{condition="true"}) by (namespace, pod) * ignoring (namespace, pod) group_left(label_name) label_replace(kube_pod_labels{job="kube-state-metrics"}, "pod_name", "$1", "pod", "(.*)"))

I know that the `ignoring` modifier excludes the labels listed inside the parentheses from the match. The warning is gone now, but I'm not sure the rules are actually working; I'm still trying to test these alerts.

Can someone validate this for me?
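
A sketch of how the rewrite could be checked in the Prometheus expression browser, assuming the metric and rule names from the rules above. With `ignoring(namespace, pod)`, the left-hand side (already reduced to `namespace` and `pod` by the inner `sum`) has no labels left to match on, while `kube_pod_labels` still carries labels such as `job` and `instance`. The join may therefore return no series at all, which would also make the warning disappear.

```
# Is the "one" side of the original on(namespace, pod) join unique?
# Every row returned is a (namespace, pod) pair with more than one
# kube_pod_labels series.
count by (namespace, pod) (kube_pod_labels{job="kube-state-metrics"}) > 1

# Does the rewritten recording rule produce any series at all?
# An empty result suggests the ignoring() version matches nothing.
count(namespace_name:kube_pod_container_resource_requests_memory_bytes:sum)
```

If the first query returns rows, those pods are the likely source of the many-to-many error in the original `on(namespace, pod)` version.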

metalmatze commented 5 years ago

If this is just a warning that is printed within the first few minutes after restarting Prometheus, then this is expected and nothing to worry about.

rca0 commented 5 years ago

Unfortunately the warning is logged constantly. Should I worry about this?

paskal commented 5 years ago

These are the only warnings I get after start:

caller=manager.go:389 component="rule manager" group=alertmanager.rules msg="Evaluating rule failed" rule="alert: AlertmanagerConfigInconsistent\nexpr: count_values by(service) (\"config_hash\", alertmanager_config_hash{job=\"prometheus-operator-alertmanager\"})\n  / on(service) group_left() label_replace(prometheus_operator_spec_replicas{controller=\"alertmanager\",job=\"prometheus-operator-operator\"},\n  \"service\", \"alertmanager-$1\", \"name\", \"(.*)\") != 1\nfor: 5m\nlabels:\n  severity: critical\nannotations:\n  message: The configuration of the instances of the Alertmanager cluster `{{$labels.service}}`\n    are out of sync.\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
caller=manager.go:389 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:container_cpu_usage_seconds_total:sum_rate\nexpr: sum by(namespace, label_name) (sum by(namespace, pod_name) (rate(container_cpu_usage_seconds_total{container_name!=\"\",image!=\"\",job=\"kubelet\"}[5m]))\n  * on(namespace, pod_name) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n  \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
caller=manager.go:389 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:container_memory_usage_bytes:sum\nexpr: sum by(namespace, label_name) (sum by(pod_name, namespace) (container_memory_usage_bytes{container_name!=\"\",image!=\"\",job=\"kubelet\"})\n  * on(namespace, pod_name) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n  \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
caller=manager.go:389 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:kube_pod_container_resource_requests_cpu_cores:sum\nexpr: sum by(namespace, label_name) (sum by(namespace, pod) (kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"}\n  and on(pod) kube_pod_status_scheduled{condition=\"true\"}) * on(namespace, pod) group_left(label_name)\n  label_replace(kube_pod_labels{job=\"kube-state-metrics\"}, \"pod_name\", \"$1\", \"pod\",\n  \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
caller=manager.go:389 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:kube_pod_container_resource_requests_memory_bytes:sum\nexpr: sum by(namespace, label_name) (sum by(namespace, pod) (kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"})\n  * on(namespace, pod) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n  \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
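
For the `AlertmanagerConfigInconsistent` rule the join key is `service`, so the corresponding uniqueness check would look roughly like this. It is a sketch that reuses the selector from the log above; the `job` value is specific to that setup and would need adjusting elsewhere.

```
# Rows returned here are services with more than one
# prometheus_operator_spec_replicas series on the right-hand side of the join.
count by (service) (
  label_replace(
    prometheus_operator_spec_replicas{controller="alertmanager", job="prometheus-operator-operator"},
    "service", "alertmanager-$1", "name", "(.*)"
  )
) > 1
```
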

faheem-nadeem commented 5 years ago

Seeing a similar recurring warning after upgrading to prometheus-operator release 1.7.0 with Prometheus v2.5.0.

level=warn ts=2019-01-16T15:05:48.970506555Z caller=manager.go:408 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:container_cpu_usage_seconds_total:sum_rate\nexpr: sum by(namespace, label_name) (sum by(namespace, pod_name) (rate(container_cpu_usage_seconds_total{container_name!=\"\",image!=\"\",job=\"kubelet\"}[5m]))\n  * on(namespace, pod_name) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n  \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2019-01-16T15:05:48.974490062Z caller=manager.go:408 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:container_memory_usage_bytes:sum\nexpr: sum by(namespace, label_name) (sum by(pod_name, namespace) (container_memory_usage_bytes{container_name!=\"\",image!=\"\",job=\"kubelet\"})\n  * on(namespace, pod_name) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n  \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2019-01-16T15:05:48.976632336Z caller=manager.go:408 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:kube_pod_container_resource_requests_memory_bytes:sum\nexpr: sum by(namespace, label_name) (sum by(namespace, pod) (kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"})\n  * on(namespace, pod) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n  \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2019-01-16T15:05:48.97984908Z caller=manager.go:408 component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: namespace_name:kube_pod_container_resource_requests_cpu_cores:sum\nexpr: sum by(namespace, label_name) (sum by(namespace, pod) (kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"}\n  and on(pod) kube_pod_status_scheduled{condition=\"true\"}) * on(namespace, pod) group_left(label_name)\n  label_replace(kube_pod_labels{job=\"kube-state-metrics\"}, \"pod_name\", \"$1\", \"pod\",\n  \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"

rca0 commented 5 years ago

Did you find a solution, @faheem-cliqz?

Hashfyre commented 5 years ago

We are seeing the exact same issue in our cluster with the following recording rules provided by kubernetes-mixin (using kube-prometheus):

Versions:

k8s logs -l "prometheus=kube-prometheus" -c prometheus | grep "Evaluating rule failed" | gcut -d' ' -f1,2,3,4,5,6,7,8,9 --complement | sort -u | cut -d":" -f3
node_cpu_saturation_load1
node_memory_utilisation
container_cpu_usage_seconds_total
container_memory_usage_bytes
kube_pod_container_resource_requests_cpu_cores
kube_pod_container_resource_requests_memory_bytes
node_cpu_utilisation
node_disk_saturation
node_disk_utilisation
node_memory_bytes_available
node_memory_bytes_total
node_memory_swap_io_bytes
node_net_saturation
node_net_utilisation
node_num_cpu

All of the above rules have the error:

"many-to-many matching not allowed: matching labels must be unique on one side"

Raw Log

```
rule="record: 'node:node_cpu_saturation_load1:'\nexpr: sum by(node) (node_load1{job=\"node-exporter\"} * on(namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:) / node:node_num_cpu:sum\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: 'node:node_memory_utilisation:'\nexpr: 1 - sum by(node) ((node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"}\n + node_memory_Buffers_bytes{job=\"node-exporter\"}) * on(namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:) / sum by(node) (node_memory_MemTotal_bytes{job=\"node-exporter\"}\n * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: namespace_name:container_cpu_usage_seconds_total:sum_rate\nexpr: sum by(namespace, label_name) (sum by(namespace, pod_name) (rate(container_cpu_usage_seconds_total{container_name!=\"\",image!=\"\",job=\"kubelet\"}[5m]))\n * on(namespace, pod_name) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: namespace_name:container_memory_usage_bytes:sum\nexpr: sum by(namespace, label_name) (sum by(pod_name, namespace) (container_memory_usage_bytes{container_name!=\"\",image!=\"\",job=\"kubelet\"})\n * on(namespace, pod_name) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: namespace_name:kube_pod_container_resource_requests_cpu_cores:sum\nexpr: sum by(namespace, label_name) (sum by(namespace, pod) (kube_pod_container_resource_requests_cpu_cores{job=\"kube-state-metrics\"}\n and on(pod) kube_pod_status_scheduled{condition=\"true\"}) * on(namespace, pod) group_left(label_name)\n label_replace(kube_pod_labels{job=\"kube-state-metrics\"}, \"pod_name\", \"$1\", \"pod\",\n \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: namespace_name:kube_pod_container_resource_requests_memory_bytes:sum\nexpr: sum by(namespace, label_name) (sum by(namespace, pod) (kube_pod_container_resource_requests_memory_bytes{job=\"kube-state-metrics\"})\n * on(namespace, pod) group_left(label_name) label_replace(kube_pod_labels{job=\"kube-state-metrics\"},\n \"pod_name\", \"$1\", \"pod\", \"(.*)\"))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_cpu_utilisation:avg1m\nexpr: 1 - avg by(node) (rate(node_cpu_seconds_total{job=\"node-exporter\",mode=\"idle\"}[1m])\n * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_disk_saturation:avg_irate\nexpr: avg by(node) (irate(node_disk_io_time_weighted_seconds_total{device=~\"(sd|xvd|nvme).+\",job=\"node-exporter\"}[1m])\n / 1000 * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_disk_utilisation:avg_irate\nexpr: avg by(node) (irate(node_disk_io_time_seconds_total{device=~\"(sd|xvd|nvme).+\",job=\"node-exporter\"}[1m])\n * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_memory_bytes_available:sum\nexpr: sum by(node) ((node_memory_MemFree_bytes{job=\"node-exporter\"} + node_memory_Cached_bytes{job=\"node-exporter\"}\n + node_memory_Buffers_bytes{job=\"node-exporter\"}) * on(namespace, pod) group_left(node)\n node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_memory_bytes_total:sum\nexpr: sum by(node) (node_memory_MemTotal_bytes{job=\"node-exporter\"} * on(namespace,\n pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_memory_swap_io_bytes:sum_rate\nexpr: 1000 * sum by(node) ((rate(node_vmstat_pgpgin{job=\"node-exporter\"}[1m]) + rate(node_vmstat_pgpgout{job=\"node-exporter\"}[1m]))\n * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_net_saturation:sum_irate\nexpr: sum by(node) ((irate(node_network_receive_drop_total{device=\"eth0\",job=\"node-exporter\"}[1m])\n + irate(node_network_transmit_drop_total{device=\"eth0\",job=\"node-exporter\"}[1m]))\n * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_net_utilisation:sum_irate\nexpr: sum by(node) ((irate(node_network_receive_bytes_total{device=\"eth0\",job=\"node-exporter\"}[1m])\n + irate(node_network_transmit_bytes_total{device=\"eth0\",job=\"node-exporter\"}[1m]))\n * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:)\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
rule="record: node:node_num_cpu:sum\nexpr: count by(node) (sum by(node, cpu) (node_cpu_seconds_total{job=\"node-exporter\"}\n * on(namespace, pod) group_left(node) node_namespace_pod:kube_pod_info:))\n" err="many-to-many matching not allowed: matching labels must be unique on one side"
```

fritchie commented 5 years ago

I am also experiencing this error on some clusters.

Has anyone found a way to pinpoint which pods are causing the error? Increasing the Prometheus log level to debug doesn't seem to help.
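
One possible way to narrow this down is to query for duplicates on the "one" side of the join directly, instead of relying on the Prometheus logs. This is a sketch that assumes the recorded series `node_namespace_pod:kube_pod_info:` used by the failing node-level rules; `<namespace>` and `<pod>` are placeholders.

```
# Pods that occur more than once on the right-hand side of the
# on(namespace, pod) group_left(node) joins.
count by (namespace, pod) (node_namespace_pod:kube_pod_info:) > 1

# Inspect the raw series for one of the reported pods and compare
# their labels (job, instance, node, ...) to see where the copies differ.
kube_pod_info{namespace="<namespace>", pod="<pod>"}
```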

fritchie commented 5 years ago

FWIW, my issue was due to Prometheus discovering redundant services in another namespace.
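
A rough check for this kind of duplication, assuming kube-state-metrics exposes `kube_pod_info` and that targets carry a `service` label (as is typical with Prometheus Operator service discovery; without it the label is simply empty):

```
# Number of kube_pod_info series per scrape target.
# More than one row can indicate that kube-state-metrics is being
# scraped through several services or endpoints.
count by (job, service, instance) (kube_pod_info)
```

If several rows show up, each pod is likely present once per target, which would break the uniqueness that `group_left` requires on the right-hand side.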

awangc commented 4 years ago

Any updates on this issue?

napbiotec commented 1 year ago

I'm getting the same issue when deploying "kube-prometheus-stack" version 48.3.1 using Helm on Google Kubernetes Engine (GKE).