prometheus-community / helm-charts

Prometheus community Helm charts

[kube-prometheus-stack] PrometheusRuleFailures #192

Closed: TJM closed this issue 3 years ago

TJM commented 4 years ago

Describe the bug: After upgrading to 9.4.10 I am seeing PrometheusRuleFailures:


{endpoint="web",instance="10.244.5.51:9090",job="kube-prometheus-stack-prometheus",namespace="monitoring",pod="prometheus-kube-prometheus-stack-prometheus-0",rule_group="/etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0/monitoring-kube-prometheus-stack-kubelet.rules.yaml;kubelet.rules",service="kube-prometheus-stack-prometheus"} | 30
-- | --
{endpoint="web",instance="10.244.5.51:9090",job="kube-prometheus-stack-prometheus",namespace="monitoring",pod="prometheus-kube-prometheus-stack-prometheus-0",rule_group="/etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0/monitoring-kube-prometheus-stack-kubernetes-system-kubelet.yaml;kubernetes-system-kubelet",service="kube-prometheus-stack-prometheus"}

Version of Helm and Kubernetes:

Helm Version:

$ helm version
version.BuildInfo{Version:"v3.2.4", GitCommit:"0ad800ef43d3b826f31a5ad8dfbb4fe05d143688", GitTreeState:"clean", GoVersion:"go1.13.12"}

Kubernetes Version:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:58:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T09:15:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Which chart: kube-prometheus-stack
Which version of the chart: 9.4.10

What happened: After the upgrade, we have new PrometheusRuleFailures alerts that probably aren't supposed to be there.

What you expected to happen: No PrometheusRuleFailures.

How to reproduce it (as minimally and precisely as possible): Upgrade to 9.4.10 with the values below:

alertmanager:
  config:
    route:
      group_by: [alertname]
      group_wait: 15s
      group_interval: 1m
      repeat_interval: 1h
      receiver: gchat-notify
      routes:
      - receiver: gchat-notify
        match_re:
          severity: info|warning|critical
      # - receiver: hpomi-notify
      #   match_re:
      #     alertname: "(PrometheusDown|KubeAPIDown|KubeControllerManagerDown|KubeSchedulerDown|etcdNoLeader|etcdInsufficientMembers|SystemNodeOutOfRequestedCPUResourcesCritical|SystemNodeOutOfMemoryResourcesCritical)"
      - receiver: opsgenie-notify
        match:
          alertname: TestOpsGenie
      - receiver: chat-dev
        group_by:
        - alertname
        # - profile
        - severity
        match_re:
          # profile: ^(?:(dev))$
          severity: ^(?:^(INFO|WARNING|CRITICAL)$)$
        continue: true
      - receiver: blackhole
        match:
          alertname: Watchdog
      - receiver: blackhole
        match:
          alertname: etcdHighNumberOfFailedGRPCRequests

    receivers:
    - name: gchat-notify
      webhook_configs:
        - url: "http://calert:6000/create?room_name=Kubernetes-Alerts-NonProd"
          send_resolved: true

    - name: chat-dev
      webhook_configs:
        - url: "http://calert:6000/create?room_name=kubernetes-cwow-test"
          send_resolved: true

    - name: blackhole

  ingress:
    enabled: true
    hosts:
      - alertmanager-den3test.company.com
    paths:
      - /
  alertmanagerSpec:
    replicas: 1
    # storage:
    #   volumeClaimTemplate:
    #     spec:
    #       accessModes: ["ReadWriteOnce"]
    #       resources:
    #         requests:
    #           storage: 10Gi

grafana:
  #enabled: false
  ingress:
    enabled: true
    hosts:
      - grafana-den3test.company.com
  grafana.ini:
    auth.anonymous:
      enabled: true
      org_name: Main Org.
      org_role: Viewer

kubeEtcd:
  # Using etcd as a pod, should not need endpoints.
  # endpoints:
  #   - 10.9.25.178
  #   - 10.9.25.179
  #   - 10.9.25.180
  serviceMonitor:
    scheme: https
    caFile:   /etc/prometheus/secrets/etcd-certs/ca.crt
    certFile: /etc/prometheus/secrets/etcd-certs/client.crt
    keyFile:  /etc/prometheus/secrets/etcd-certs/client.key

prometheus:
  ingress:
    enabled: true
    hosts:
      - prometheus-den3test.company.com
    paths:
      - /
  prometheusSpec:
    replicas: 1
    externalLabels:
      cluster: den3test
    secrets:
      - etcd-certs
    # storageSpec:
    #   volumeClaimTemplate:
    #     spec:
    #       storageClassName: vsan-default
    #       accessModes: ["ReadWriteOnce"]
    #       resources:
    #         requests:
    #           storage: 100Gi
  server:
    global:
      scrape_timeout: 30s

prometheus-node-exporter:
  nodeSelector:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux

Anything else we need to know:

We were not seeing these on 9.3.4.

Strangerxxx commented 4 years ago

The same happened on 9.4.4 and it still reproduces on 10.0.2.

TJM commented 4 years ago

Prometheus is constantly outputting the following errors:

level=warn ts=2020-10-13T15:25:49.854Z caller=manager.go:577 component="rule manager" group=kubelet.rules msg="Evaluating rule failed" rule="record: node_quantile:kubelet_pleg_relist_duration_seconds:histogram_quantile\nexpr: histogram_quantile(0.5, sum by(instance, le) (rate(kubelet_pleg_relist_duration_seconds_bucket[5m]))\n  * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\",metrics_path=\"/metrics\"})\nlabels:\n  quantile: \"0.5\"\n" err="found duplicate series for the match group {instance=\"10.9.25.189:10250\"} on the right hand-side of the operation: [{__name__=\"kubelet_node_name\", endpoint=\"https-metrics\", instance=\"10.9.25.189:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"kube-system\", node=\"den3l5kubew05.company.corp\", service=\"kube-prometheus-stack-kubelet\"}, {__name__=\"kubelet_node_name\", endpoint=\"https-metrics\", instance=\"10.9.25.189:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"kube-system\", node=\"den3l5kubew05.company.corp\", service=\"prometheus-operator-kubelet\"}];many-to-many matching not allowed: matching labels must be unique on one side"
level=warn ts=2020-10-13T15:25:53.291Z caller=manager.go:577 component="rule manager" group=kubernetes-system-kubelet msg="Evaluating rule failed" rule="alert: KubeletPodStartUpLatencyHigh\nexpr: histogram_quantile(0.99, sum by(instance, le) (rate(kubelet_pod_worker_duration_seconds_bucket{job=\"kubelet\",metrics_path=\"/metrics\"}[5m])))\n  * on(instance) group_left(node) kubelet_node_name{job=\"kubelet\",metrics_path=\"/metrics\"}\n  > 60\nfor: 15m\nlabels:\n  severity: warning\nannotations:\n  message: Kubelet Pod startup 99th percentile latency is {{ $value }} seconds on\n    node {{ $labels.node }}.\n  runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeletpodstartuplatencyhigh\n" err="found duplicate series for the match group {instance=\"10.9.25.189:10250\"} on the right hand-side of the operation: [{__name__=\"kubelet_node_name\", endpoint=\"https-metrics\", instance=\"10.9.25.189:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"kube-system\", node=\"den3l5kubew05.company.corp\", service=\"kube-prometheus-stack-kubelet\"}, {__name__=\"kubelet_node_name\", endpoint=\"https-metrics\", instance=\"10.9.25.189:10250\", job=\"kubelet\", metrics_path=\"/metrics\", namespace=\"kube-system\", node=\"den3l5kubew05.company.corp\", service=\"prometheus-operator-kubelet\"}];many-to-many matching not allowed: matching labels must be unique on one side"

Or specifically:

many-to-many matching not allowed: matching labels must be unique on one side

Perhaps that helps?
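
A query along these lines (just a sketch; adjust the matchers to your environment) should confirm whether more than one kubelet_node_name series exists per instance, which is exactly what the many-to-many error is complaining about:

count by (instance) (kubelet_node_name{job="kubelet", metrics_path="/metrics"}) > 1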

wi1dcard commented 4 years ago

Possible workaround: go to the kube-system namespace, check if there are multiple services named *-kube-prometheus-stack-kubelet or *-prometheus-operator-kubelet, and remove the unnecessary ones:

$ kubectl get service | grep kubelet
prom-kube-prometheus-stack-kubelet      ClusterIP   None           <none>        10250/TCP,10255/TCP,4194/TCP   91m
prometheus-operator-kubelet             ClusterIP   None           <none>        10250/TCP,10255/TCP,4194/TCP   104d

$ kubectl delete service prometheus-operator-kubelet
service "prometheus-operator-kubelet" deleted

How it works:

I got this issue after I migrated from the stable/prometheus-operator chart. It seems that Helm didn't remove the services it had installed in the kube-system namespace when I uninstalled the deprecated chart, so the ServiceMonitor collects the same metrics from multiple services that share the same endpoints. Some of the Prometheus recording rules, for example:

histogram_quantile(0.9,
  sum by(instance, le) (rate(kubelet_pleg_relist_duration_seconds_bucket[5m])) * on(instance)
  group_left(node) kubelet_node_name{job="kubelet",metrics_path="/metrics"})

require exactly one kubelet_node_name{job="kubelet",metrics_path="/metrics"} series per instance. Therefore, all I had to do was delete the redundant services.

I'm not sure whether this is a bug in Helm, but the workaround fixed it for me in this specific context.
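
To see which leftover Service objects the duplicates come from before deleting anything, a grouping query like this (again, only a sketch) lists every service label that currently exposes kubelet_node_name:

count by (service) (kubelet_node_name{job="kubelet", metrics_path="/metrics"})

Any result beyond the single expected kubelet service points at a stale Service to remove.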

TJM commented 4 years ago

You got it! We had about 5 different ones. There are multiple operators installed on this cluster because we also have Prometheus monitoring applications outside of this stack, and there were so many jobs that it was bogging down Prometheus, so we split it up into multiple Prometheus instances (or is that prometheii?). Anyhow, the initial installations may have had some "indentation" issues when they tried to disable all the extra stuff to get just Prometheus/Grafana, and they left behind some services.

So, I do see it mentioned in the docs: https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#migrating-from-stableprometheus-operator-chart

  1. Why does it create a service that it doesn't remove? (Is that fixed in future versions?)
  2. Why doesn't it use a "template" (?) to select the service it is looking for instead of a wildcard?

Tommy

crosbymichael1 commented 3 years ago

I saw this same issue as well; thanks for the great explanation, @wi1dcard.

hdhruna commented 3 years ago

@wi1dcard did you retain the chart name by any chance while migrating to the new version?

wi1dcard commented 3 years ago

> @wi1dcard did you retain the chart name by any chance while migrating to the new version?

No, I didn't. Is the Helm release name related to the issue?

mpanthofer commented 3 years ago

> 1. Why does it create a service that it doesn't remove? (Is that fixed in future versions?)
> 2. Why doesn't it use a "template" (?) to select the service it is looking for instead of a wildcard?

~~Isn't the point of the templates + values to properly configure the services? Why then do we have to delete anything after startup? Can't we configure the chart to start with the necessary services so the queries return accurate results out of the box?~~

Update: my problem was caused by VMware Tanzu Mission Control being installed in the same cluster. The chart works just fine out of the box.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

TJM commented 3 years ago

I think we are still waiting on...

  1. Why does it create a service that it doesn't remove? (Is that fixed in future versions?)
  2. Why doesn't it use a "template" (?) to select the service it is looking for instead of a wildcard?

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

gboor commented 3 years ago

I also have this problem. I checked for duplicate services but cannot find any. I looked at the rule definitions, ran some Prometheus queries, and found this:

prometheus_rule_evaluation_failures_total{container="prometheus", endpoint="web", instance="10.0.1.72:9090", job="kube-prometheus-stack-prometheus", namespace="monitoring", pod="prometheus-kube-prometheus-stack-prometheus-0", rule_group="/etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0/monitoring-kube-prometheus-stack-kube-apiserver-availability.rules.yaml;kube-apiserver-availability.rules", service="kube-prometheus-stack-prometheus"}  113
prometheus_rule_evaluation_failures_total{container="prometheus", endpoint="web", instance="10.0.1.72:9090", job="kube-prometheus-stack-prometheus", namespace="monitoring", pod="prometheus-kube-prometheus-stack-prometheus-0", rule_group="/etc/prometheus/rules/prometheus-kube-prometheus-stack-prometheus-rulefiles-0/monitoring-kube-prometheus-stack-kube-apiserver.rules.yaml;kube-apiserver.rules", service="kube-prometheus-stack-prometheus"}  134

It's those two that keep increasing, occasionally triggering the alert. Is this the same thing, or something completely different? It happens on two GKE clusters running the latest version of kube-prometheus-stack.
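
If there are no duplicate services, the Prometheus log usually shows the exact expression and error behind each failure. Something along these lines (a sketch, reusing the namespace, pod, and container names from the labels above) should surface it:

$ kubectl -n monitoring logs prometheus-kube-prometheus-stack-prometheus-0 -c prometheus | grep "Evaluating rule failed"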

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale[bot] commented 3 years ago

This issue is being automatically closed due to inactivity.

PeteMac88 commented 3 years ago

Any news here? We have the same problem in an AWS EKS cluster. I cannot find any duplicate service but the alert is still firing.

holooloo commented 3 years ago

I have this error in every Kubernetes installation I have.

WesleyKlop commented 2 years ago

I think I have managed to fix it by setting prometheusOperator.kubeletService.enabled to false in the values.yaml.

For example: https://github.com/WesleyKlop/infrastructure/commit/63ca4e667a6a3dfce20029aaeede2fa0e98f3d39.

You might also be able to fix it by setting prometheusOperator.kubeletService.name to just kubelet, since the other service that also says it is managed by the Prometheus operator has that name. (I did not test that.)
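
For reference, that would look roughly like this in values.yaml (untested sketch; use one of the two options, not both):

prometheusOperator:
  kubeletService:
    enabled: false
    # alternatively, keep it enabled and reuse the name the other operator already manages:
    # name: kubelet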