kubernetes-sigs / prometheus-adapter

An implementation of the custom.metrics.k8s.io API using Prometheus
Apache License 2.0

Prometheus Adapter can't connect to Prometheus: "connection reset by peer" error #607

Open robin-coac opened 1 year ago

robin-coac commented 1 year ago

Problem Statement

Hi everyone, I had a working custom-autoscaling setup with Prometheus Adapter. However, it suddenly stopped working for no reason I can identify. After struggling for two days, I am totally out of ideas.

The required custom metric is being scraped by Prometheus, as I can see in the Prometheus UI.

But kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 returns an empty resource list:

{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"custom.metrics.k8s.io/v1beta1","resources":[]}

This is the error log in Prometheus Adapter:

E0905 13:29:57.806549       1 provider.go:229] unable to update list of all metrics: unable to fetch metrics for query "{namespace!=\"\", service!=\"\" }": Get "http://prometheus-server-kube-pro-prometheus.my_ns.svc:9090/api/v1/series?match%5B%5D=%7Bnamespace%21%3D%22%22%2C+service%21%3D%22%22+%7D&start=1693920537.802": read tcp 10.42.1.248:46672->10.43.131.216:9090: read: connection reset by peer
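To make the failing request easier to read, here is a minimal sketch that URL-decodes the request from the log line above using only the Python standard library. The function name `decode_series_request` is hypothetical and is not part of prometheus-adapter; the URL is copied verbatim from the log.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# URL copied verbatim from the adapter error log.
LOGGED_URL = (
    "http://prometheus-server-kube-pro-prometheus.my_ns.svc:9090/api/v1/series"
    "?match%5B%5D=%7Bnamespace%21%3D%22%22%2C+service%21%3D%22%22+%7D"
    "&start=1693920537.802"
)


def decode_series_request(url):
    """Split a Prometheus /api/v1/series URL into human-readable parts."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)  # percent-decodes and turns '+' into spaces
    return {
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "selector": params["match[]"][0],
        "start_utc": datetime.fromtimestamp(
            float(params["start"][0]), tz=timezone.utc
        ),
    }


if __name__ == "__main__":
    for key, value in decode_series_request(LOGGED_URL).items():
        print(f"{key}: {value}")
```

The decoded selector is `{namespace!="", service!="" }`, with no metric name in front of the braces, and the host ends in `.my_ns.svc` rather than `.svc.cluster.local`. Both differ from the values file posted in this issue, so it may be worth checking whether the running adapter is actually using that configuration. This is only an observation from the log, not a confirmed diagnosis.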

Prometheus server installation

helm upgrade --install -n my_ns prometheus-server prometheus-community/kube-prometheus-stack -f https://raw.githubusercontent.com/kyma-project/examples/main/prometheus/values.yaml --set grafana.enabled=false

Prometheus Adapter installation

helm upgrade --install --namespace my_ns prometheus-adapter prometheus-community/prometheus-adapter -f ./prometheus-adapter-values.yaml

values.yaml for Prometheus Adapter

prometheus:
  url: http://prometheus-server-kube-pro-prometheus.my_ns.svc.cluster.local
  port: 9090
  path: ""

rules:
  default: false 

  custom: 
    - seriesQuery: 'jetstream_stream_total_messages{namespace!="", service!="" }'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          service: {resource: "service"}
      name:
        matches: "jetstream_stream_total_messages"
        as: "jetstream_stream_total_messages" ##
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>, stream_name="sample-stream" }) by (<<.GroupBy>>)
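As a cross-check, the sketch below builds the series-discovery URL one would expect from these values, so it can be compared by eye with the URL in the adapter's error log. It assumes the Helm chart joins `prometheus.url`, `prometheus.port`, and `prometheus.path` into a single base URL and that the adapter passes `seriesQuery` as the `match[]` parameter of `/api/v1/series`. The helper name `expected_series_url` and the `start` value are illustrative only.

```python
from urllib.parse import urlencode

# Values copied from the prometheus-adapter values.yaml above.
PROMETHEUS = {
    "url": "http://prometheus-server-kube-pro-prometheus.my_ns.svc.cluster.local",
    "port": 9090,
    "path": "",
}
SERIES_QUERY = 'jetstream_stream_total_messages{namespace!="", service!="" }'


def expected_series_url(prom, selector, start):
    """Build the /api/v1/series URL implied by the values (assumed layout)."""
    base = f"{prom['url']}:{prom['port']}{prom['path']}"
    # urlencode uses quote_plus, so spaces become '+' as in the adapter log.
    query = urlencode({"match[]": selector, "start": start})
    return f"{base}/api/v1/series?{query}"


if __name__ == "__main__":
    # The start timestamp is an arbitrary example value.
    print(expected_series_url(PROMETHEUS, SERIES_QUERY, "1693920537.802"))
```

If the URL in the adapter log does not match this shape (for example a different host suffix, or a selector without the metric name), the adapter pod may have been started from a different ConfigMap or release than the one described here.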

Output of kubectl get svc -n my_ns

nats-exporter-prometheus-nats-exporter       ClusterIP   10.43.160.239   <none>        80/TCP                       7h41m
prometheus-server-kube-state-metrics         ClusterIP   10.43.154.79    <none>        8080/TCP                     7h18m
prometheus-server-kube-pro-alertmanager      ClusterIP   10.43.71.250    <none>        9093/TCP,8080/TCP            7h18m
prometheus-server-kube-pro-operator          ClusterIP   10.43.107.51    <none>        443/TCP                      7h18m
prometheus-server-prometheus-node-exporter   ClusterIP   10.43.216.251   <none>        9101/TCP                     7h18m
prometheus-server-kube-pro-prometheus        ClusterIP   10.43.131.216   <none>        9090/TCP,8080/TCP            7h18m
prometheus-operated                          ClusterIP   None            <none>        9090/TCP                     7h18m
prometheus-adapter                           ClusterIP   10.43.133.163   <none>        443/TCP                      120m

dgrisonnet commented 1 year ago

/triage accepted
/remove-kind bug
/kind support
/assign

k8s-triage-robot commented 1 week ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted