kubernetes-sigs / prometheus-adapter

An implementation of the custom.metrics.k8s.io API using Prometheus
Apache License 2.0

Issue fetching external metric #642

Open tusharInferQ opened 6 months ago

tusharInferQ commented 6 months ago

What happened?:

I'm trying to implement HPA with a custom metric from my application. I can query the metric via curl and see it in the Prometheus UI. However, when I describe my HPA, it reports the following errors:

```
Warning  FailedComputeMetricsReplicas  18m (x12 over 21m)             horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning  FailedGetExternalMetric       94s (x81 over 21m)             horizontal-pod-autoscaler  unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning  FailedGetExternalMetric       <invalid> (x2 over <invalid>)  horizontal-pod-autoscaler  unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning  FailedComputeMetricsReplicas  <invalid> (x2 over <invalid>)  horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
```
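The "could not find the requested resource" message suggests the request never reaches a registered external.metrics.k8s.io API. One way to check whether that API group is registered and backed by the adapter at all (assuming kubectl access to the cluster; the APIService name below is the conventional one):

```
# Is an APIService registered for the external metrics API, and is it Available?
kubectl get apiservice v1beta1.external.metrics.k8s.io
```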

What did you expect to happen?:

I expected my HPA to read the current value of active_connections_total (my external metric) and scale pods up or down accordingly.

Please provide the prometheus-adapter config:

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: prometheus
data:
  config.yaml: |-
    rules:
    - seriesQuery: |
        active_connections_total
      resources:
        template: pod
      name:
        matches: "^(.*)_total"
        as: "$1"
      metricsQuery: |
        sum by (app) (
          active_connections_total{app="eamm"}
        )
```
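For reference, metrics served through the external metrics API are configured under the adapter's `externalRules` section, while the rule above sits under `rules`, which feeds custom.metrics.k8s.io. A minimal sketch of what an external-metrics rule for this series could look like (only the `config.yaml` fragment; the namespace override and query template are assumptions to be adapted to the actual deployment, not the author's config):

```
externalRules:
- seriesQuery: 'active_connections_total{app="eamm"}'
  resources:
    # External metrics are namespaced; map the namespace label onto the namespace resource.
    overrides:
      namespace: {resource: "namespace"}
  name:
    # The original rule's `as: "$1"` would expose the metric as "active_connections",
    # which would not match the HPA's "active_connections_total"; keep the name unchanged.
    matches: "^active_connections_total$"
    as: "active_connections_total"
  metricsQuery: 'sum by (app) (active_connections_total{app="eamm",<<.LabelMatchers>>})'
```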

Please provide the HPA resource used for autoscaling:

HPA yaml

```
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-connection-based
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: eamm-deployment-v1
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: External
    external:
      metric:
        name: active_connections_total
      target:
        type: Value
        averageValue: 1
```
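Separately from the API error: in autoscaling/v2, a metric target of `type: Value` is paired with `value`, while `averageValue` pairs with `type: AverageValue`. A small sketch of the two consistent variants, keeping the number from the HPA above (which of the two is intended here is an assumption):

```
      target:
        type: AverageValue   # metric value averaged across the pods of the scale target
        averageValue: "1"
      # or, for an absolute target:
      # target:
      #   type: Value
      #   value: "1"
```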

Please provide the HPA status:

```
Warning  FailedComputeMetricsReplicas  18m (x12 over 21m)             horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning  FailedGetExternalMetric       94s (x81 over 21m)             horizontal-pod-autoscaler  unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning  FailedGetExternalMetric       <invalid> (x2 over <invalid>)  horizontal-pod-autoscaler  unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
Warning  FailedComputeMetricsReplicas  <invalid> (x2 over <invalid>)  horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get active_connections_total external metric value: failed to get active_connections_total external metric: unable to get external metric default/active_connections_total/nil: unable to fetch metrics from external metrics API: the server could not find the requested resource (get active_connections_total.external.metrics.k8s.io)
```

Please provide the prometheus-adapter logs with -v=6 around the time the issue happened:

prometheus-adapter logs

```
I0220 09:37:35.228067 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="570.664µs" userAgent="Go-http-client/2.0" audit-ID="8c6994a7-d7dc-48bc-9b96-247bb5d3afb3" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228124 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="713.014µs" userAgent="Go-http-client/2.0" audit-ID="87e85bdb-0ede-46e6-8d70-bdd792ba15b4" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228140 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="756.121µs" userAgent="Go-http-client/2.0" audit-ID="51d7804b-673d-44a4-bf8b-23c0753943fc" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228140 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="732.822µs" userAgent="Go-http-client/2.0" audit-ID="b41a1e37-88d9-4295-9f60-7959c1ef15f2" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:35.228279 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="659.56µs" userAgent="Go-http-client/2.0" audit-ID="886ff402-e061-4477-adfa-21661f9d4108" srcIP="192.168.149.219:50060" resp=200
I0220 09:37:38.178959 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="182.979µs" userAgent="kube-probe/1.29+" audit-ID="22264f9a-4fe7-497f-94aa-2bbc8f7a7610" srcIP="192.168.58.94:53526" resp=200
I0220 09:37:38.179392 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="164.012µs" userAgent="kube-probe/1.29+" audit-ID="b5e922a1-9851-4154-a479-992efd76b100" srcIP="192.168.58.94:53524" resp=200
I0220 09:37:40.503400 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="502.409µs" userAgent="Go-http-client/2.0" audit-ID="6caf0e14-77f1-4b5b-9020-9c52b59d55f0" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.503638 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="524.354µs" userAgent="Go-http-client/2.0" audit-ID="f69db820-2a6c-45d7-b6c3-f1fb9732cc67" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.503712 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="596.033µs" userAgent="Go-http-client/2.0" audit-ID="4b7b0cf3-7279-4642-a071-d26482c1fdd8" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.503980 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="603.228µs" userAgent="Go-http-client/2.0" audit-ID="701b8235-e921-405d-a2e6-e77cfc501435" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:40.504956 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1" latency="479.687µs" userAgent="Go-http-client/2.0" audit-ID="f00a427d-3742-4ad2-b3a4-d1a5e40f23f0" srcIP="192.168.182.198:59732" resp=200
I0220 09:37:48.179353 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="183.054µs" userAgent="kube-probe/1.29+" audit-ID="91b0c6be-d1e0-4217-b146-67d558c7c963" srcIP="192.168.58.94:38734" resp=200
I0220 09:37:48.179771 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="144.857µs" userAgent="kube-probe/1.29+" audit-ID="274e8954-ba95-4b84-a52a-b4307c60b291" srcIP="192.168.58.94:38736" resp=200
I0220 09:37:49.783147 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/http_requests?labelSelector=app%3Dsample-app" latency="12.029261ms" userAgent="kube-controller-manager/v1.29.0 (linux/amd64) kubernetes/787475c/system:serviceaccount:kube-system:horizontal-pod-autoscaler" audit-ID="c56c906f-224f-4d0b-9e72-b3ce12d7e816" srcIP="192.168.149.219:50062" resp=404
I0220 09:37:58.178718 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="187.6µs" userAgent="kube-probe/1.29+" audit-ID="20f67fcb-8295-4d62-a8d8-42def158fc4c" srcIP="192.168.58.94:53796" resp=200
I0220 09:37:58.179198 1 httplog.go:132] "HTTP" verb="GET" URI="/healthz" latency="225.774µs" userAgent="kube-probe/1.29+" audit-ID="7998a065-1836-4983-b792-557af3cbddf7" srcIP="192.168.58.94:53794" resp=200
```
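These logs only show discovery requests against /apis/custom.metrics.k8s.io/v1beta1 (plus one 404 for a pods/http_requests custom-metric query) and nothing on the external metrics path, which is consistent with the HPA error. One way to probe the adapter directly (assuming kubectl and jq are available; the namespace and metric name are taken from the HPA above):

```
# List what the adapter exposes on the external metrics API; an error or empty
# resource list here would match the HPA's failure.
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq .

# Query the specific metric the HPA asks for, in the HPA's namespace.
kubectl get --raw \
  "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/active_connections_total" | jq .
```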

Anything else we need to know?:

Environment:

dashpole commented 6 months ago

/assign @dgrisonnet
/triage accepted