kubernetes-sigs / prometheus-adapter

An implementation of the custom.metrics.k8s.io API using Prometheus
Apache License 2.0

external metrics - how to use? #219

Closed: alex-sainer closed this issue 3 years ago

alex-sainer commented 5 years ago

I have the following rule definition:

rules:
  default: "false"

  external:
    - seriesQuery: 'rabbitmq_queue_messages_published_total{queue=~".*", vhost=~".*"}'
      resources:
        template: rabbit_<<.Group>>_<<.Resource>>
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)

and querying the metrics API gives me the following:

# kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "rabbitmq_queue_messages_published_per_second",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}

But how can I debug / test this?

I'm trying to get separate metrics for every queue / vhost combination, but I have absolutely no idea how to query those values...

I've tried this query: kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/rabbitmq_queue_messages_published_per_second" | jq .

but it only returns an empty list:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/%2A/rabbitmq_queue_messages_published_per_second"
  },
  "items": []
}

It would be great if anyone could help me find the correct way to get this working :)

wanghaokk commented 5 years ago

The seriesQuery and resources.template might be misconfigured. Could you provide more prometheus-adapter logs?

bzon commented 5 years ago

@alex-sainer I want to share an example config that worked for me, but I'm using custom metrics since my RabbitMQ is running inside Kubernetes. You could probably do the same by creating a Kubernetes Service of type ExternalName.

Rules:

    - seriesQuery: '{__name__=~"^rabbitmq_.*"}'
      seriesFilters:
      - is: ^rabbitmq_queue_messages$
      resources:
        overrides:
          namespace:
            resource: namespace
          service:
            resource: service
          pod:
            resource: pod
      name:
        matches: "^(.*)_messages$"
        as: "rabbitmq_queue_messages_foo"
      metricsQuery: 'rabbitmq_queue_messages{queue="foo"}'

Testing:

$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/production/services/*/rabbitmq_queue_messages_foo" | jq '.'

{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/production/services/%2A/rabbitmq_queue_messages_jobs"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Service",
        "namespace": "production",
        "name": "rabbitmqs",
        "apiVersion": "/v1"
      },
      "metricName": "rabbitmq_queue_messages_foo",
      "timestamp": "2019-07-15T14:09:28Z",
      "value": "105"
    }
  ]
}

Here is my sample HPA:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
  namespace: production
spec:
  maxReplicas: 120
  minReplicas: 30
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hpa-worker-foo
  metrics:
  - type: Object
    object:
      metricName: rabbitmq_queue_messages_foo
      target:
        apiVersion: v1
        kind: Service
        name: rabbitmqs # This is my rabbitmq service name
      targetValue: 2000

RabbitMQ is exposed as a headless ClusterIP Service:

$ kubectl get svc

NAME                                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                          AGE
rabbitmqs                                  ClusterIP   None            <none>        25672/TCP,15672/TCP,5672/TCP,9090/TCP,4369/TCP   54d
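
For reference, a headless Service like the one listed above might look roughly like this; the selector and port names are illustrative and not taken from bzon's actual setup:

apiVersion: v1
kind: Service
metadata:
  name: rabbitmqs
spec:
  clusterIP: None          # headless: no virtual IP, DNS resolves directly to the pod IPs
  selector:
    app: rabbitmq          # illustrative selector
  ports:
    - name: clustering
      port: 25672
    - name: management
      port: 15672
    - name: amqp
      port: 5672
    - name: metrics
      port: 9090
    - name: epmd
      port: 4369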

s-urbaniak commented 5 years ago

@alex-sainer Checking the logs of the prometheus-adapter is the way to go to debug the actual query executed against Prometheus. If you bump the verbosity to --v=6, you will see output similar to:

I0716 09:22:51.553029       1 api.go:74] GET http://prometheus-k8s.monitoring.svc:9090/api/v1/query?query=sum%28node%3Anode_memory_bytes_total%3Asum%7Bnode%3D%22kind-control-plane%22%7D+-+node%3Anode_memory_bytes_available%3Asum%7Bnode%3D%22kind-control-plane%22%7D%29+by+%28node%29&time=1563268971.551 200 OK

This is just an example for resource metrics. You can URL-decode the query and debug it in the Prometheus UI yourself. Based on this you can start tweaking the templates and work out what is wrong from there.
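
If the adapter was deployed without Helm, the verbosity flag goes on the adapter container itself. A minimal sketch of such a Deployment, not a complete production manifest; the image tag, names, Prometheus URL, and config path are illustrative and must match your installation:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-adapter
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-adapter
  template:
    metadata:
      labels:
        app: prometheus-adapter
    spec:
      containers:
        - name: prometheus-adapter
          image: directxman12/k8s-prometheus-adapter:v0.7.0   # illustrative image tag
          args:
            - --prometheus-url=http://prometheus-k8s.monitoring.svc:9090
            - --config=/etc/adapter/config.yaml
            - --v=6   # raise log verbosity so each generated Prometheus query is logged
          volumeMounts:
            - name: config
              mountPath: /etc/adapter
      volumes:
        - name: config
          configMap:
            name: adapter-config    # illustrative; see the ConfigMap sketch further down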

alex-sainer commented 5 years ago

thanks all!!!

Got the metrics working with the following series query config :D

    - seriesQuery: '{__name__=~"^rabbitmq_queue_messages_published_total"}'
      resources:
        overrides:
          namespace:
            resource: namespace
          service:
            resource: service
          pod:
            resource: pod
      name:
        matches: "^(.*)_total$"
        as: "rabbitmq_queue_message_rate"
      metricsQuery: 'sum(rate(rabbitmq_queue_messages_published_total{queue=~".*", vhost=~".*"} [5m])) by (vhost, queue)'

and I'm getting the following metrics from the API (there are more items, but I think one should be enough):

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/rabbitmq_queue_message_rate"
  },
  "items": [
    {
      "metricName": "rabbitmq_queue_message_rate",
      "metricLabels": {
        "queue": "image_renderer_rpc",
        "vhost": "shop"
      },
      "timestamp": "2019-07-16T12:09:32Z",
      "value": "900m"
    },
    {...},{...},{...}
  ]
}
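
With one series per (vhost, queue) combination like this, an HPA can pick a single combination via a metricSelector. A sketch using the autoscaling/v2beta1 External metric type; the Deployment name, namespace, replica bounds, and target value are illustrative:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: image-renderer-hpa        # illustrative
  namespace: default
spec:
  minReplicas: 1
  maxReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: image-renderer          # illustrative workload
  metrics:
    - type: External
      external:
        metricName: rabbitmq_queue_message_rate
        metricSelector:
          matchLabels:
            vhost: shop
            queue: image_renderer_rpc
        targetAverageValue: "10"  # illustrative: messages per second, averaged over replicas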

Do I need the resource overrides in the series query?

s-urbaniak commented 5 years ago

Do I need the resource overrides in the series query?

I would recommend doing so, yes. Despite being repetitive, it makes it very explicit which resources you are mapping.

dragonsmith commented 5 years ago

Hey!

I'm confused by external metrics too: they are labeled as namespaced, but my prometheus-adapter sends a Prometheus query that lacks the namespace label.

Prometheus adapter config:

      external:
        - seriesQuery: '{__name__="jobs_waiting_count"}'
          resources:
            overrides:
              namespace:
                resource: namespace
          metricsQuery: min(sum(jobs_waiting_count{<<.LabelMatchers>>}) by (pod))

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/default/jobs_waiting_count | jq

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/jobs_waiting_count"
  },
  "items": [
    {
      "metricName": "",
      "metricLabels": {},
      "timestamp": "2019-10-11T23:18:00Z",
      "value": "0"
    }
  ]
}

Prometheus-adapter logs:

GET http://prometheus-operated.monitoring.svc:9090/api/v1/query?query=min%28sum%28jobs_waiting_count%7B%7D%29+by+%28pod%29%29&time=1570835880.045 200 OK

Is there a mistake somewhere on my end?

Thanks a lot!

lianghao208 commented 4 years ago

I am a little confused. Can you tell me where I can put my configuration (seriesQuery, etc.) in the adapter?
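
If you are not deploying via Helm, the rules usually live in a ConfigMap whose config.yaml the adapter reads through its --config flag (yiyijin's comment further down shows the same layout). A minimal sketch; the ConfigMap name, namespace, and the rule itself are illustrative:

apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config     # illustrative; must match the ConfigMap mounted by the adapter Deployment
  namespace: monitoring    # illustrative
data:
  config.yaml: |
    rules:
      - seriesQuery: 'rabbitmq_queue_messages{namespace!="",pod!=""}'
        resources:
          overrides:
            namespace:
              resource: namespace
            pod:
              resource: pod
        name:
          matches: "^(.*)$"
          as: "${1}"
        metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)

The prometheus-adapter Helm chart wraps the same rule definitions under rules.custom / rules.external in values.yaml, which is what the next comment shows.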

CrashLaker commented 4 years ago

Just got it to work. Thank you all too!! @alex-sainer's version helped me a lot.

We can also use kubernetes_namespace:

    - seriesQuery: '{__name__=~"^my_queue_len$"}'
      resources:
        #template: <<.Resource>>
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: ""
        as: "my_queue"
      metricsQuery: sum(my_queue_len)

@lianghao208 I just started exploring Kubernetes, but what I did was bundle everything up inside the prometheus-adapter Helm values file:

values.yaml:

prometheus:
  url: http://prometheus-server.default.svc
  port: 80
rules:
  external:
    - seriesQuery: '{__name__=~"^my_queue_len$"}'
      resources:
        #template: <<.Resource>>
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: ""
        as: "my_queue"
      metricsQuery: sum(my_queue_len)

Then run helm upgrade -f values.yaml my-release stable/prometheus-adapter, or helm install if you are creating a new release.

Then you're good to go:

$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/*/my_queue | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/%2A/my_queue"
  },
  "items": [
    {
      "metricName": "my_queue",
      "metricLabels": {},
      "timestamp": "2020-03-08T19:21:45Z",
      "value": "3"
    }
  ]
}

yiyijin commented 4 years ago

@alex-sainer @CrashLaker, great that you got it working! I am facing a similar issue and would really appreciate your help.

My Prometheus metric looks like this:

my_metrics{kubernetes_namespace="my-namespace",kubernetes_pod_name="my-pod-name",pod_template_hash="5895666c56",someId="someId"}

and here is my config.yaml:

apiVersion: v1
data:
  config.yaml: |
    rules:
    - seriesQuery: '{__name__=~"my_metrics.*",someId="someId"}'
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      name: {as: "this-is-a-different-name"}
      metricsQuery: 'sum({__name__=~"my_metrics.*",someId="someId"})'

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq gives the output:

{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "namespaces/this-is-a-different-name",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "pods/this-is-a-different-name",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}

but when I run kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-namespace/pods/*/this-is-a-different-name" | jq, I get:

{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-namespace/pods/%2A/this-is-a-different-name"
  },
  "items": []
}
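
One common reason for an empty items list with a pod-scoped custom metric is a metricsQuery that aggregates the pod label away, so the adapter cannot map the result back to any pod. A sketch, in the style of the earlier rules, that keeps the grouping labels in the query (not a verified fix for this exact setup):

rules:
  - seriesQuery: '{__name__=~"my_metrics.*",someId="someId"}'
    resources:
      overrides:
        kubernetes_namespace:
          resource: namespace
        kubernetes_pod_name:
          resource: pod
    name:
      as: "this-is-a-different-name"
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'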

so I have 2 questions:

Thanks in advance!

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten

fejta-bot commented 3 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community. /close

k8s-ci-robot commented 3 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/prometheus-adapter/issues/219#issuecomment-811187905):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
haniffm commented 9 months ago

I know this is quite an old issue, but I'm facing a somewhat similar problem and none of the solutions described here are working for me. I'm thinking maybe there have been some changes to the k8s API.

I've also created a question on Stack Overflow that describes my issue in detail: https://stackoverflow.com/questions/77861109/how-to-get-value-of-external-metrics-in-k8s-and-prometheus-adapter