It might be misconfigured in `seriesQuery` and `resources.template`; could you provide more prometheus-adapter logs?
@alex-sainer I want to share an example config that worked for me, but note that I'm using custom metrics since my RabbitMQ is running inside Kubernetes. You could probably do the same by creating a Kubernetes Service that points at the external broker (e.g. a Service of type ExternalName).
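For the external case, a minimal sketch of such a Service might look like the following; the namespace, Service name, and hostname here are made-up placeholders, not anything from this thread:

```yaml
# Hypothetical ExternalName Service pointing at a RabbitMQ broker outside the cluster.
# Replace the externalName with your actual broker address.
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq-external
  namespace: production
spec:
  type: ExternalName
  externalName: rabbitmq.example.internal
```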
Rules:

```yaml
- seriesQuery: '{__name__=~"^rabbitmq_.*"}'
  seriesFilters:
  - is: ^rabbitmq_queue_messages$
  resources:
    overrides:
      namespace:
        resource: namespace
      service:
        resource: service
      pod:
        resource: pod
  name:
    matches: "^(.*)_messages$"
    as: "rabbitmq_queue_messages_foo"
  metricsQuery: 'rabbitmq_queue_messages{queue="foo"}'
```
Testing:

```console
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/production/services/*/rabbitmq_queue_messages_foo" | jq '.'
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/production/services/%2A/rabbitmq_queue_messages_jobs"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Service",
        "namespace": "production",
        "name": "rabbitmqs",
        "apiVersion": "/v1"
      },
      "metricName": "rabbitmq_queue_messages_foo",
      "timestamp": "2019-07-15T14:09:28Z",
      "value": "105"
    }
  ]
}
```
Here is my sample HPA:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
  namespace: production
spec:
  maxReplicas: 120
  minReplicas: 30
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hpa-worker-foo
  metrics:
  - type: Object
    object:
      metricName: rabbitmq_queue_messages_foo
      target:
        apiVersion: v1
        kind: Service
        name: rabbitmqs # This is my rabbitmq service name
      targetValue: 2000
```
rabbitmq is a headless ClusterIP Service:

```console
$ kubectl get svc
NAME        TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                                          AGE
rabbitmqs   ClusterIP   None         <none>        25672/TCP,15672/TCP,5672/TCP,9090/TCP,4369/TCP   54d
```
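For anyone reading this on a newer cluster: the Object metric shape changed in later HPA API versions, so a rough (untested) equivalent of the HPA above written against autoscaling/v2 would be:

```yaml
# Sketch only: same names as the v2beta1 example above, translated to the newer API shape.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-worker-foo
  minReplicas: 30
  maxReplicas: 120
  metrics:
  - type: Object
    object:
      metric:
        name: rabbitmq_queue_messages_foo
      describedObject:
        apiVersion: v1
        kind: Service
        name: rabbitmqs
      target:
        type: Value
        value: "2000"
```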
@alex-sainer Checking the logs of the prometheus-adapter is the way to go to debug the actual query executed against Prometheus. If you bump the verbosity to `--v=6`, you will see output similar to:

```
I0716 09:22:51.553029 1 api.go:74] GET http://prometheus-k8s.monitoring.svc:9090/api/v1/query?query=sum%28node%3Anode_memory_bytes_total%3Asum%7Bnode%3D%22kind-control-plane%22%7D+-+node%3Anode_memory_bytes_available%3Asum%7Bnode%3D%22kind-control-plane%22%7D%29+by+%28node%29&time=1563268971.551 200 OK
```
This is just an example for resource metrics. You can URL-decode the query and debug it in the Prometheus UI yourself. From there you can start tweaking the templates and figure out what is wrong.
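If you don't want to decode the query string by hand, a plain Python one-liner does it; nothing adapter-specific, just standard library URL decoding applied to the query from the log line above:

```sh
# Decode the URL-encoded PromQL so it can be pasted into the Prometheus UI.
python3 -c 'import sys, urllib.parse; print(urllib.parse.unquote_plus(sys.argv[1]))' \
  'sum%28node%3Anode_memory_bytes_total%3Asum%7Bnode%3D%22kind-control-plane%22%7D+-+node%3Anode_memory_bytes_available%3Asum%7Bnode%3D%22kind-control-plane%22%7D%29+by+%28node%29'
# -> sum(node:node_memory_bytes_total:sum{node="kind-control-plane"} - node:node_memory_bytes_available:sum{node="kind-control-plane"}) by (node)
```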
Thanks all!!! I got the metrics working with the following seriesQuery config :D
```yaml
- seriesQuery: '{__name__=~"^rabbitmq_queue_messages_published_total"}'
  resources:
    overrides:
      namespace:
        resource: namespace
      service:
        resource: service
      pod:
        resource: pod
  name:
    matches: "^(.*)_total$"
    as: "rabbitmq_queue_message_rate"
  metricsQuery: 'sum(rate(rabbitmq_queue_messages_published_total{queue=~".*", vhost=~".*"} [5m])) by (vhost, queue)'
```
and I am getting the following metrics from the API (there are more items, but I think one should be enough):
```json
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/rabbitmq_queue_message_rate"
  },
  "items": [
    {
      "metricName": "rabbitmq_queue_message_rate",
      "metricLabels": {
        "queue": "image_renderer_rpc",
        "vhost": "shop"
      },
      "timestamp": "2019-07-16T12:09:32Z",
      "value": "900m"
    },
    {...},{...},{...}
  ]
}
```
Do I need the resource overrides in the seriesQuery?
> Do I need the resource overrides in the seriesQuery?
I would recommend doing so, yes. Although it is repetitive, it makes very explicit which resources you are mapping.
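For comparison, the `resources.template` form mentioned earlier can replace the explicit overrides, but only under the assumption that the Prometheus label names literally match the Kubernetes resource names (namespace, service, pod); this is just a sketch of that form:

```yaml
resources:
  # Equivalent to listing namespace/service/pod overrides one by one,
  # assuming the series carry labels literally named "namespace", "service" and "pod".
  template: "<<.Resource>>"
```

With differently named labels (such as kubernetes_namespace or kubernetes_pod_name), the explicit overrides are the way to map them directly.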
Hey! I'm confused by external metrics too: they are labeled as namespaced, but my prometheus-adapter sends a Prometheus query that lacks the namespace label.
Prometheus adapter config:

```yaml
external:
- seriesQuery: '{__name__="jobs_waiting_count"}'
  resources:
    overrides:
      namespace:
        resource: namespace
  metricsQuery: min(sum(jobs_waiting_count{<<.LabelMatchers>>}) by (pod))
```
```console
$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/default/jobs_waiting_count | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/jobs_waiting_count"
  },
  "items": [
    {
      "metricName": "",
      "metricLabels": {},
      "timestamp": "2019-10-11T23:18:00Z",
      "value": "0"
    }
  ]
}
```
Prometheus-adapter logs:

```
GET http://prometheus-operated.monitoring.svc:9090/api/v1/query?query=min%28sum%28jobs_waiting_count%7B%7D%29+by+%28pod%29%29&time=1570835880.045 200 OK
```
Is there a mistake somewhere on my end? Thanks a lot!
I am a little confused; can you tell me where I can put my configuration (seriesQuery, etc.) in the adapter?
Just got it to work, thank you all too!! @alex-sainer's version helped me a lot. We can also use kubernetes_namespace:
```yaml
- seriesQuery: '{__name__=~"^my_queue_len$"}'
  resources:
    #template: <<.Resource>>
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: ""
    as: "my_queue"
  metricsQuery: sum(my_queue_len)
```
@lianghao208 I just started exploring Kubernetes, but what I did was bundle everything up inside the prometheus-adapter values.yaml, then run `helm upgrade -f values.yaml my-release stable/prometheus-adapter` (or `helm install` if you are creating a new release). Then you're good to go:
$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/*/my_queue | jq
{
"kind": "ExternalMetricValueList",
"apiVersion": "external.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/%2A/my_queue"
},
"items": [
{
"metricName": "my_queue",
"metricLabels": {},
"timestamp": "2020-03-08T19:21:45Z",
"value": "3"
}
]
}
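Since the question was where the config actually goes: with the stable/prometheus-adapter Helm chart the rules live under the `rules:` key of values.yaml. A rough sketch using the my_queue rule from above (key names can differ between chart versions, so treat this as an assumption and check your chart's default values):

```yaml
# Excerpt of a values.yaml for the prometheus-adapter Helm chart.
prometheus:
  url: http://prometheus-operated.monitoring.svc
  port: 9090
rules:
  default: true
  external:
  - seriesQuery: '{__name__=~"^my_queue_len$"}'
    resources:
      overrides:
        kubernetes_namespace: {resource: "namespace"}
        kubernetes_pod_name: {resource: "pod"}
    name:
      matches: ""
      as: "my_queue"
    metricsQuery: sum(my_queue_len)
```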
@alex-sainer @CrashLaker, great that you got it working! I am facing a similar issue and would really appreciate your help. My Prometheus metric looks like:

```
my_metrics{kubernetes_namespace="my-namespace",kubernetes_pod_name="my-pod-name",pod_template_hash="5895666c56",someId="someId"}
```
and here is my config.yaml:

```yaml
apiVersion: v1
data:
  config.yaml: |
    rules:
    - seriesQuery: '{__name__=~"my_metrics.*",someId="someId"}'
      resources:
        overrides:
          kubernetes_namespace:
            resource: namespace
          kubernetes_pod_name:
            resource: pod
      name: {as: "this-is-a-different-name"}
      metricsQuery: 'sum({__name__=~"my_metrics.*",someId="someId"})'
```
`kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq` gives the output:
```json
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "namespaces/this-is-a-different-name",
      "singularName": "",
      "namespaced": false,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "pods/this-is-a-different-name",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
```
but when I run `kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/namespaces/my-namespace/pods/*/this-is-a-different-name | jq`, I get:
```json
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-namespace/pods/%2A/this-is-a-different-name"
  },
  "items": []
}
```
So I have 2 questions:
1. …
2. We are using the OpenCensus agent to collect the metrics, and Prometheus scrapes from the OpenCensus agent directly, so kubernetes_namespace and kubernetes_pod_name will be the OpenCensus agent's namespace and pod name, not my pod's name and namespace.
Thanks in advance!
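Not sure it is the root cause of the empty items above, but for comparison: the prometheus-adapter walkthrough usually builds the metricsQuery from the template placeholders, so that the result is grouped by the labels the adapter needs to attach values back to individual pods. A sketch of that shape, reusing the names from the config above (an assumption, not a verified fix for this case):

```yaml
- seriesQuery: '{__name__=~"my_metrics.*",someId="someId"}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name: {as: "this-is-a-different-name"}
  # <<.Series>>, <<.LabelMatchers>> and <<.GroupBy>> are filled in by the adapter;
  # without the GroupBy part the sum carries no pod label to map results back to pods.
  metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```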
Issues go stale after 90d of inactivity. Mark the issue as fresh with `/remove-lifecycle stale`. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh with `/remove-lifecycle rotten`. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
Rotten issues close after 30d of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh with `/remove-lifecycle rotten`. Send feedback to sig-contributor-experience at kubernetes/community.

/close
@fejta-bot: Closing this issue.
I know this is quite an old issue, but I'm facing a somewhat similar problem and none of the solutions described here are working for me. I'm thinking there may have been some changes to the k8s API. I've also created a question on Stack Overflow that describes my issue in detail: https://stackoverflow.com/questions/77861109/how-to-get-value-of-external-metrics-in-k8s-and-prometheus-adapter
I have the following rule definition:
and querying the metrics API gives me the following:
But how can I debug / test this?
I'm trying to get separate metrics for every queue/vhost combination, but I have absolutely no idea how to query those values...
I've tried this query:

```console
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/rabbitmq_queue_messages_published_per_second" | jq .
```

but it returns only an empty value. It would be great if anyone could help me find the correct way to get this working :)
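One way per-queue values are often handled (not tested against the exact config in this thread, and reusing the rabbitmq_queue_message_rate / image_renderer_rpc names from the rate-based config earlier in the discussion): let the rule pass label matchers through with `<<.LabelMatchers>>` instead of hard-coding the queue and vhost regexes, then filter on the external metrics API with a labelSelector.

```yaml
# Sketch of a rule that keeps queue/vhost as labels and lets selectors narrow the result.
- seriesQuery: '{__name__=~"^rabbitmq_queue_messages_published_total"}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      service: {resource: "service"}
      pod: {resource: "pod"}
  name:
    matches: "^(.*)_total$"
    as: "rabbitmq_queue_message_rate"
  # <<.LabelMatchers>> is filled from the API request's labelSelector (and an HPA's metric selector),
  # so one queue/vhost combination can be picked out of the aggregated series.
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (vhost, queue)'
```

Querying a single queue would then look roughly like this (the queue name is URL-encoded in the selector):

```console
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/rabbitmq_queue_message_rate?labelSelector=queue%3Dimage_renderer_rpc" | jq .
```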