Closed: oussama-mechlaoui closed this issue 2 years ago.
I have tried the following:
- seriesQuery: '{namespace!="",__name__=~"^aws_applicationelb_request_.*",load_balancer="app/k8s-eksmonitoring-1bb56c3370/16ae9f70d2e5fad4"}'
  resources:
    overrides:
      namespace:
        resource: 'namespace'
      pod:
        resource: 'pod'
  name:
    matches: '^aws_(.*)_sum$'
    as: ''
  metricsQuery: '{namespace!="",__name__=~"^aws_applicationelb_request_.*",load_balancer="app/k8s-eksmonitoring-1bb56c3370/16ae9f70d2e5fad4"}'
The list of custom metrics is empty.
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/ | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": []
}
I have enabled debug logging for prometheus-adapter (v=10), and I can see the following in the logs:
I0412 15:35:51.303854 1 api.go:74] GET http://prometheus-k8s.monitoring.svc.cluster.local:9090/api/v1/series?match%5B%5D=%7Bnamespace%21%3D%22%22%2C__name__%3D~%22%5Eaws_applicationelb_request_.%2A%22%2Cload_balancer%3D%22app%2Fk8s-eksmonitoring-1bb56c3370%2F16ae9f70d2e5fad4%22%7D&start=1618241691.302 200 OK
I0412 15:35:51.303924 1 api.go:93] Response Body: {"status":"success","data":[]}
I0412 15:35:51.303973 1 provider.go:279] Set available metric list from Prometheus to: [[]]
When I curl the Prometheus server directly with the same query URL that the adapter generated, I get the following results:
curl -s http://prometheus-k8s.monitoring.svc.cluster.local:9090/api/v1/series?match%5B%5D=%7Bnamespace%21%3D%22%22%2C__name__%3D~%22%5Eaws_applicationelb_request_.%2A%22%2Cload_balancer%3D%22app%2Fk8s-eksmonitoring-1bb56c3370%2F16ae9f70d2e5fad4%22%7D | jq .
{
"status": "success",
"data": [
{
"name": "aws_applicationelb_request_count_sum",
"container": "prometheus-cloudwatch-exporter",
"endpoint": "http",
"exported_job": "aws_applicationelb",
"instance": "192.168.2.127:9106",
"job": "prometheus-cloudwatch-exporter",
"load_balancer": "app/k8s-eksmonitoring-1bb56c3370/16ae9f70d2e5fad4",
"namespace": "monitoring",
"pod": "cw-prometheus-exporter-prometheus-cloudwatch-exporter-9fb7dc9ct",
"service": "cw-prometheus-exporter-prometheus-cloudwatch-exporter"
}
]
}
Any insights? How can I use CloudWatch metrics with prometheus-adapter?
Metrics from AWS CloudWatch arrive in Prometheus with a time lag (check the sample timestamps). The adapter does not create a resource in the APIResourceList if Prometheus returns no series, and the default window in the Helm chart deployment is 1m.
I added the argument --metrics-max-age=15m. This parameter isn't documented anywhere, and the Helm chart doesn't expose it either. If --metrics-max-age is not set explicitly, it is set equal to --metrics-relist-interval, in which case it is as if --metrics-max-age were not configured at all.
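To make the lag concrete: the adapter's series discovery request only reaches back --metrics-max-age from "now" (the start= parameter in the debug log above was exactly one minute earlier than the request), while the samples written by the cloudwatch-exporter are typically several minutes old, so they fall outside that window and the adapter sees no series. A rough sketch of reproducing this by hand; the Prometheus URL, queue name, and timestamps are illustrative only:
# Default window: start = now - 1m  -> CloudWatch samples (several minutes old) fall outside -> empty result.
# With --metrics-max-age=15m:       start = now - 15m -> the lagged samples land inside the window.
curl -G -s 'http://prometheus-operated:9090/api/v1/series' \
  --data-urlencode 'match[]=aws_sqs_approximate_number_of_messages_visible_average{queue_name="mysqs"}' \
  --data-urlencode "start=$(date -d '15 minutes ago' +%s)"   # GNU date; widens the lookback like --metrics-max-age=15m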
My prometheus-adapter deployment:
containers:
- args:
  - /adapter
  - --secure-port=6443
  - --cert-dir=/tmp/cert
  - --logtostderr=true
  - --prometheus-url=http://prometheus-operated:9090
  - --metrics-relist-interval=1m
  - --v=4
  - --config=/etc/adapter/config.yaml
  - --metrics-max-age=90m
My ConfigMap prometheus-adapter-custom:
rules:
- seriesQuery: 'aws_sqs_approximate_number_of_messages_visible_average{container="prometheus-cloudwatch-exporter",exported_job="aws_sqs",queue_name="mysqs"}'
  seriesFilters: []
  resources:
    overrides:
      namespace:
        resource: namespace
      pod:
        resource: pod
  name:
    as: "aws_sqs_approximate_number_of_messages_visible_average_mysqs"
    matches: ""
  metricsQuery: max_over_time(aws_sqs_approximate_number_of_messages_visible_average{<<.LabelMatchers>>}[15m])
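For clarity: when the custom metrics API is queried for the pods resource, the adapter replaces <<.LabelMatchers>> with matchers built from the overrides above, so the PromQL it sends to Prometheus looks roughly like this (the pod name is a placeholder, not an exact reproduction of the adapter's output):
max_over_time(aws_sqs_approximate_number_of_messages_visible_average{namespace="monitoring",pod="<cloudwatch-exporter-pod>"}[15m])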
Result:
kubectl -n monitoring get --raw="/apis/custom.metrics.k8s.io/v1beta1" | jq .
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "custom.metrics.k8s.io/v1beta1",
"resources": [
{
"name": "namespaces/aws_sqs_approximate_number_of_messages_visible_average_mysqs",
"singularName": "",
"namespaced": false,
"kind": "MetricValueList",
"verbs": [
"get"
]
},
{
"name": "pods/aws_sqs_approximate_number_of_messages_visible_average_mysqs",
"singularName": "",
"namespaced": true,
"kind": "MetricValueList",
"verbs": [
"get"
]
}
]
}
❯ kubectl -n monitoring get --raw="/apis/custom.metrics.k8s.io/v1beta1/namespaces/monitoring/pods/*/aws_sqs_approximate_number_of_messages_visible_average_mysqs" | jq .
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/monitoring/pods/%2A/aws_sqs_approximate_number_of_messages_visible_average_mysqs"
},
"items": [
{
"describedObject": {
"kind": "Pod",
"namespace": "monitoring",
"name": "cloudwatch-exporter-prometheus-cloudwatch-exporter-78f5d7cjst2z",
"apiVersion": "/v1"
},
"metricName": "aws_sqs_approximate_number_of_messages_visible_average_shalb",
"timestamp": "2021-05-20T08:26:32Z",
"value": "39",
"selector": null
}
]
}
It's working.
Hello @sergdpi, I'm trying to do almost the same query you did with the AWS SQS approximate number of messages visible. Mine actually looks like this:
external:
- seriesQuery: 'aws_sqs_approximate_number_of_messages_visible_average{queue_name="mysqs"}'
  seriesFilters: []
  resources:
    overrides:
      kubernetes_namespace:
        resource: namespace
  name:
    as: "approximate_number_of_messages_visible_mysqs"
    matches: ""
  metricsQuery: max_over_time(aws_sqs_approximate_number_of_messages_visible_average{<<.LabelMatchers>>}[15m])
Sadly, I get the following in my adapter logs
I0811 10:22:09.562542 1 api.go:74] GET http://prometheus:80/api/v1/series?match%5B%5D=aws_sqs_approximate_number_of_messages_visible_average%7Bqueue_name%3D%22mysqs%22%7D&start=1628677269.56 200 OK
I0811 10:22:09.562618 1 api.go:93] Response Body: {"status":"success","data":[]}
The GET to Prometheus returns an empty result even though the metric exists in Prometheus itself:
aws_sqs_approximate_number_of_messages_visible_average{app="prometheus-cloudwatch-exporter", app_kubernetes_io_managed_by="Helm", chart="prometheus-cloudwatch-exporter-0.16.0", exported_job="aws_sqs", heritage="Helm", instance="10.x.x.x:9106", job="kubernetes-service-endpoints", kubernetes_name="prometheus-cloudwatch-exporter", kubernetes_namespace="monitoring", kubernetes_node="kube-worker-1", queue_name="mysqs", release="prometheus-cloudwatch-exporter"}
So I'm not able to get those metrics into the Kubernetes external metrics API, even though I have other external metrics taken from Prometheus (e.g. the http_server_requests rates that I use for another type of HPA):
k get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "external.metrics.k8s.io/v1beta1",
"resources": [
{
"name": "http_server_requests_rate_10m",
"singularName": "",
"namespaced": true,
"kind": "ExternalMetricValueList",
"verbs": [
"get"
]
},
{
"name": "http_server_requests_seconds_count",
"singularName": "",
"namespaced": true,
"kind": "ExternalMetricValueList",
"verbs": [
"get"
]
},
{
"name": "http_server_requests_rate_5m",
"singularName": "",
"namespaced": true,
"kind": "ExternalMetricValueList",
"verbs": [
"get"
]
}
]
}
Does anyone have a clue why I'm not able to get those metrics to work? I'm using the latest prometheus-adapter version, 0.8.4, and I've also set the --metrics-max-age=15m argument in the deployment.
Thanks for your help!
I've found the solution to my problems myself and I'm posting it here in case someone else runs into the same issues. I'll start with the Helm chart values.yaml parameters that I've added:
image:
  repository: gcr.io/k8s-staging-prometheus-adapter/prometheus-adapter
  tag: master
  pullPolicy: IfNotPresent
metricsRelistInterval: 5m
...
- seriesQuery: 'aws_sqs_approximate_number_of_messages_visible_average{kubernetes_namespace!="",queue_name!=""}'
  resources:
    namespaced: false
    overrides:
      kubernetes_namespace:
        resource: namespace
  metricsQuery: max_over_time(<<.Series>>{<<.LabelMatchers>>}[10m])
So, the adapter setup is fairly straightforward. I've been using the staging image, as suggested in another issue, because this way I've been able to set the metric with the namespaced: false parameter (I need it because I have a generic "monitoring" namespace that gathers all the metrics for the applications running in their own namespaces). Then I've added --metrics-relist-interval=5m because, as @sergdpi said, CloudWatch takes some minutes to converge metrics before they're available for scraping. I didn't set the other parameter mentioned (--metrics-max-age) because it automatically defaults to the relist interval.
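Before pointing the HPA at it, the external metric can be checked directly against the external metrics API; a query along these lines (label selector matching the rule above, namespace assumed to be "monitoring") should return the current queue value:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/monitoring/aws_sqs_approximate_number_of_messages_visible_average?labelSelector=queue_name%3Dmysqs" | jq .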
Finally, in the HPA configuration, I did the following:
metrics:
- type: External
  external:
    metricName: aws_sqs_approximate_number_of_messages_visible_average
    metricSelector:
      matchLabels:
        queue_name: mysqs
        kubernetes_namespace: monitoring
    targetValue: 10
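(The spec above uses the autoscaling/v2beta1 external metric format; as a rough sketch, the same metric and selector on a cluster using autoscaling/v2 would look like this:)
metrics:
- type: External
  external:
    metric:
      name: aws_sqs_approximate_number_of_messages_visible_average
      selector:
        matchLabels:
          queue_name: mysqs
          kubernetes_namespace: monitoring
    target:
      type: Value
      value: "10"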
Now my HPA is finally able to collect metrics from the Kubernetes external metrics API (remember that they live in a different namespace). My new metric is shown below:
{
"name": "aws_sqs_approximate_number_of_messages_visible_average",
"singularName": "",
"namespaced": true,
"kind": "ExternalMetricValueList",
"verbs": [
"get"
]
},
Thanks everyone for the help and the maintainers for their work on this tool!
Great stuff, thank you for posting the solution 🎉
Can this issue be closed now?
It's working for me too, after setting --metrics-max-age as described above!
Thanks for pointing this out; it's now documented in the prometheus-adapter docs:
--metrics-max-age=
Note: We recommend setting this only if you understand what is happening. For example, this setting could be useful in cases where the scrape duration is over a network call, e.g. pulling metrics from AWS CloudWatch, or Google Monitoring, more specifically, Google Monitoring sometimes have delays on when data will show up in their system after being sampled. This means that even if you scraped data frequently, they might not show up soon. If you configured the relist interval to a short period but without configuring this, you might not be able to see your metrics in the adapter in certain scenarios.
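As a minimal illustration (the values are just examples taken from this thread, not a recommendation from the docs), a CloudWatch-friendly combination of the two flags might look like:
- --metrics-relist-interval=5m   # how often the adapter re-discovers available metrics
- --metrics-max-age=15m          # how far back the discovery window reaches; make it cover the exporter's delay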
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After a period of inactivity, lifecycle/stale is applied
- After a further period of inactivity once lifecycle/stale was applied, lifecycle/rotten is applied
- After a further period of inactivity once lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After a period of inactivity, lifecycle/stale is applied
- After a further period of inactivity once lifecycle/stale was applied, lifecycle/rotten is applied
- After a further period of inactivity once lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After a period of inactivity, lifecycle/stale is applied
- After a further period of inactivity once lifecycle/stale was applied, lifecycle/rotten is applied
- After a further period of inactivity once lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.