luxas / kubeadm-workshop

Showcasing a bare-metal multi-platform kubeadm setup with persistent storage and monitoring

Custom metrics value not propagating to hpa #19

Closed (julianstephen closed this issue 7 years ago)

julianstephen commented 7 years ago

I am using the latest version (as of when this issue was reported; commit 1e696ace333..). There seems to be an issue with the HPA getting the value of the custom metric exposed by sample-metrics-app. When I curl the custom-metrics API URL https://${CM_API}/apis/custom-metrics.metrics.k8s.io/.....http_requests_total, the value field in the result JSON is always 0, even after I run the load generator. Checking the custom-metrics-apiserver logs with kubectl logs shows that the Prometheus query issued by the metrics apiserver always gets back an empty vector:

validate%!(EXTRA string=http_requests_total, bool=true)sum(rate(http_requests_total{namespace="default",svc_name="sample-metrics-app"}[60s]))
http://sample-metrics-prom.default.svc:9090/api/v1/query?query=sum%28rate%28http_requests_total%7Bnamespace%3D%22default%22%2Csvc_name%3D%22sample-metrics-app%22%7D%5B60s%5D%29%29&time=2017-06-28T20%3A39%3A54.076714493Z
vector doesn't contain any elements
metricFor { services} default sample-metrics-app http_requests_total 0 60
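For reference, the adapter's query can be run directly against the Prometheus HTTP API to confirm the empty result; a minimal sketch, with the in-cluster service URL and PromQL expression copied from the log above (run from a pod inside the cluster):

# Issue the same query the custom-metrics apiserver sends to Prometheus.
curl -sG 'http://sample-metrics-prom.default.svc:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(http_requests_total{namespace="default",svc_name="sample-metrics-app"}[60s]))'
# An empty "result": [] array in the response means Prometheus holds no series
# matching those labels, i.e. the app is not being scraped with the svc_name label.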

The sample-metrics app is exposing the value correctly. When I curl sample-metrics-app-ip:9090/metrics, I can see

# HELP http_requests_total The amount of requests served by the server in total
# TYPE http_requests_total counter
http_requests_total 49642

When I curl the Prometheus endpoint itself, things get a little fishy. I see

# HELP http_requests_total Total number of HTTP requests made.
# TYPE http_requests_total counter
http_requests_total{code="200",handler="label_values",method="get"} 60
http_requests_total{code="200",handler="prometheus",method="get"} 8
http_requests_total{code="200",handler="query",method="get"} 161
http_requests_total{code="200",handler="status",method="get"} 797
http_requests_total{code="400",handler="query",method="get"} 1

These values don't seem to reflect the value exposed by sample-metrics-app; judging by the handler labels, they look like Prometheus's own internal request counters rather than the app's metric. I am not sure if the link to Prometheus is broken on both sides (prom to app and prom to apiserver).
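One way to check whether this Prometheus instance is scraping the app at all is its targets API; a sketch, assuming the same in-cluster service address as in the adapter log above:

# List the active scrape targets; the sample-metrics-app pods should show up here.
curl -s http://sample-metrics-prom.default.svc:9090/api/v1/targets
# If they are missing, the ServiceMonitor / label selector most likely stopped
# matching the service after the prometheus-operator version change.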

As a quick aside, with the upgrade to rc-1, the custom metrics APIService needs two additional params:

The APIService "v1alpha1.custom-metrics.metrics.k8s.io" is invalid: 
* spec.groupPriorityMinimum: Invalid value: 0: must be positive and less than 20000
* spec.versionPriority: Invalid value: 0: must be positive and less than 1000

I don't believe this has any effect on what I am reporting, but for completeness' sake: I set groupPriorityMinimum to 200 and versionPriority to 20.
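For anyone hitting the same validation error: if the APIService object already exists, one way to set the two fields is a merge patch (a sketch, with the object name taken from the error above); otherwise the same two fields go under spec in the manifest before applying it.

# Add the two priority fields required since the API aggregation beta.
kubectl patch apiservice v1alpha1.custom-metrics.metrics.k8s.io --type merge \
  -p '{"spec":{"groupPriorityMinimum":200,"versionPriority":20}}'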

[Update]: Just tried with the previous commit (before the prometheus-operator version was updated) and things seem fine 👍. Though in both versions, deleting the Prometheus object (kubectl delete prometheus sample-metrics-prom) seems to slow down all further queries to the apiserver.
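One thing worth checking when kubectl gets slow after changes like that is whether the aggregated APIService is still reported as Available; a sketch, with the APIService name assumed from the validation error above:

# An unavailable aggregated API can make client-side discovery retry and time out,
# which slows down unrelated kubectl/API calls as well.
kubectl get apiservice v1alpha1.custom-metrics.metrics.k8s.io
kubectl describe apiservice v1alpha1.custom-metrics.metrics.k8s.io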

luxas commented 7 years ago

@julianstephen Sorry for the state of flux right now; yes, that's the case. For v1.7 I'm updating to use @DirectXMan12's adapter instead of my hand-hacked one: https://github.com/DirectXMan12/k8s-prometheus-adapter

Also, as pointed out, the APIService priority scheme changed from the API aggregation alpha to beta; will fix. Feel free to send a PR any time when you find things like this :)

julianstephen commented 7 years ago

Thanks. I will do that.