google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

/api/v2.0/appmetrics response empty #991

Open SidneyAn opened 8 years ago

SidneyAn commented 8 years ago

I am running cAdvisor to gather app metrics, but I get an empty response from /api/v2.0/appmetrics. The image I run as a Kubernetes replication controller is built from the Dockerfile below.

 FROM nginx
 RUN mkdir /var/cadvisor/
 ADD nginx_config.json /var/cadvisor/nginx_config.json
 LABEL io.cadvisor.metric.nginx="/var/cadvisor/nginx_config.json"
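
For context, here is a rough sketch of what a file like nginx_config.json might contain. The actual file is not shown in this thread. The key names (endpoint, metrics_config, metric_type, data_type, polling_frequency, regex) are my recollection of cAdvisor's application-metrics documentation and should be checked against your cAdvisor version. The endpoint, regexes and units are the ones quoted elsewhere in this issue; the polling frequency of 10 is an arbitrary placeholder.

```python
import json


def build_collector_config(endpoint, metrics):
    """Build a collector config dict (key names are assumptions, see note above)."""
    return {
        "endpoint": endpoint,
        "metrics_config": [
            {
                "name": name,
                "metric_type": "gauge",
                "units": units,
                "data_type": "int",
                "polling_frequency": 10,  # placeholder value, not from the thread
                "regex": regex,
            }
            for name, units, regex in metrics
        ],
    }


# Endpoint, units and regexes as quoted in this issue.
CONFIG = build_collector_config(
    "http://localhost:2520/nginx_status",
    [
        ("activeConnections", "number of active connections",
         r"Active connections: ([0-9]+)"),
        ("reading", "number of reading connections",
         r"Reading: ([0-9]+) .*"),
    ],
)

if __name__ == "__main__":
    # Print the JSON that would be saved as nginx_config.json.
    print(json.dumps(CONFIG, indent=2))
```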

In addition, when I try api/v1.3/subcontainers, I can find "has_custom_metrics":true in one of the container entries, and its "custom_metrics" match what is described in nginx_config.json. But /api/v2.0/spec returns {"/":{..."has_custom_metrics":false,...}}.

Any answer here will be appreciated!

vishh commented 8 years ago

cc @rjnagal

rjnagal commented 8 years ago

Hey @SidneyAn, can you provide the output for api/v1.3/subcontainer and cadvisor logs?

The custom_metrics: false line is for the root container, which is expected. The custom metrics are only enabled for nginx container in this case.
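
To make the root-versus-nginx distinction concrete, here is a minimal sketch that scans a parsed spec response for containers with custom metrics enabled. It assumes the response is a JSON object keyed by container name, as in the {"/":{...}} snippet above; the container name "/docker/nginx-example" and the sample data are hypothetical.

```python
def containers_with_custom_metrics(specs):
    """Return the names of containers whose spec reports has_custom_metrics == True.

    `specs` is assumed to be a dict mapping container name -> spec dict.
    """
    return sorted(
        name for name, spec in specs.items()
        if spec.get("has_custom_metrics") is True
    )


# Hypothetical sample: the root container has no custom metrics,
# while the container carrying the io.cadvisor.metric.* label does.
SAMPLE_SPECS = {
    "/": {"has_cpu": True, "has_custom_metrics": False},
    "/docker/nginx-example": {"has_cpu": True, "has_custom_metrics": True},
}

if __name__ == "__main__":
    print(containers_with_custom_metrics(SAMPLE_SPECS))
```

If only "/" appears in the response, the spec for the nginx container itself still needs to be requested before concluding that custom metrics are disabled.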

SidneyAn commented 8 years ago

@rjnagal Thanks for explaining the "custom_metrics" line.

The cAdvisor logs are as follows:

I1202 19:25:14.020715 32388 container.go:430] Failed to update stats for container "/docker/0d3867aa94d32bf2e0cb1fb19186c77df4e8f3cbab85e7a76599f5771470302b": Error 0: Error 0: No match found for regexp: Active connections: ([0-9]+) for metric 'activeConnections' in config,Error 1: No match found for regexp: Reading: ([0-9]+) .* for metric 'reading' in config, continuing to push custom stats

And the output for api/v1.3/subcontainer is:

 [
   "name":"/docker/...71470302b",
   "aliases":["k8s_nginx.b483cfba_nginx-y8wovdefault...","0d386...2b" ],
   "namespace":"docker",
   "spec":{
     "creation_time":"2015-11-30T10:29:22.244952008Z",
     "labels":{
       "io.cadvisor.metric.nginx":"/var/cadvisor/nginx_test.json",
       "io.kubernetes.pod.name":"default/nginx-y8wov"
     },
     "has_cpu":true,
     "cpu":{ "limit":2,"max_limit":0,"mask":"0-7"},
     ...
     "has_custom_metrics":true,
     "custom_metrics":[
       {"name":"activeConnections","type":"gauge","format":"int","units":"number of active connections"},
       {"name":"reading","type":"gauge","format":"int","units":"number of reading connections" }
     ],
     "image":"192.168.111.98:5000/artest:v1"
   },
   "stats":[
     { "timestamp":"2015-12-01T17:52:05.139747749+08:00",
     ...
 ]

SidneyAn commented 8 years ago

@rjnagal The endpoint in nginx_config.json is "http://localhost:2520/nginx_status". I started a simple HTTP server on port 2520 on the host machine:

python -m SimpleHTTPServer 2520

rjnagal commented 8 years ago

cAdvisor expects the status to have the form: Active connections: ([0-9]+) etc. What do you see on the status page (http://localhost:2520/nginx_status)?
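
A quick way to check this offline is to run the two regexes from the log against whatever the endpoint returns. The sketch below uses Python's re module, which for these simple patterns should behave the same as the Go regexp package cAdvisor uses (an assumption worth verifying for more complex patterns). The sample status text is typical nginx stub_status output, and the HTML body is a hypothetical stand-in for a page that is not nginx status output.

```python
import re

# Patterns copied from the cAdvisor log line in this issue.
PATTERNS = {
    "activeConnections": r"Active connections: ([0-9]+)",
    "reading": r"Reading: ([0-9]+) .*",
}


def extract_metrics(body, patterns=PATTERNS):
    """Return (found, missing): parsed int values and names with no regex match."""
    found, missing = {}, []
    for name, pattern in patterns.items():
        match = re.search(pattern, body)
        if match:
            found[name] = int(match.group(1))
        else:
            missing.append(name)
    return found, missing


# Typical stub_status output (sample values are made up).
SAMPLE_STATUS = (
    "Active connections: 3 \n"
    "server accepts handled requests\n"
    " 10 10 25 \n"
    "Reading: 0 Writing: 1 Waiting: 2 \n"
)

# Hypothetical body from a server that does not serve nginx status text.
SAMPLE_OTHER = "<html><body><h1>Error response</h1></body></html>"

if __name__ == "__main__":
    print(extract_metrics(SAMPLE_STATUS))
    print(extract_metrics(SAMPLE_OTHER))
```

If both names end up in the missing list for the real response body, that matches the "No match found for regexp" errors in the log.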

zqfan commented 7 years ago

I've come across the same issue with a different cause: none of the 2.0 API endpoints return custom metrics and has_custom_metrics is always false, while the 1.3 API returns the custom metrics properly!

The custom metric endpoint is the heapster v1.2.0 /metrics API, and the kubelet/cAdvisor log shows "Error 0: Get http://localhost:8082/metrics: EOF, continuing to push custom stats".

I couldn't find help through Google search or GitHub issues. Any hints? Thanks.

zqfan commented 7 years ago

I found that if I restrict collection to a few metrics, for example by adding the field "metrics_config":["go_gc_duration_seconds"], this issue does not happen. But if the specified metric is in the middle of the returned metrics (401 entries in total), the issue appears again.

So I guess the HTTP library cAdvisor uses to fetch the custom metric endpoint has a problem when the response is a bit large (33K in my case). If I don't specify the metrics_config field, only 41 of the 401 metrics are collected, and the error message Get ... EOF appears in the log.

zqfan commented 7 years ago

Further investigation suggests it is related to the returned result, but I'm still confused.

For example:

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000390958
go_gc_duration_seconds{quantile="0.25"} 0.0005863660000000001
go_gc_duration_seconds{quantile="0.5"} 0.000699541
go_gc_duration_seconds{quantile="0.75"} 0.097823592
go_gc_duration_seconds{quantile="1"} 0.199161746
go_gc_duration_seconds_sum 0.904787788
go_gc_duration_seconds_count 27

If I specify go_gc_duration_seconds, it works fine, with no error message. But go_gc_duration_seconds_sum and go_gc_duration_seconds_count are not collected.

In another case:

# HELP heapster_scraper_duration_microseconds Time spent scraping sources in microseconds.
# TYPE heapster_scraper_duration_microseconds summary
heapster_scraper_duration_microseconds{source="kubelet_summary:10.245.1.3:10255",quantile="0.5"} 17.336
heapster_scraper_duration_microseconds{source="kubelet_summary:10.245.1.3:10255",quantile="0.9"} 64.359
heapster_scraper_duration_microseconds{source="kubelet_summary:10.245.1.3:10255",quantile="0.99"} 64.359
heapster_scraper_duration_microseconds_sum{source="kubelet_summary:10.245.1.3:10255"} 241.91500000000002
heapster_scraper_duration_microseconds_count{source="kubelet_summary:10.245.1.3:10255"} 6

If I specify heapster_scraper_duration_microseconds, the error message Get ... EOF appears in the log, although API v2.0 works fine. Again, heapster_scraper_duration_microseconds_sum and heapster_scraper_duration_microseconds_count are not collected.

Both go_gc_duration_seconds and heapster_scraper_duration_microseconds seem valid here, yet they are treated differently. And why are the sum and count metrics not collected?

thanks
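
One possible explanation for the missing sum and count, sketched below: in the Prometheus text format, a summary's _sum and _count lines are samples with their own names, so a filter that compares exact sample names against metrics_config would skip them. Whether cAdvisor filters this way is an assumption on my part, not something confirmed in this thread; the parser here is a simplified illustration, not cAdvisor's code.

```python
import re

# Simplified sample-line pattern: name, optional {labels}, value.
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)')

EXPOSITION = """\
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0.000390958
go_gc_duration_seconds{quantile="0.25"} 0.0005863660000000001
go_gc_duration_seconds{quantile="0.5"} 0.000699541
go_gc_duration_seconds{quantile="0.75"} 0.097823592
go_gc_duration_seconds{quantile="1"} 0.199161746
go_gc_duration_seconds_sum 0.904787788
go_gc_duration_seconds_count 27
"""


def parse_samples(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match:
            name, labels, value = match.groups()
            samples.append((name, labels or "", float(value)))
    return samples


def select_exact(samples, wanted):
    """Keep only samples whose name is exactly in `wanted`."""
    return [s for s in samples if s[0] in wanted]


def select_family(samples, wanted):
    """Keep samples in `wanted` plus their _sum and _count companions."""
    names = set(wanted)
    for base in wanted:
        names.update({base + "_sum", base + "_count"})
    return [s for s in samples if s[0] in names]


if __name__ == "__main__":
    samples = parse_samples(EXPOSITION)
    print(len(select_exact(samples, {"go_gc_duration_seconds"})))
    print(len(select_family(samples, {"go_gc_duration_seconds"})))
```

Under that assumption, listing go_gc_duration_seconds_sum and go_gc_duration_seconds_count explicitly in metrics_config would be the thing to try.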

abushoeb commented 7 years ago

Hello, I have the same problem and don't see any values for custom metrics. When I check cAdvisor at http://:4194/api/v2.1/spec, it shows me this:

{ /: { creation_time: "2017-06-21T21:38:08.53Z", has_cpu: true, cpu: { limit: 1024, max_limit: 0, mask: "0-5" }, has_memory: true, memory: { limit: 10316423168, reservation: 9223372036854772000 }, has_custom_metrics: false, has_network: true, has_filesystem: true, has_diskio: true } }

Could somebody tell me how I can get has_custom_metrics set to true? Also, how can I find the cAdvisor logs and version?