Open isabelnoronha61 opened 4 years ago
Interesting... The "no such file or directory" errors usually indicate that the cgroup was created and then immediately removed before cAdvisor was able to process the event. Generally, they are safe to ignore.
You should probably also specify the cAdvisor version explicitly. We recently had to stop pushing images with the "latest" tag, as we implemented an immutable image policy. Do you know what version is actually running?
Today I pulled the latest image, gcr.io/google_containers/cadvisor:v0.36.0. On my monitoring host, this is the snippet of the docker-compose file:

cadvisor:
  image: gcr.io/google_containers/cadvisor:v0.36.0
  volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:rw
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
    - /cgroup/cpu:/cgroup/cpu
    - /cgroup/cpuacct:/cgroup/cpuacct
    - /cgroup/cpuset:/cgroup/cpuset
    - /cgroup/memory:/cgroup/memory
    - /cgroup/blkio:/cgroup/blkio
    - /cgroup:/sys/fs/cgroup:ro
    - /cgroup:/cgroup:ro
  privileged: true
  ports:
    - 8080:8080
  command:
    - --allow_dynamic_housekeeping=true
    - --housekeeping_interval=30s
    - --global_housekeeping_interval=2m
    - --disable_metrics=disk,tcp,udp
    - --docker_only=true

cadvisor logs:

I0514 05:57:31.939039 1 manager.go:1148] Exiting thread watching subcontainers
I0514 05:57:31.939072 1 manager.go:365] Exiting global housekeeping thread
I0514 05:57:31.939092 1 cadvisor.go:231] Exiting given signal: terminated
I0514 07:22:52.001142 1 manager.go:1148] Exiting thread watching subcontainers
I0514 07:22:52.001212 1 manager.go:365] Exiting global housekeeping thread
I0514 07:22:52.001281 1 cadvisor.go:231] Exiting given signal: terminated
However, the same docker-compose file running on a target which contains around 2K containers gives the following log:

F0514 07:48:34.137076 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
F0514 07:48:48.975485 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpuset: no space left on device
F0514 07:49:06.327658 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
F0514 07:49:35.141404 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/devices: no space left on device
F0514 07:50:45.864350 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/pids: no space left on device
F0514 07:51:00.594952 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/memory: no space left on device
F0514 07:51:22.240519 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/devices: no space left on device
F0514 07:51:37.171702 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/pids: no space left on device
F0514 07:51:52.068816 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
F0514 07:52:17.045079 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
F0514 07:52:32.120150 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/devices: no space left on device
F0514 07:52:47.124536 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/pids: no space left on device
F0514 07:53:05.982415 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/devices: no space left on device
F0514 07:53:20.969768 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
F0514 07:53:35.691449 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/blkio: no space left on device
F0514 07:53:50.408721 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
F0514 07:54:15.782048 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
F0514 07:54:30.962349 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/memory: no space left on device
F0514 07:54:45.685793 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/devices: no space left on device
F0514 07:55:00.551526 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device
W0514 07:56:01.662330 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7648.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7648.scope: no such file or directory
W0514 07:56:01.662576 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7648.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7648.scope: no such file or directory
W0514 07:56:01.710082 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7648.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7648.scope: no such file or directory
2020/05/14 07:56:22 http: superfluous response.WriteHeader call from github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:306)
I0514 07:56:32.876908 1 manager.go:1148] Exiting thread watching subcontainers
I0514 07:56:32.876978 1 manager.go:365] Exiting global housekeeping thread
I0514 07:56:32.877026 1 cadvisor.go:231] Exiting given signal: terminated
W0514 07:57:09.568674 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7649.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7649.scope: no such file or directory
W0514 07:57:09.568885 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7649.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7649.scope: no such file or directory
W0514 07:57:09.568959 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7649.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7649.scope: no such file or directory
W0514 07:57:09.569006 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7649.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7649.scope: no such file or directory
W0514 07:58:04.142329 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7650.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7650.scope: no such file or directory
W0514 07:58:04.223811 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7650.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7650.scope: no such file or directory
W0514 07:58:04.255806 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7650.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7650.scope: no such file or directory
W0514 08:00:02.283887 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/system.slice/sysstat-collect.service: no such file or directory
W0514 08:00:02.284123 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/blkio/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/system.slice/sysstat-collect.service: no such file or directory
W0514 08:00:02.284217 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/system.slice/sysstat-collect.service: no such file or directory
W0514 08:00:02.284277 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/system.slice/sysstat-collect.service: no such file or directory
W0514 08:00:02.284342 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/system.slice/sysstat-collect.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/system.slice/sysstat-collect.service: no such file or directory
W0514 08:00:02.285099 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7652.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7652.scope: no such file or directory
W0514 08:00:02.285342 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7652.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7652.scope: no such file or directory
W0514 08:00:02.285560 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7652.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7652.scope: no such file or directory
W0514 08:00:02.285812 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7652.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7652.scope: no such file or directory
W0514 08:00:02.285943 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7652.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7652.scope: no such file or directory
2020/05/14 08:00:28 http: superfluous response.WriteHeader call from github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:306)
W0514 08:01:02.167895 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7653.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7653.scope: no such file or directory
W0514 08:01:03.218814 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7653.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7653.scope: no such file or directory
W0514 08:01:03.218994 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7653.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7653.scope: no such file or directory
W0514 08:01:03.219060 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7653.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7653.scope: no such file or directory
W0514 08:01:03.219126 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7653.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7653.scope: no such file or directory
W0514 08:02:02.624902 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7654.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7654.scope: no such file or directory
W0514 08:02:02.688773 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7654.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7654.scope: no such file or directory
W0514 08:02:02.689125 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7654.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7654.scope: no such file or directory
W0514 08:02:02.689252 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7654.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7654.scope: no such file or directory
W0514 08:02:02.770711 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7654.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7654.scope: no such file or directory
2020/05/14 08:04:22 http: superfluous response.WriteHeader call from github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:306)
W0514 08:05:01.957659 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7657.scope": 0x40000100 == IN_CREATE|IN_ISDIR): open /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7657.scope: no such file or directory
W0514 08:05:01.958081 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7657.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7657.scope: no such file or directory
W0514 08:06:02.058847 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7658.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7658.scope: no such file or directory
W0514 08:06:02.106817 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7658.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7658.scope: no such file or directory
W0514 08:06:02.442804 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7658.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7658.scope: no such file or directory
W0514 08:06:02.787811 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7658.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7658.scope: no such file or directory
W0514 08:06:03.747833 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7658.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7658.scope: no such file or directory
2020/05/14 08:06:23 http: superfluous response.WriteHeader call from github.com/prometheus/client_golang/prometheus/promhttp.httpError (http.go:306)
W0514 08:07:02.943069 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7659.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/user.slice/user-0.slice/session-7659.scope: no such file or directory
W0514 08:07:02.945246 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7659.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/user.slice/user-0.slice/session-7659.scope: no such file or directory
W0514 08:07:02.945510 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7659.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7659.scope: no such file or directory
W0514 08:07:02.945722 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/devices/user.slice/user-0.slice/session-7659.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/user.slice/user-0.slice/session-7659.scope: no such file or directory
W0514 08:07:02.946031 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/pids/user.slice/user-0.slice/session-7659.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/pids/user.slice/user-0.slice/session-7659.scope: no such file or directory
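A log this long is easier to read once it is grouped. The following is a minimal Python sketch, not part of cAdvisor, that counts entries by severity, source location, and trailing error text. The regular expression is an assumption based on the glog-style prefix visible in the lines above, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed glog-style prefix: severity letter, MMDD, time, thread id, file:line]
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<tid>\d+) (?P<source>[\w.]+:\d+)\] (?P<message>.*)$"
)


def summarize(lines):
    """Count log entries by (severity, source, trailing error text)."""
    counts = Counter()
    for line in lines:
        match = GLOG_LINE.match(line.strip())
        if match is None:
            continue  # e.g. the plain "2020/05/14 ... http:" lines
        # The text after the last ": " is usually the underlying OS error.
        reason = match["message"].rsplit(": ", 1)[-1]
        counts[(match["severity"], match["source"], reason)] += 1
    return counts


# Three lines copied from the log above.
sample = [
    'F0514 07:48:34.137076 1 cadvisor.go:188] Failed to start container manager: inotify_add_watch /sys/fs/cgroup/cpu,cpuacct: no space left on device',
    'W0514 07:56:01.662330 1 watcher.go:87] Error while processing event ("/sys/fs/cgroup/memory/user.slice/user-0.slice/session-7648.scope": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/user.slice/user-0.slice/session-7648.scope: no such file or directory',
    'I0514 07:56:32.877026 1 cadvisor.go:231] Exiting given signal: terminated',
]
for key, n in summarize(sample).most_common():
    print(n, *key)
```

Run over the full log, this separates the fatal "no space left on device" startup failures from the "no such file or directory" warnings that were described earlier in the thread as generally safe to ignore.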
I increased max_user_watches like below: sudo sysctl fs.inotify.max_user_watches=1048576
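For context, `inotify_add_watch` returns "no space left on device" (ENOSPC) when the per-user limit on inotify watches is reached, which is why raising `fs.inotify.max_user_watches` is the usual fix. A minimal Python sketch for checking the current limits on a Linux host follows; `read_inotify_limits` is a hypothetical helper name, and it assumes the standard `/proc/sys/fs/inotify` location.

```python
from pathlib import Path


def read_inotify_limits(base="/proc/sys/fs/inotify"):
    """Return the inotify limits found under `base` as a dict of ints."""
    limits = {}
    for name in ("max_user_watches", "max_user_instances", "max_queued_events"):
        path = Path(base) / name
        try:
            limits[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            limits[name] = None  # not Linux, or not readable from here
    return limits


for name, value in read_inotify_limits().items():
    print(f"{name} = {value}")
```

Note that a value set with `sysctl` on the command line does not survive a reboot unless it is also written to a sysctl configuration file.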
Is cAdvisor capable of monitoring around 2K containers? When I use the top command, I see cAdvisor's CPU usage exceeding 1000%. I have configured various run-time flags.
Try with --disable_metrics=percpu,hugetlb,sched,tcp,udp,advtcp,disk
After upgrading to v0.36.0 I don't get system/slice metrics anymore.
Still, it's going beyond 500%!! Are there any tweaks?

cadvisor:
  image: gcr.io/google_containers/cadvisor:v0.36.0
  volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:rw
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
    #- /cgroup/cpu:/cgroup/cpu
    #- /cgroup/cpuacct:/cgroup/cpuacct
    #- /cgroup/cpuset:/cgroup/cpuset
    #- /cgroup/memory:/cgroup/memory
    #- /cgroup/blkio:/cgroup/blkio
    #- /cgroup:/sys/fs/cgroup:ro
    - /cgroup:/cgroup:ro
  ports:
    - 8080:8080
  privileged: true
  command:
    - --allow_dynamic_housekeeping=true
    - --housekeeping_interval=5m
    - --global_housekeeping_interval=2m
    - --disable_metrics=percpu,sched,tcp,udp,advtcp,disk,network
    - --docker_only=true
  restart: always
  deploy:
    mode: global
Is it because the number of containers is 2K?
I haven't ever run with that many before, so it is definitely possible.
Okay, is there any workaround I can use to lower the CPU consumption? This is going to run not just on one server but on at least 14 servers. I am using such a huge number of containers for simulation.
Can cAdvisor do some kind of load balancing while doing discovery for the container metrics?
I'm not sure I understand the last comment. In general, I welcome any performance improvements, so if you can generate any perf flamegraphs or otherwise, I am happy to help identify the primary consumers of CPU time, and figure out how to optimize cadvisor to meet your needs.
@dashpole Is there any other way to get proper CPU stats?
It almost looks like perf events are enabled... I can see __perf_event_task_schedule... and __intel_pmu_enable... cc @iwankgb.
@isabelnoronha61 can you try the previous version with the same args? v0.35.0 IIRC.
The actual cadvisor go code's use is on the left side. Serving requests (the far left) is about half of that usage, and the other portion is likely for scanning cgroups and collecting the metrics.
@dashpole @isabelnoronha61 I will take a look at this in the European evening, but what seems weird to me is that we can't see the perf_event_open syscall (it would be 298 rather than 64).
@isabelnoronha61 is there any chance that another application collecting perf events is running on the host? Or that the resctrl filesystem (memory bandwidth and cache allocation and monitoring) is in use? It would explain all these writes to MSRs.
BTW - I don't think @isabelnoronha61 enabled perf events, so they should not be collected at all, unless there is some insane bug causing cAdvisor to collect some events when no configuration is provided.
@isabelnoronha61 if you were using the perf tool to generate the flame graph, then you probably affected overall system performance: each time a context switch occurred, MSRs must have been read from and written to, otherwise the counters would not store valid values. Is there any chance to zoom in on the right half of the image? This is where the answer is, I think.
@dashpole I don't think it's related to perf in cAdvisor, but it's definitely a side effect of using perf in general. runtime.findrunnable() is responsible for finding a goroutine waiting for execution, and if writing to MSRs happens higher in the stack, then it is related to measuring cAdvisor performance, I believe.
Ah, that makes sense. Sorry for the goose chase. @isabelnoronha61 if you could share the svg, that would be helpful.
Also, can you share cAdvisor's CPU usage, in cores during the run?
@iwankgb Yeah, I'm not using perf in cAdvisor. I'm taking the system stats. Here is the CPU usage per core when cAdvisor is at 500%:
This is very odd, I don't understand!
Even though cAdvisor is at just 123%
I tried sending the SVG, but GitHub doesn't allow the SVG format.
I think that cAdvisor's CPU usage will depend on the number of monitored containers. As @dashpole mentioned above, 2000 is quite a large number. You can take a look at #1498; some useful advice might be hidden there. You can also try to rebuild cAdvisor with pprof support, which might give more useful information on what is causing the problem.
@iwankgb yeah sure. Could you have a look at this png?
It seems to me that the Prometheus exposition format and HTTP response compression are your problem. You can try to use a storage driver instead of Prometheus, but I have no idea if it's feasible in your case and I won't promise you that it will help. If not, then it might be possible to disable compression in the Prometheus client (you'll have to verify it).
Yep. Serving all of the metrics in Prometheus format is the main user of CPU, it seems. This is where something like the OpenTelemetry format would be useful to have...
The json endpoints are likely even worse. I'm not sure about the storage drivers.
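If serving the Prometheus output is the main cost, it can help to see which metric families contribute the most samples before deciding what to pass to --disable_metrics. The following is a minimal Python sketch under stated assumptions: `samples_per_metric` is a hypothetical helper, the embedded text is made-up illustrative data rather than output from this host, and in practice the input would be the body fetched from cAdvisor's /metrics endpoint.

```python
import re
from collections import Counter

# A sample line starts with the metric name, optionally followed by {labels}.
SAMPLE_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def samples_per_metric(exposition_text):
    """Count sample lines per metric name in Prometheus text exposition."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_NAME.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative input only; label values and numbers are invented.
example = """\
# HELP container_cpu_usage_seconds_total Illustrative help text.
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{cpu="cpu00",id="/docker/a"} 12.5
container_cpu_usage_seconds_total{cpu="cpu01",id="/docker/a"} 11.0
container_memory_usage_bytes{id="/docker/a"} 1048576
"""
for name, n in samples_per_metric(example).most_common():
    print(f"{n:6d} {name}")
```

With around 2K containers, per-container and per-CPU families multiply quickly, so the families at the top of this list are the first candidates to disable or to drop at scrape time.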
So can the scraped metrics from cAdvisor be stored directly in VictoriaMetrics using storage driver flags instead of the Prometheus TSDB? Then, in the Prometheus config, make use of remote_read: and continue with PromQL queries and rendering in Grafana?
It looks doable, judging by the VictoriaMetrics readme.
docker-compose.yml

version: '3.7'
services:
  cadvisor:
    image: google/cadvisor
    volumes:
      [...]
      --docker_only=true
    restart: always
    deploy:
      mode: global
  node-exporter:
    image: prom/node-exporter
    volumes:
      [...]
      9100:9100
    restart: always
    deploy:
      mode: global
cAdvisor logs

I0504 13:23:36.062368 1 manager.go:1212] Exiting thread watching subcontainers
I0504 13:23:36.062425 1 manager.go:432] Exiting global housekeeping thread
I0504 13:23:36.062475 1 cadvisor.go:212] Exiting given signal: terminated
W0504 13:23:53.357364 1 manager.go:349] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
W0504 13:24:24.074584 1 manager.go:349] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
W0504 18:30:02.274749 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/system.slice/unbound-anchor.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/system.slice/unbound-anchor.service: no such file or directory
W0504 18:30:02.274954 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/blkio/system.slice/unbound-anchor.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/system.slice/unbound-anchor.service: no such file or directory
W0504 18:30:02.275047 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/memory/system.slice/unbound-anchor.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/system.slice/unbound-anchor.service: no such file or directory
W0504 18:30:02.275104 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/devices/system.slice/unbound-anchor.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/system.slice/unbound-anchor.service: no such file or directory
W0504 23:39:03.037073 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/cpu,cpuacct/system.slice/dnf-makecache.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/cpu,cpuacct/system.slice/dnf-makecache.service: no such file or directory
W0504 23:39:03.472951 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/blkio/system.slice/dnf-makecache.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/blkio/system.slice/dnf-makecache.service: no such file or directory
W0504 23:39:03.523233 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/memory/system.slice/dnf-makecache.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/memory/system.slice/dnf-makecache.service: no such file or directory
W0504 23:39:03.524163 1 raw.go:87] Error while processing event ("/sys/fs/cgroup/devices/system.slice/dnf-makecache.service": 0x40000100 == IN_CREATE|IN_ISDIR): inotify_add_watch /sys/fs/cgroup/devices/system.slice/dnf-makecache.service: no such file or directory
W0505 03:40:10.252789 1 container.go:409] Failed to create summary reader for "/system.slice/dnf-makecache.service": none of the resources are being tracked.
CPU usage goes beyond 100%.