Closed eero-t closed 3 years ago
If I don't mount "/dev/kmsg" cAdvisor complains at start about not being able to read something which I had disabled ("oom_event"):
W0623 16:50:21.655644 1 manager.go:289] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory
PS. I actually want cAdvisor to have access only to sysfs, with no capabilities and no root, i.e.: --cap-drop ALL --user 65534:65534 --volume=/sys:/sys:ro
That seems to work fine, but cAdvisor still does these extra reads.
I believe that this list is used only for Prometheus: https://github.com/google/cadvisor/blob/master/metrics/prometheus.go#L108
With that list, cAdvisor only stops serving the metrics; it still collects the underlying data.
I'll try to change it :)
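For context, the exposition-side filtering referenced above can be sketched roughly like this. The types and names below are illustrative stand-ins, not cAdvisor's actual API; the point is only that dropping an entry from the exported list does not stop the upstream collection:

```go
package main

import "fmt"

// metricDef mirrors the idea behind the metric list in metrics/prometheus.go:
// each exported metric carries the metric group it belongs to.
// (Illustrative types, not cAdvisor's real definitions.)
type metricDef struct {
	name  string
	group string
}

// exportedMetrics filters the list at exposition time. Entries in a
// disabled group are skipped here, but the stats behind them may
// still have been collected upstream.
func exportedMetrics(defs []metricDef, disabled map[string]bool) []string {
	var out []string
	for _, d := range defs {
		if disabled[d.group] {
			continue
		}
		out = append(out, d.name)
	}
	return out
}

func main() {
	defs := []metricDef{
		{"container_cpu_usage_seconds_total", "cpu"},
		{"container_memory_usage_bytes", "memory"},
		{"container_network_receive_bytes_total", "network"},
	}
	disabled := map[string]bool{"network": true}
	// Prints only the two metrics whose groups remain enabled.
	fmt.Println(exportedMetrics(defs, disabled))
}
```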
The reason for that is that all cgroup stats are read first, and only the enabled ones are kept afterwards. The GetStats() function comes from the runc/libcontainer package.
I'll try to introduce an internal function that takes the enabled metrics into consideration.
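The idea of such an internal function might look roughly like the sketch below: consult the enabled set before each read, instead of reading everything and discarding afterwards. All names here (MetricSet, getStats, readCgroupValue, the metric kinds) are hypothetical simplifications, not the actual cAdvisor/libcontainer API:

```go
package main

import "fmt"

// MetricKind identifies a metric group, loosely mirroring cAdvisor's
// metric-kind concept (illustrative names only).
type MetricKind string

const (
	CpuUsageMetrics    MetricKind = "cpu"
	MemoryUsageMetrics MetricKind = "memory"
	DiskIOMetrics      MetricKind = "diskIO"
)

// MetricSet is the set of metric groups a collector should gather.
type MetricSet map[MetricKind]struct{}

func (s MetricSet) Has(k MetricKind) bool { _, ok := s[k]; return ok }

// readCgroupValue is a stand-in for the actual cgroup file read.
func readCgroupValue(file string) uint64 { return 42 }

// getStats checks the enabled set before each (stand-in) cgroup read,
// so disabled groups cost no file accesses at all.
func getStats(enabled MetricSet) map[MetricKind]uint64 {
	stats := map[MetricKind]uint64{}
	if enabled.Has(CpuUsageMetrics) {
		stats[CpuUsageMetrics] = readCgroupValue("cpuacct.usage")
	}
	if enabled.Has(MemoryUsageMetrics) {
		stats[MemoryUsageMetrics] = readCgroupValue("memory.usage_in_bytes")
	}
	if enabled.Has(DiskIOMetrics) {
		stats[DiskIOMetrics] = readCgroupValue("blkio.throttle.io_service_bytes")
	}
	return stats
}

func main() {
	enabled := MetricSet{CpuUsageMetrics: {}}
	stats := getStats(enabled)
	fmt.Println(len(stats)) // only the one enabled group was read
}
```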
Actually the issue I'm more worried about is that there is no support for disabling the rest of the metrics, the ones that are redundant because kubelet already provides them (with its own vendored cAdvisor version).
It also seems there are no options to disable the extra endpoints (e.g. the JSON API ones, if one is interested only in the Prometheus metrics), like kubelet has. IMHO that is less of an issue than the redundant metrics, though.
There are other tickets that touch on these subjects, though, so I was hesitant to file my own for them...
https://github.com/google/cadvisor/pull/2900 changes look good to me. I'll test them later (hopefully today).
Looks much better with #2900 (these are all the files accessed during 1 minute):
# strace -f -e openat -p $(pidof cadvisor) 2> cadvisor-trace.txt
^C
# grep openat cadvisor-trace.txt | grep -v "openat resumed" | cut -d'"' -f2 | sed 's%^.*/%%' | sort | uniq -c | sort -nr
2002 cpu.shares
2002 cpu.cfs_quota_us
2002 cpu.cfs_period_us
1976 pids.max
134 system.slice
133 wpa_supplicant.service
133 user.slice
133 upower.service
133 udisks2.service
133 thermald.service
133 system-systemd\\x2dfsck.slice
133 system-modprobe.slice
133 system-getty.slice
133 systemd-udevd.service
133 systemd-timesyncd.service
133 systemd-resolved.service
133 systemd-logind.service
133 systemd-journald.service
133 switcheroo-control.service
133 ssh.service
133 rtkit-daemon.service
133 rsyslog.service
133 rpc-statd.service
133 rpcbind.socket
133 rpcbind.service
133 rngd.service
133 polkit.service
133 NetworkManager.service
133 networkd-dispatcher.service
133 ModemManager.service
133 lightdm.service
133 kubepods.slice
133 kubepods-burstable.slice
133 kubepods-burstable-podf764b2f6_7ccc_467f_a506_d83705ab75d5.slice
133 kubepods-burstable-podde1b6d0f_d0b8_45ee_b392_fae13ffd25f2.slice
133 kubepods-burstable-podbb570f97_21cd_4a07_aed1_2db2111543e7.slice
133 kubepods-besteffort.slice
133 kubepods-besteffort-podf5976133_f5d4_46e1_9033_e65b64ebb6fd.slice
133 kubepods-besteffort-podaa345d81_8b41_42f2_866a_aeaab745c79a.slice
133 kubepods-besteffort-poda641626f_2edd_4b20_bf1a_534530313705.slice
133 kubepods-besteffort-pod94ed4ee4_5e0b_4882_a19f_17393cfdf6ce.slice
133 kubepods-besteffort-pod74a0e475_3bd6_452c_b6bd_705fd68cc204.slice
133 kubepods-besteffort-pod64452191_4eee_4aac_b027_902fe377d294.slice
133 kubepods-besteffort-pod51bb93fa_b1f7_4586_b9d2_0888026354c6.slice
133 kubepods-besteffort-pod492f22f8_d5f9_4542_8976_b490f1539ee0.slice
133 kubelet.service
133 irqbalance.service
133 docker.socket
133 docker.service
133 dbus.service
133 cron.service
133 containerd.service
133 colord.service
133 avahi-daemon.service
133 accounts-daemon.service
54 cpu,cpuacct
52 stat
52 os-release
52 meminfo
27 gpu
27 devices
26 limits
26 fd
4 kubepods-burstable-podbb570f97_21cd_4a07_aed1_2db2111543e7.slice:cri-containerd:d379fba6e1befb1df3d4ab201c97382c0fa8f5ae50eeaef8b21f7c056302752e
4 kubepods-burstable-podbb570f97_21cd_4a07_aed1_2db2111543e7.slice:cri-containerd:bcbd176f8fd124740b60e8a806fd0114ca1168a639e839ef8be51230f97ee5d5
4 kubepods-besteffort-podaa345d81_8b41_42f2_866a_aeaab745c79a.slice:cri-containerd:573e49a970a6bb43d1a1c0af196e8f4abe098dcda51f9bcf51682802be8f4757
4 kubepods-besteffort-podaa345d81_8b41_42f2_866a_aeaab745c79a.slice:cri-containerd:4e03e0ca2d9e15492f301b8c9f52b87b0ac5996136d46c09f8173655ee98a3eb
4 kubepods-besteffort-poda641626f_2edd_4b20_bf1a_534530313705.slice:cri-containerd:f2232a0b70bbb8b21dc89361f19784e214e3fcf1e5384360c29ed14a8e3739ef
4 kubepods-besteffort-poda641626f_2edd_4b20_bf1a_534530313705.slice:cri-containerd:85b8b47e77b49b15d6c28a15ab7236b84218cced1cce0850784e227100884ab5
4 kubepods-besteffort-pod94ed4ee4_5e0b_4882_a19f_17393cfdf6ce.slice:cri-containerd:714e5cbdce066b9efc5189a4fe08b3fbc0619c564546820a3f0822d0593ff9b1
4 kubepods-besteffort-pod94ed4ee4_5e0b_4882_a19f_17393cfdf6ce.slice:cri-containerd:19ff66188a3110cbd1569ab2187d84cc0938a8aa7e8632365bef255e0597b902
4 kubepods-besteffort-pod492f22f8_d5f9_4542_8976_b490f1539ee0.slice:cri-containerd:e7b531d0e5758213b7a4e811829f8928eff747484fba81b1bb4e720829c60614
4 kubepods-besteffort-pod492f22f8_d5f9_4542_8976_b490f1539ee0.slice:cri-containerd:bb49bd2162bd63f23bcd4d76ff27b23193082109d7f5be323b63e7981eb08030
4 kubepods-besteffort-pod492f22f8_d5f9_4542_8976_b490f1539ee0.slice:cri-containerd:563a3e209fd4aaef791496b6202894f5fd270261490de4cbb17f23038260fa51
3 sys-kernel-tracing.mount
3 sys-kernel-debug.mount
3 sys-kernel-config.mount
3 sys-fs-fuse-connections.mount
3 run-rpc_pipefs.mount
3 kubepods-burstable-podf764b2f6_7ccc_467f_a506_d83705ab75d5.slice:cri-containerd:b9d5806912a7f8147b0b1682988122043e83f13e1a0e48692194ba6688a13567
3 kubepods-burstable-podf764b2f6_7ccc_467f_a506_d83705ab75d5.slice:cri-containerd:3342463ca3fca5320b3edde1ba722390f5093388b6abf7f75f5c0bd9198b52e0
3 kubepods-burstable-podf764b2f6_7ccc_467f_a506_d83705ab75d5.slice:cri-containerd:2fc9a01728091828187e97e8773e17961559a65edf85d35ca186fb443f7e41f4
3 kubepods-burstable-podde1b6d0f_d0b8_45ee_b392_fae13ffd25f2.slice:cri-containerd:d4b486fdacb92259464d0aa8c87963837779c6cecf8c4bf7a8263ca43b2394cc
3 kubepods-burstable-podde1b6d0f_d0b8_45ee_b392_fae13ffd25f2.slice:cri-containerd:6125d4d388bd7474fb4179ca07338b78c036f89124303236f432ace0c264cea4
3 kubepods-burstable-podde1b6d0f_d0b8_45ee_b392_fae13ffd25f2.slice:cri-containerd:01611422d4d25c5200d7307fffad7925b5a71fb46586c929a46d900ebe71203b
3 kubepods-besteffort-podf5976133_f5d4_46e1_9033_e65b64ebb6fd.slice:cri-containerd:af89e80c9dc0460aaa99cc9b614264d2df6bca91d8f906f88fd06546a01ad7d1
3 kubepods-besteffort-podf5976133_f5d4_46e1_9033_e65b64ebb6fd.slice:cri-containerd:1c1a013377bb85563fa10a62bc994fe2eae142f396f78374f751ea15930483d0
3 kubepods-besteffort-pod74a0e475_3bd6_452c_b6bd_705fd68cc204.slice:cri-containerd:dfb211ab4fd56dbc3c3284e35068862ace7e890ab16676bf9772e28af84aa83c
3 kubepods-besteffort-pod74a0e475_3bd6_452c_b6bd_705fd68cc204.slice:cri-containerd:501de36cdfc9ad8ec7fd7a35b25a0b6a3c7f8f9dd5b0cdca802a920cd04bee86
3 kubepods-besteffort-pod64452191_4eee_4aac_b027_902fe377d294.slice:cri-containerd:5a78c22f268c5f3158b177d349275bc948c6429cce8e269ce77e856e62ebf531
3 kubepods-besteffort-pod64452191_4eee_4aac_b027_902fe377d294.slice:cri-containerd:35a072c279da2debb7264522f5ac6e1d68adfb47c90b44124c7c3f49664a068c
3 kubepods-besteffort-pod51bb93fa_b1f7_4586_b9d2_0888026354c6.slice:cri-containerd:768487bafd91b81dd752eb2b79da9a1f5e422b8504bbb09cc0d22a4faf60c055
3 kubepods-besteffort-pod51bb93fa_b1f7_4586_b9d2_0888026354c6.slice:cri-containerd:63fe9be1dabb42c6a84e8662a191d6934126640bc6ba119216691909b3028dc5
3 dev-mqueue.mount
3 dev-hugepages.mount
3 boot-efi.mount
While it's not part of this bug, I'm wondering whether cAdvisor really expects the OS release to change every second. All the OS updates I've done have taken minutes, not seconds, even from a local 100 MB/s mirror to an SSD...
I don't map the host rootfs into the container, i.e. OS release and service updates happen only when the image itself is updated, so it would be good to get rid of that kind of (for me redundant) CPU overhead too.
Should I file a new bug for that?
Setup:
Use-case:
docker run -it --volume=/:/rootfs:ro --volume=/var/run:/var/run:ro --volume=/sys:/sys:ro --device=/dev/kmsg --publish=8080:8080 cadvisor:latest /usr/bin/cadvisor -disable_metrics accelerator,advtcp,cpuLoad,cpu_topology,cpuset,disk,diskIO,hugetlb,memory,memory_numa,network,oom_event,percpu,perf_event,process,referenced_memory,resctrl,sched,tcp,udp
strace -f -e openat -p $(pidof cadvisor) 2> trace.txt
Expected output:
Actual output: