google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

Running cAdvisor as a daemonset on EC2 instances times out when pulling metrics. #2098

Open ffilippopoulos opened 5 years ago

ffilippopoulos commented 5 years ago

We are running a Kubernetes cluster (v1.11) on AWS EC2 instances with 50 GiB gp2 disks. We are trying to deploy cAdvisor as a daemonset to replace the one embedded in the kubelet (which is rumoured to be deprecated eventually). We use the manifest from https://github.com/google/cadvisor/blob/master/deploy/kubernetes/base/daemonset.yaml to deploy the daemonset, plus a headless service so that we can configure a Prometheus job for cAdvisor using dns_sd_config.
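
For clarity, the setup looks roughly like this (a sketch only: names, namespace and port are illustrative and may not match the upstream daemonset or our exact manifests):

# Headless Service over the cAdvisor daemonset pods, so each pod gets its own DNS A record
apiVersion: v1
kind: Service
metadata:
  name: cadvisor
  namespace: cadvisor
spec:
  clusterIP: None            # headless: DNS returns the individual pod IPs
  selector:
    app: cadvisor            # must match the daemonset's pod labels
  ports:
    - name: http
      port: 8080
      targetPort: 8080

# Prometheus scrape config fragment using dns_sd_config against that Service
scrape_configs:
  - job_name: cadvisor
    metrics_path: /metrics
    dns_sd_configs:
      - names:
          - cadvisor.cadvisor.svc.cluster.local
        type: A
        port: 8080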

On this setup, most of the cAdvisor pods start logging a flood of messages complaining about du and find:

lib/docker/containers/b14861ee62361be22cab6acfc6ab5a28f90d29e586ad905a998d0b1729dc502c]; will not log again for this container unless duration exceeds 2s
I1108 12:13:17.447694       1 fsHandler.go:135] du and find on following dirs took 14.396513201s: [/rootfs/var/lib/docker/overlay2/5456a06386fd2448a9a352349ae9f8822d90aeb3d8026f218d85287b512bc42b/diff /rootfs/var/lib/docker/containers/3a3045e685257e34b0aeb9d09036b9fff198f5a3e61c55ce15b4e9ad734b9d6f]; will not log again for this container unless duration exceeds 2s
I1108 12:13:17.448553       1 fsHandler.go:135] du and find on following dirs took 14.296350454s: [/rootfs/var/lib/docker/overlay2/23af37647d0e69dacb0359fc938299b5ebc79afc37d4dc4dfeedcca378e259d6/diff /rootfs/var/lib/docker/containers/2648c1bf712a610dc84e6f8af222cee113f2fc910c0541e554738cf1a9d4f45e]; will not log again for this container unless duration exceeds 2s

and we see that Prometheus times out when trying to scrape them. Prometheus isn't really the issue here, because even just curling the metrics endpoint never returns. So it looks like the pods are stuck doing du and find operations and cannot serve the metrics endpoint in time.

Running cAdvisor as part of kubelet works fine and we are able to scrape for metrics.
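
(For comparison, the kubelet-embedded cAdvisor is scraped with a job roughly like the sketch below; ports and auth depend on the cluster, and the kubelet read-only port is assumed here.)

# Sketch of a Prometheus job for the kubelet's built-in cAdvisor endpoint
scrape_configs:
  - job_name: kubelet-cadvisor
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      # assumes the kubelet read-only port (10255); secure setups use :10250 with TLS instead
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:10255'
        target_label: __address__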

/cc @dashpole

dashpole commented 5 years ago

I wonder if it is the fact that you are now running two cAdvisors at the same time? When you are running the daemonset, can you still query the kubelet's cAdvisor (:10255/metrics/cadvisor)?

ffilippopoulos commented 5 years ago

Yes, that is true: we are still running the kubelet one and can query those metrics as well.

dashpole commented 5 years ago

Can you post the output of docker info?

george-angel commented 5 years ago

Containers: 109
 Running: 79
 Paused: 0
 Stopped: 30
Images: 143
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: v0.13.2 (expected: fec3683b971d9c3ef73f284f176672c44b448662)
Security Options:
 seccomp
  Profile: default
 selinux
Kernel Version: 4.14.78-coreos
Operating System: Container Linux by CoreOS 1911.3.0 (Rhyolite)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.42GiB
Name: ip-10-66-21-100
ID: A5KU:5YMU:GZXF:UURN:XA7X:JCO4:FS3I:RYQ4:KGE2:FC4A:37F2:P52M
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

dashpole commented 5 years ago

Hmm, we haven't seen any issues recently with overlay2

dashpole commented 5 years ago

Can you try running with the cadvisor-args patch and see if that helps? I'm wondering if there is an expensive metric we collect by default that the patch disables, or if the housekeeping interval is different...
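
(For context: the patch is applied as a kustomize overlay on top of the base daemonset, roughly as sketched below; the paths and file names here are from memory and may not match the repo layout exactly.)

# kustomization.yaml -- sketch of an overlay that applies the args patch to the base daemonset
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: cadvisor
bases:
  - ../../base                  # deploy/kubernetes/base from this repo
patchesStrategicMerge:
  - cadvisor-args.yaml          # the patch that overrides the container args

Built and applied with kustomize build <overlay-dir> | kubectl apply -f - (kubectl apply -k needs a newer kubectl than a 1.11 cluster typically ships with).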

nielsole commented 5 years ago

We have the same issue (using v32). cAdvisor sometimes fails to respond within 15 seconds. CPU usage of cAdvisor becomes especially high when the node is under high load (screenshot from 2019-04-24): roughly 70% user, 30% sys time.

We already increased the CPU request to 400m, but as can be seen in the graph, cAdvisor would need even more CPU. I have now applied the patch @dashpole proposed (slightly modified: --disable_metrics=percpu,disk,network,tcp,udp,sched) and will see how it holds up over the next few days.
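
Concretely, the relevant part of our patched daemonset now looks roughly like this (a sketch; the daemonset metadata and container name are as I remember them from the upstream base manifests):

# Modified version of the args patch (strategic-merge over the base daemonset)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: cadvisor
spec:
  template:
    spec:
      containers:
        - name: cadvisor
          args:
            - --disable_metrics=percpu,disk,network,tcp,udp,sched   # 'sched' added on top of the proposed patch
          resources:
            requests:
              cpu: 400m                                             # the increased CPU request mentioned above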

nielsole commented 5 years ago

So for the past 5 days we haven't had any peaks. Directly after rolling out --disable_metrics=percpu,disk,network,tcp,udp,sched (5 days ago) the metrics immediately improved; the rollout happened at the missing values in the middle of the first screenshot (2019-04-29 10-58-45). Now it looks the way I would expect it to (second screenshot, 2019-04-29 10-59-05).

nielsole commented 5 years ago

I'm still a bit surprised by how much deviation there is between cAdvisor instances, but our nodes are loaded very differently with containers, so this is probably ok. Going forward, I wonder what the best approach is. I could try different configurations, but that will take weeks to finish.

george-angel commented 5 years ago

The problem with running with --disable_metrics=percpu,disk,network,tcp,udp,sched is that we are missing the disk and network metrics, which were previously present in the kubelet's cAdvisor.

I raised https://github.com/google/cadvisor/pull/2236, which restores parity in the metrics provided, but the performance issue then returns.
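
For parity, the obvious tweak is just to take disk and network back out of the disabled set, which of course brings the du/find cost back; roughly (container context is illustrative):

      containers:
        - name: cadvisor
          args:
            # re-enable disk and network by removing them from the disabled set;
            # as described above, the expensive du/find housekeeping then returns
            - --disable_metrics=percpu,tcp,udp,sched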

I don't quite understand how the kubelet can expose identical metrics without the performance impact.