google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

High memory usage #3558

Open · thunderbird86 opened this issue 2 weeks ago

thunderbird86 commented 2 weeks ago

Description

We are seeing high memory usage from cAdvisor on some nodes.

[image: cAdvisor memory usage graph]

There are no OOM kills, and the memory is reclaimed after some time. Here are two pods on the same cluster:

[image: memory usage of two cAdvisor pods on the same cluster]

The spikes are higher after 05/31 only because we increased the pod's limits, but cAdvisor still goes on to consume all available memory.
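For reference, the pods' working-set memory can also be checked directly with metrics-server; a minimal sketch, where the namespace and label selector are placeholders for whatever the DaemonSet actually uses:

# requires metrics-server; replace <namespace> and <selector> with your own values
kubectl top pod -n <namespace> -l <selector> --containers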

Details

Pods are placed on different node groups, separated by load type via affinity. Otherwise they run on identical amd64 instances started from the same image.

cAdvisor versions we tested: v0.47.2, v0.49.1

kubectl version

Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.13-eks-3af4770

cAdvisor params:

- --update_machine_info_interval=5m
- --housekeeping_interval=15s
- --max_housekeeping_interval=15s
- --event_storage_event_limit=default=0
- --event_storage_age_limit=default=0
- --containerd=/run/containerd/containerd.sock
- --disable_root_cgroup_stats=false
- --store_container_labels=false
- --whitelisted_container_labels=io.kubernetes.container.name,io.kubernetes.pod.name,io.kubernetes.pod.namespace
- --disable_metrics=percpu,tcp,udp,referenced_memory,advtcp,memory_numa,hugetlb
- --enable_load_reader=true
- --docker_only=true
- --profiling=true
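Since --profiling=true is set, the heap dump analyzed below can be captured from cAdvisor's pprof endpoint. A rough sketch, assuming the default port 8080 and the standard /debug/pprof/ paths; replace <namespace> with the DaemonSet's namespace:

# forward the cAdvisor port locally and download a heap profile
kubectl port-forward -n <namespace> pod/cadvisor-zd6t4 8080:8080
curl -o cadvisor-zd6t4-heap.out http://localhost:8080/debug/pprof/heap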

go tool pprof -http=:8082 cadvisor-zd6t4-heap.out

[image: pprof heap profile]
iwankgb commented 1 week ago

@thunderbird86, can you try code from #3561?
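One way to test that branch is to build an image from the PR head; a rough sketch, assuming a local Docker build with the repo's deploy/Dockerfile, where the registry and tag are placeholders:

git clone https://github.com/google/cadvisor.git && cd cadvisor
# fetch the PR branch from GitHub and check it out
git fetch origin pull/3561/head:pr-3561 && git checkout pr-3561
# build and push an image, then point the DaemonSet at it
docker build -t <registry>/cadvisor:pr-3561 -f deploy/Dockerfile .
docker push <registry>/cadvisor:pr-3561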

thunderbird86 commented 1 week ago

Thanks @iwankgb, I've rolled it out and need a couple of days to gather statistics.

thunderbird86 commented 1 day ago

Hi @iwankgb, I've made a few attempts, but it doesn't help. Here is the last week:

[image: memory usage over the last week]