gocrane / crane

Crane is a FinOps platform for cloud resource analytics and economics in Kubernetes clusters. The goal is not only to help users manage cloud costs more easily but also to ensure the quality of applications.
https://gocrane.io
Apache License 2.0
1.85k stars, 378 forks

upgrade cadvisor to reduce crane-agent cpu usage #845

Closed: xrmzju closed this issue 1 year ago

xrmzju commented 1 year ago

Describe the bug

The current version of cAdvisor (v0.39.2) reads extra cgroup files even though we have specified that the only metrics we need are CpuUsageMetrics and ProcessSchedulerMetrics. These unnecessary file reads consume a lot of CPU. More details can be found in this issue.

Reproduce steps

  1. Run crane-agent on a node with more than 100 pods.
  2. Observe the CPU usage, which can be quite high.
  3. In our environment, running crane-agent on a node with 150 pods can consume up to 8 cores.
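
To put a number on step 2, CPU usage in cores can be derived from two samples of a cumulative CPU-time counter. The sketch below is only an illustration: it assumes the counter is in nanoseconds (as cgroup v1 `cpuacct.usage` is), and the function name `coresUsed` and the sample values are hypothetical, not taken from crane-agent.

```go
package main

import "fmt"

// coresUsed converts two cumulative CPU-time samples (nanoseconds) into the
// average number of cores consumed during the sampling interval.
// It returns 0 for a non-positive interval or a counter that went backwards.
func coresUsed(prevNanos, currNanos uint64, intervalSeconds float64) float64 {
	if intervalSeconds <= 0 || currNanos < prevNanos {
		return 0
	}
	cpuSeconds := float64(currNanos-prevNanos) / 1e9
	return cpuSeconds / intervalSeconds
}

func main() {
	// Hypothetical samples: 80 s of CPU time accumulated in a 10 s window.
	prev := uint64(5_000_000_000)
	curr := uint64(85_000_000_000)
	fmt.Printf("%.1f cores\n", coresUsed(prev, curr, 10))
}
```

With these made-up samples the process averaged 8 cores over the window, which is the order of magnitude reported in step 3.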

Expected behavior

We expect cAdvisor to read only the cgroup files that are necessary to collect the specified metrics, and not to read any extra files.
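
The expected behavior can be sketched as a lookup that gates file reads on the enabled metric set. This is a minimal illustration, not cAdvisor's actual implementation: the `MetricKind` values, the `filesByMetric` table, and `filesToRead` are all hypothetical names, and the mapping to cgroup v1 file names is an assumption chosen for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// MetricKind is a hypothetical stand-in for a collector's metric category.
type MetricKind string

const (
	CPUUsage MetricKind = "cpu_usage"
	Memory   MetricKind = "memory"
	DiskIO   MetricKind = "disk_io"
)

// filesByMetric is an illustrative (assumed) mapping from each metric kind
// to the cgroup v1 files that would have to be read to produce it.
var filesByMetric = map[MetricKind][]string{
	CPUUsage: {"cpuacct.usage", "cpuacct.stat"},
	Memory:   {"memory.usage_in_bytes", "memory.stat"},
	DiskIO:   {"blkio.throttle.io_service_bytes"},
}

// filesToRead returns, in sorted order and without duplicates, only the
// files required by the enabled metric kinds.
func filesToRead(enabled map[MetricKind]struct{}) []string {
	seen := make(map[string]struct{})
	out := []string{}
	for kind := range enabled {
		for _, f := range filesByMetric[kind] {
			if _, dup := seen[f]; !dup {
				seen[f] = struct{}{}
				out = append(out, f)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Only CPU usage is enabled, so memory and block I/O files are skipped.
	enabled := map[MetricKind]struct{}{CPUUsage: {}}
	fmt.Println(filesToRead(enabled))
}
```

With a gate like this, the per-container cost scales with the number of enabled metrics rather than with every file the collector knows about, which matters on nodes with 100 or more pods.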

Screenshots

Environment (please complete the following information):