karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0
4.12k stars 807 forks

karmada-metrics-adapter: reduce memory usage #4796

Closed chaunceyjiang closed 1 month ago

chaunceyjiang commented 1 month ago

When a member cluster runs a large number of pods, karmada-metrics-adapter consumes a lot of memory, because it caches the complete object for every pod in the cluster. Most of that information is never used, so this PR trims the cached objects to reduce memory usage.

What type of PR is this? /kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes: Fixes #

Special notes for your reviewer:

The core idea of this PR is to use TransformFunc to trim pod and node objects before they are cached. Currently only fields such as name, namespace, and labels are retained; the other fields are not needed, so they are not cached.
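To illustrate the trimming idea described above, here is a minimal, standard-library-only sketch. It is not the code from this PR: `Pod` is a simplified, hypothetical stand-in for the real pod type, `trimPod` is an illustrative name, and the local `TransformFunc` type only mirrors the shape of client-go's `cache.TransformFunc` (`func(interface{}) (interface{}, error)`). The set of retained fields (name, namespace, labels) is an assumption based on the description above.

```go
package main

import "fmt"

// TransformFunc mirrors the shape of client-go's cache.TransformFunc:
// it receives an object before it is stored in the informer cache and
// returns the object that should be stored instead.
type TransformFunc func(obj interface{}) (interface{}, error)

// Pod is a simplified, hypothetical stand-in for a full pod object.
type Pod struct {
	Name        string
	Namespace   string
	Labels      map[string]string
	Annotations map[string]string
	Spec        string // placeholder for the (large) pod spec
	Status      string // placeholder for the (large) pod status
}

// trimPod keeps only the fields needed to look a pod up (assumed here to be
// name, namespace, and labels) and drops everything else.
func trimPod(obj interface{}) (interface{}, error) {
	pod, ok := obj.(*Pod)
	if !ok {
		// Pass objects of other types through unchanged.
		return obj, nil
	}
	return &Pod{
		Name:      pod.Name,
		Namespace: pod.Namespace,
		Labels:    pod.Labels,
	}, nil
}

func main() {
	var transform TransformFunc = trimPod

	full := &Pod{
		Name:        "web-0",
		Namespace:   "default",
		Labels:      map[string]string{"app": "web"},
		Annotations: map[string]string{"example.io/note": "large annotation"},
		Spec:        "large spec payload",
		Status:      "large status payload",
	}

	stored, err := transform(full)
	if err != nil {
		panic(err)
	}
	// Only the trimmed copy would be kept in the cache.
	fmt.Printf("%+v\n", stored)
}
```

With a real informer, a function of this shape would be registered before the informer starts, so that every object is trimmed on its way into the store and the full object is never retained.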

Does this PR introduce a user-facing change?:

karmada-metrics-adapter: reduce memory usage

codecov-commenter commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 51.76%. Comparing base (57c1989) to head (1186588). Report is 20 commits behind head on master.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #4796      +/-   ##
==========================================
- Coverage   51.78%   51.76%   -0.02%
==========================================
  Files         250      250
  Lines       24989    24980       -9
==========================================
- Hits        12940    12931       -9
+ Misses      11340    11339       -1
- Partials      709      710       +1
```

| [Flag](https://app.codecov.io/gh/karmada-io/karmada/pull/4796/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/karmada-io/karmada/pull/4796/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io) | `51.76% <ø> (-0.02%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=karmada-io#carryforward-flags-in-the-pull-request-comment) to find out more.

chaosi-zju commented 1 month ago

We tested this in the following environment: 2 member clusters, each with 500 nodes, and 10,000 pods in total.

Before this PR: karmada-metrics-adapter used at least 1.3 GB of memory (and sometimes more). The overhead comes mainly from the line `_ = sci.Lister(provider.PodsGVR)`.

After this PR: it uses only about 300 MB.

This is a significant performance optimization! ٩(๑❛ᴗ❛๑)۶

chaosi-zju commented 1 month ago

Could you now add more explanation to the PR description, summarizing how this PR reduces memory usage?

You only said that you trimmed some of the information. Maybe you could explain why typedmanager is better than genericmanager, or summarize it in other terms.

chaosi-zju commented 1 month ago

/LGTM

XiShanYongYe-Chang commented 1 month ago

It's probably more of a feature. /kind feature /lgtm

karmada-bot commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

Needs approval from an approver in each of these files:
- ~~[pkg/metricsadapter/OWNERS](https://github.com/karmada-io/karmada/blob/master/pkg/metricsadapter/OWNERS)~~ [RainbowMango]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment