karmada-io / karmada

Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
https://karmada.io
Apache License 2.0

karmada apiserver memory upper limit affects federation cluster size #3562

Open xigang opened 1 year ago

xigang commented 1 year ago

What do you think about this question?:

I have a question. Karmada will locally cache resource data defined by ResourceRegistry, such as pods, nodes, workloads, etc. If a federated cluster manages 100 member clusters, each with 5k nodes, will a single Karmada apiserver have a memory bottleneck?
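A rough way to reason about the question above is a back-of-envelope estimate. The sketch below is only illustrative: `estimateCacheBytes` is a hypothetical helper, and the per-object size is an assumed placeholder, not a measured Karmada figure. Real usage also depends on indexes, watch buffers, and serialization overhead.

```go
package main

import "fmt"

// estimateCacheBytes returns a rough lower bound on the memory needed to hold
// one cached copy of every object of a single kind across all member clusters.
// bytesPerObject is an assumed average in-memory size, not a measured value.
func estimateCacheBytes(clusters, objectsPerCluster, bytesPerObject int64) int64 {
	return clusters * objectsPerCluster * bytesPerObject
}

func main() {
	const (
		clusters        = 100       // member clusters, from the question
		nodesPerCluster = 5000      // nodes per cluster, from the question
		assumedNodeSize = 20 * 1024 // hypothetical average bytes per cached Node
	)
	total := estimateCacheBytes(clusters, nodesPerCluster, assumedNodeSize)
	fmt.Printf("nodes cached: %d, estimated bytes: %d (%.2f GiB)\n",
		clusters*nodesPerCluster, total, float64(total)/(1<<30))
}
```

Replacing the assumed size with a value measured in a real member cluster, and repeating the calculation for pods and other cached kinds, gives a first-order picture of where a single process would run out of memory.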

What would be needed to support larger federated clusters?

xigang commented 1 year ago

/cc @RainbowMango @XiShanYongYe-Chang @ikaven1024

ikaven1024 commented 1 year ago

We have tested it with 100 clusters and 2 million pods. FYI: https://github.com/karmada-io/karmada/issues/2518. If you have other resources in quantities as large as pods, you can run several karmada-search groups, each caching one or a few kinds of resources. Then add a new component (it could be called karmada-search-gateway) that redirects each client request to the target group according to the requested resource kind. The architecture is shown below:

[image: architecture diagram]

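The routing step of the proposed gateway could look roughly like the sketch below. It is not an existing Karmada component: `resourceFromPath`, `pickGroup`, and the backend addresses are hypothetical, and it assumes plain Kubernetes-style request paths (any proxy prefix would have to be stripped first).

```go
package main

import (
	"fmt"
	"strings"
)

// resourceFromPath extracts the resource name (e.g. "pods") from a
// Kubernetes-style API path such as /api/v1/namespaces/default/pods/nginx
// or /apis/apps/v1/deployments. It returns "" if it cannot tell.
func resourceFromPath(path string) string {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	var rest []string
	switch {
	case len(parts) >= 2 && parts[0] == "api": // core group: /api/<version>/...
		rest = parts[2:]
	case len(parts) >= 3 && parts[0] == "apis": // named group: /apis/<group>/<version>/...
		rest = parts[3:]
	default:
		return ""
	}
	// Skip the "namespaces/<name>" prefix of namespaced requests.
	if len(rest) >= 3 && rest[0] == "namespaces" {
		rest = rest[2:]
	}
	if len(rest) == 0 {
		return ""
	}
	return rest[0]
}

// pickGroup returns the backend address of the search group that caches the
// requested resource, or false if no group is registered for it.
func pickGroup(routes map[string]string, path string) (string, bool) {
	backend, ok := routes[resourceFromPath(path)]
	return backend, ok
}

func main() {
	// Hypothetical group addresses, one entry per cached resource kind.
	routes := map[string]string{
		"pods":     "http://search-group-pods:8080",
		"nodes":    "http://search-group-nodes-svcs:8080",
		"services": "http://search-group-nodes-svcs:8080",
	}
	for _, p := range []string{
		"/api/v1/namespaces/default/pods/nginx",
		"/api/v1/nodes",
		"/apis/apps/v1/deployments",
	} {
		backend, ok := pickGroup(routes, p)
		fmt.Println(p, "->", backend, ok)
	}
}
```

A real gateway would forward the request to the chosen backend (for example with a reverse proxy) and decide what to do with resources that no group caches; both are left out here.
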
XiShanYongYe-Chang commented 1 year ago

you can run several karmada-search groups, each caching one or a few kinds of resources.

That sounds great, but how would we achieve that?

ikaven1024 commented 1 year ago

We could add a field such as group to the RR (ResourceRegistry). karmada-search-gateway and each karmada-search group would then watch these RRs. E.g.:

# NOTE: spec.group is a proposed field, not part of the current ResourceRegistry API.
apiVersion: search.karmada.io/v1alpha1
kind: ResourceRegistry
metadata:
  name: rr-pods  # object names cannot contain underscores
spec:
  group:
    pods:
  targetCluster: {}
---
apiVersion: search.karmada.io/v1alpha1
kind: ResourceRegistry
metadata:
  name: rr-nodes-svcs
spec:
  group:
    nodes_svcs:
  targetCluster: {}
---
apiVersion: search.karmada.io/v1alpha1
kind: ResourceRegistry
metadata:
  name: rr-foos
spec:
  group:
    foo:
  targetCluster: {}

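
One way the gateway might turn such RRs into a routing table is sketched below. The `registry` struct is a simplified, hypothetical stand-in for a ResourceRegistry carrying the proposed group field, not the real Karmada API type, and the resource lists are assumptions inferred from the group names above.

```go
package main

import (
	"fmt"
	"sort"
)

// registry is a simplified, hypothetical stand-in for a ResourceRegistry that
// carries the proposed group field; it is not the real Karmada API type.
type registry struct {
	Group     string   // proposed field: which search group caches these resources
	Resources []string // resource names selected by this registry (assumed)
}

// buildRoutes maps each resource to the group that caches it and rejects
// a resource that is claimed by two different groups.
func buildRoutes(rrs []registry) (map[string]string, error) {
	routes := make(map[string]string)
	for _, rr := range rrs {
		for _, res := range rr.Resources {
			if g, ok := routes[res]; ok && g != rr.Group {
				return nil, fmt.Errorf("resource %q assigned to both %q and %q", res, g, rr.Group)
			}
			routes[res] = rr.Group
		}
	}
	return routes, nil
}

func main() {
	rrs := []registry{
		{Group: "pods", Resources: []string{"pods"}},
		{Group: "nodes_svcs", Resources: []string{"nodes", "services"}},
		{Group: "foo", Resources: []string{"foos"}},
	}
	routes, err := buildRoutes(rrs)
	if err != nil {
		panic(err)
	}
	// Print in a stable order, since Go map iteration order is random.
	keys := make([]string, 0, len(routes))
	for k := range routes {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s -> group %s\n", k, routes[k])
	}
}
```

Rejecting a resource claimed by two groups keeps routing unambiguous; whether overlapping groups should be allowed is a design decision the thread leaves open.
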
XiShanYongYe-Chang commented 1 year ago

Sounds like a good way to go.

xigang commented 1 year ago

@ikaven1024 As I understand it, a single karmada-search group still runs within the memory of one machine, so a memory bottleneck remains, for example when there is a very large number of pods.

ikaven1024 commented 1 year ago

When a search group reaches its bottleneck, client informers face the same problem. Our federation scale must have an upper bound, but we have not tested it yet.

xigang commented 1 year ago

When a search group reaches its bottleneck, client informers face the same problem. Our federation scale must have an upper bound, but we have not tested it yet.

Thanks for the reply, @ikaven1024.