kubernetes / website

Kubernetes website and documentation repo:
https://kubernetes.io
Creative Commons Attribution 4.0 International

Improve explanation about number of Pods per node #42373

Open sftim opened 1 year ago

sftim commented 1 year ago

This is a Feature Request

What would you like to be added: Explain the limits on how many Pods can fit on one node. Also update https://kubernetes.io/docs/setup/best-practices/cluster-large/ to reference that explanation.

Why is this needed: Per https://github.com/kubernetes/kubernetes/issues/119391, it's not really clear how many Pods you should have on a node, nor how large that number can be.

/language en

Comments

reylejano commented 1 year ago

/triage accepted

reylejano commented 1 year ago

reference issues:

Ritikaa96 commented 1 year ago

Explain the limits on how many Pods can fit on one node. Also update https://kubernetes.io/docs/setup/best-practices/cluster-large/ to reference that explanation.

Hi @sftim, per the official docs the maximum number of Pods per node is 110. From my investigation, there have been several discussions about increasing that limit. Are you suggesting we add an explanation of why this number is recommended?

sftim commented 1 year ago

Per https://github.com/kubernetes/kubernetes/issues/119391, it's not really clear how many Pods you should have on a node, nor how large that number can be.

We should make both of those details clear. When working on this, we should be aware that kube-proxy is optional and production clusters do omit it.

per official docs max no of pod/node is 110

We can't rely on the documentation here; we're checking and potentially revising it, so we can't also rely on it.

Ritikaa96 commented 1 year ago

We can't rely on the documentation here; we're checking and potentially revising it, so we can't also rely on it.

I agree with this. How about adding :

More specifically, Kubernetes is designed to accommodate configurations that meet all of the following recommended criteria:

  • No more than 110 pods per node
  • No more than 5,000 nodes
  • No more than 150,000 total pods
  • No more than 300,000 total containers
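
As a rough illustration, the four criteria above can be checked mechanically. This is only a sketch: `ClusterShape` and `violations` are hypothetical names made up for this example, not part of any Kubernetes API, and the thresholds are copied from the list above as recommendations rather than enforced limits.

```python
from dataclasses import dataclass

# Thresholds copied from the recommended criteria listed above.
MAX_PODS_PER_NODE = 110
MAX_NODES = 5_000
MAX_TOTAL_PODS = 150_000
MAX_TOTAL_CONTAINERS = 300_000


@dataclass
class ClusterShape:
    """Hypothetical summary of a cluster; not a Kubernetes API type."""
    nodes: int
    max_pods_on_any_node: int
    total_pods: int
    total_containers: int


def violations(shape: ClusterShape) -> list[str]:
    """Return the names of the criteria that the given shape exceeds."""
    checks = [
        ("pods per node", shape.max_pods_on_any_node, MAX_PODS_PER_NODE),
        ("nodes", shape.nodes, MAX_NODES),
        ("total pods", shape.total_pods, MAX_TOTAL_PODS),
        ("total containers", shape.total_containers, MAX_TOTAL_CONTAINERS),
    ]
    return [name for name, value, limit in checks if value > limit]
```

An empty result means the shape stays within all four recommendations; a non-empty result only says which recommendation is exceeded, not that the cluster will fail.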

NOTE: The limit of 110 Pods per node is a recommendation, and it can be changed and increased. However, if you increase the number of Pods per node above 110, you may face the following problems:

  • "NodeNotReady" and "PLEG is not healthy" errors, where PLEG stands for Pod Lifecycle Event Generator. These can occur when cluster nodes have a very high load average.
  • kube-proxy, if configured, can become problematic because of the maximum number of IP addresses in the range, since each Pod on the node should have a unique IP address. However, this can be addressed by using a CNI plugin with eBPF, such as Cilium.

To learn more about the Kubernetes plan to increase the Pods-per-node limit, refer to this issue.

Ritikaa96 commented 1 year ago

I would like to see a flexible statement rather than "no more than...", since that wording suggests to users that k8s will not accommodate any configuration beyond the limit at all. WDYT about the above NOTE? It definitely needs some improvement or a change in position.

dElogics commented 3 months ago

To me it seems this restriction is primarily imposed by k8s sponsors so the VMs do not run out of business. Because when they do, they will have an impact on the cloud.

dbowling commented 2 months ago

I would suggest that any updated documentation also clarify whether this limit includes Pods in Completed status.

https://discuss.kubernetes.io/t/maximum-number-of-pods-per-node/15438/4

tengqm commented 1 month ago

There is simply no easy answer to this question. You have to consider a lot of factors when determining the upper limit of pod numbers per node. For example, the scale of your cluster (number of nodes versus the number of control plane instances), the capacity of worker nodes, the network bandwidth for syncing and resyncing pod status, the desired performance of pod probes, the per-node CIDR you pre-allocated, the resource requirements (including compute and non-compute resources) from typical workloads ...
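
A few of these factors can at least be turned into back-of-the-envelope arithmetic. The sketch below is an illustration under stated assumptions, not an official formula: `estimate_pod_ceiling` and all of its parameters are hypothetical, it assumes every Pod takes one address from the node's pre-allocated CIDR, and it ignores the control plane, network bandwidth, and probe performance factors entirely.

```python
import ipaddress


def estimate_pod_ceiling(
    pod_cidr: str,
    allocatable_cpu_millicores: int,
    allocatable_memory_mib: int,
    typical_cpu_request_millicores: int,
    typical_memory_request_mib: int,
    configured_max_pods: int = 110,  # the figure discussed in this thread
) -> int:
    """Rough upper bound on Pods per node from three of the factors above.

    Assumption: each Pod consumes one IP from the per-node CIDR, and no
    addresses are reserved. Real clusters may differ on both points.
    """
    # Addresses available in the per-node CIDR (a /24 contains 256).
    ip_capacity = ipaddress.ip_network(pod_cidr).num_addresses
    # How many typical workloads fit by CPU and by memory requests.
    cpu_capacity = allocatable_cpu_millicores // typical_cpu_request_millicores
    memory_capacity = allocatable_memory_mib // typical_memory_request_mib
    # The tightest constraint wins.
    return min(ip_capacity, cpu_capacity, memory_capacity, configured_max_pods)
```

For example, with a /24 per-node CIDR, 16000 millicores and 65536 MiB allocatable, and typical requests of 250 millicores and 512 MiB, CPU is the tightest constraint and the estimate is 64 Pods, well below 110.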

dbowling commented 1 month ago

@tengqm that's actually really helpful. Something I've found challenging with this issue is that I (personally) don't know what factors are relevant to this particular question. I would suggest that even the limited details in your comment may be relevant to the documentation, as it seems like a lot of us are wondering where this number came from and whether we need to provision more capacity for our cluster. As-is, I wasn't able to find anything that sent me in the right direction, let alone an actual answer. I know that I would have loved to have had even just "for your consideration" type notes.