Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

Change flow of using Azure Virtual Network Subnet in the Azure Kubernetes Service #4035

Open mdanylyuk opened 11 months ago

mdanylyuk commented 11 months ago

Is your feature request related to a problem? Please describe.

Hello guys,

I have clusters that use "Azure CNI networking for dynamic allocation of IPs" with 2 node pools, each linked to a separate subnet range that is dedicated to pods only. If I set the pod count to 32, I lose 6 IPs, because the system pods (azure-cns, azure-npm, cloud-node-manager, csi-azuredisk-node, csi-azurefile-node, kube-proxy) count toward the pod limit but use the node's IP. So I can allocate only 26 pod IPs on each node.

For example, I have a /24 subnet for pods on the second node pool. Azure allows me to use only 251 IPs from this subnet. With a batch size of 16, I can use only 240 of them (15*16=240); after that, new pods cannot get an IP and fail to start. So by default, I lose 11 IPs in this case.

In summary, I can actually use only 198 IPs: 7 nodes each block 32 IPs, of which only 26 can be used, and 1 more node can block only 16 IPs (7*26+16=198).

The lost IPs cannot be used at all, not even for cluster scaling or upgrading.

You can do similar calculations for any combination of subnet size and maximum pods.
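A minimal sketch of the arithmetic above, for anyone who wants to try other combinations. It is a simplified model based only on the numbers in this issue, not on the actual azure-cns logic: it assumes IPs are handed out in whole batches, that each node blocks enough batches to cover its max pod count, and that 6 host-network system pods per node do not consume pod IPs. The function name and parameters are hypothetical.

```python
import math


def usable_pod_ips(usable_subnet_ips, max_pods, batch_size=16, host_network_pods=6):
    """Estimate usable pod IPs under a simplified batch-allocation model.

    Assumptions (taken from the issue text, not from azure-cns source):
      - the subnet is consumed in whole batches of `batch_size` IPs;
      - a node blocks ceil(max_pods / batch_size) batches;
      - `host_network_pods` system pods per node use the node's IP.
    """
    # Whole batches that fit into the subnet; the remainder is stranded.
    total_batches = usable_subnet_ips // batch_size
    allocatable = total_batches * batch_size
    stranded = usable_subnet_ips - allocatable

    # Batches blocked by one fully provisioned node.
    batches_per_node = math.ceil(max_pods / batch_size)
    # Pod IPs a full node can really use.
    pods_per_full_node = max_pods - host_network_pods

    full_nodes = total_batches // batches_per_node
    leftover_batches = total_batches % batches_per_node
    # One extra node may receive the leftover batches.
    partial_node_pods = min(leftover_batches * batch_size, pods_per_full_node)

    return {
        "allocatable": allocatable,
        "stranded": stranded,
        "full_nodes": full_nodes,
        "usable": full_nodes * pods_per_full_node + partial_node_pods,
    }


print(usable_pod_ips(251, 32))
# {'allocatable': 240, 'stranded': 11, 'full_nodes': 7, 'usable': 198}
```

For the /24 example this reproduces the 240 allocatable, 11 stranded, and 198 usable IPs described above. Other values of `max_pods` or `batch_size` can be passed in, but the results are only as good as the simplified assumptions and may not match real cluster behavior.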

Describe the solution you'd like

Can we change the IP request batch size from 16 to 4? Also, change the minimum free IP count.

The best approach would be to allocate IPs one by one, without batch requests and without locking IPs to the nodes.

Additional context

Small PoC with calculations of different combinations

Network /24, 251 IPs:

pods count 32: max pods 198, max nodes 7 (also 14 service pods for defender + 1 service pod for metric service)
pods count 38: max pods 208, max nodes 7 (also 14 service pods for defender + 1 service pod for metric service)
pods count 16: max pods 140, max nodes 14 (also 28 service pods for defender + 2 service pods for metric service)
pods count 22: max pods 176, max nodes 11 (also 22 service pods for defender + 2 service pods for metric service)

Attached are the logs of the azure-cns pod from a node that cannot get its initial batch of IPs: azure-cns-6h9s4.log

mdanylyuk commented 10 months ago

Hello guys, Is there any news, any ideas, what to do with it?

mdanylyuk commented 9 months ago

Hello guys, Do you have any information on how to solve this problem or any information about plans to update the configuration from your side?

mdanylyuk commented 9 months ago

Hello guys, Is there any information, any news, any plans?

mdanylyuk commented 7 months ago

Hi guys, Any news?

mdanylyuk commented 6 months ago

Hello, Any news/plans?

mdanylyuk commented 4 months ago

Hey guys, Any news, any changes?

mdanylyuk commented 2 months ago

Guys, Any news, any plans?