What happened: CNI attachment is very slow when deploying at large scale, e.g. a Deployment of 2,400 Pods across 300 Nodes.
I configured 8 Pods per node, and each Pod requests 1 GPU and 5 SR-IOV VFs.
What you expected to happen: CNIs are attached in a reasonable time, without errors.
How to reproduce it (as minimally and precisely as possible): Deploy a Deployment or DaemonSet whose Pods request SR-IOV VFs at this scale; a sketch of the workload shape is shown below.
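A minimal sketch of the kind of manifest involved, under stated assumptions: the image name and the VF resource names are placeholders (the real resource names depend on the SriovNetworkNodePolicy), but the shape matches the description above (1 GPU and 5 IB VFs per Pod, 8 Pods per node):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mpi-test
spec:
  replicas: 2400                      # 8 Pods x 300 Nodes
  selector:
    matchLabels:
      app: mpi-test
  template:
    metadata:
      labels:
        app: mpi-test
      annotations:
        # multus attaches net1..net5 from these NADs
        k8s.v1.cni.cncf.io/networks: sriov-gpu-ib0,sriov-gpu-ib1,sriov-gpu-ib2,sriov-gpu-ib3,sriov-gpu-ib4
    spec:
      containers:
      - name: mpi-test
        image: mpi-test:latest        # placeholder image
        resources:
          limits:
            nvidia.com/gpu: "1"
            # one VF per IB network; these resource names are assumptions
            nvidia.com/sriov_ib0: "1"
            nvidia.com/sriov_ib1: "1"
            nvidia.com/sriov_ib2: "1"
            nvidia.com/sriov_ib3: "1"
            nvidia.com/sriov_ib4: "1"
```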
Anything else we need to know?:
I'm using sriov-network-operator v1.2.0 (by the way, I upgraded ib-sriov-cni from v1.0.2 to v1.0.3 to fix an error), multus-cni v3.8 (deployed via kubespray), and whereabouts v0.7, in order to use InfiniBand VFs in Kubernetes.
Here's the Events section from describing one of the Pods where the error occurs:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 48m default-scheduler Successfully assigned default/mpi-test-d55775bb6-ccc5r to srh100-570
Normal AddedInterface 47m multus Add eth0 [10.11.244.50/32] from k8s-pod-network
Normal AddedInterface 47m multus Add net1 [192.168.192.96/20] from default/sriov-gpu-ib0
Warning FailedCreatePodSandBox 43m (x5 over 43m) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to reserve sandbox name "mpi-test-d55775bb6-ccc5r_default_148c0437-446b-4a18-98ee-0f140509ed75_0": name "mpi-test-d55775bb6-ccc5r_default_148c0437-446b-4a18-98ee-0f140509ed75_0" is reserved for "a4861aef5ea2610c7701b895918a10d823195ec858daece333f9654523d58142"
Normal AddedInterface 41m multus Add eth0 [10.11.244.55/32] from k8s-pod-network
Normal AddedInterface 37m multus Add eth0 [10.11.244.56/32] from k8s-pod-network
Normal AddedInterface 33m multus Add net1 [192.168.193.101/20] from default/sriov-gpu-ib0
Warning FailedCreatePodSandBox 33m (x3 over 43m) kubelet Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Warning FailedCreatePodSandBox 32m (x5 over 32m) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to reserve sandbox name "mpi-test-d55775bb6-ccc5r_default_148c0437-446b-4a18-98ee-0f140509ed75_1": name "mpi-test-d55775bb6-ccc5r_default_148c0437-446b-4a18-98ee-0f140509ed75_1" is reserved for "41c1b5157dec676c45f030ee123da5bbd9f26a1660efac5e6d2b89516d0baee3"
Normal SandboxChanged 29m (x9 over 42m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal AddedInterface 25m multus Add eth0 [10.11.244.49/32] from k8s-pod-network
Normal AddedInterface 21m multus Add net1 [192.168.194.14/20] from default/sriov-gpu-ib0
Normal AddedInterface 17m multus Add net2 [192.168.211.217/20] from default/sriov-gpu-ib1
Warning FailedCreatePodSandBox 15m kubelet Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded
Warning FailedCreatePodSandBox 14m (x5 over 15m) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to reserve sandbox name "mpi-test-d55775bb6-ccc5r_default_148c0437-446b-4a18-98ee-0f140509ed75_2": name "mpi-test-d55775bb6-ccc5r_default_148c0437-446b-4a18-98ee-0f140509ed75_2" is reserved for "ed53bee3044f76c889eaf8054222818fa31508ca52ed37292fdb5873d4afec50"
Normal SandboxChanged 9m14s (x8 over 29m) kubelet Pod sandbox changed, it will be killed and re-created.
Normal AddedInterface 4m37s multus Add eth0 [10.11.244.55/32] from k8s-pod-network
Normal AddedInterface 2m47s multus Add net1 [192.168.200.21/20] from default/sriov-gpu-ib0
Normal AddedInterface 112s multus Add net2 [192.168.212.133/20] from default/sriov-gpu-ib1
Normal AddedInterface 74s multus Add net3 [192.168.226.3/20] from default/sriov-gpu-ib2
Normal AddedInterface 68s multus Add net4 [192.168.240.214/20] from default/sriov-gpu-ib3
Normal AddedInterface 58s multus Add net5 [192.169.0.105/20] from default/sriov-gpu-ib4
When I deployed, I saw that adding each interface via multus took a very long time on the first attempt, around 4 minutes per CNI.
It then seemed to hit a timeout, so kubelet killed and re-created the sandbox, multus retried the CNI attachments, and this cycle repeated several times.
On the last attempt all CNIs were attached successfully, but by that point I had already deleted the Deployment, so almost all Pods were in the Terminating phase.
And here's the network-attachment-definition I use (there are several such resources, one per IB interface); a representative sketch is shown below.
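This is a representative sketch, not the exact manifest: the whereabouts range for sriov-gpu-ib0 is inferred from the AddedInterface events above, while the resourceName annotation and the remaining fields are assumptions about a typical sriov-network-operator-generated NAD:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-gpu-ib0
  namespace: default
  annotations:
    # assumed resource name; the real one comes from the SriovNetworkNodePolicy
    k8s.v1.cni.cncf.io/resourceName: nvidia.com/sriov_ib0
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "name": "sriov-gpu-ib0",
      "type": "ib-sriov",
      "ipam": {
        "type": "whereabouts",
        "range": "192.168.192.0/20"
      }
    }
```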
Is there any idea why multus took so long to attach CNIs to the Pods?

Thanks.
Environment: