What help do you need?
I'm using Calico with Spiderpool to provide fixed IPs for StatefulSet applications. Since there's no need for underlay network connectivity, I've configured the pod CIDR as an overlay network segment, 100.64.0.0/10. The SpiderIPPool `subnet` is also set to 100.64.0.0/10, with `ips` spanning the entire range from 100.64.0.1 to 100.127.255.254.

The issue is that after the Pods are created, they stay in the ContainerCreating state for a long time while waiting for the CNI to allocate IPs, sometimes for over a minute. The Spiderpool agent logs suggest that the requests are queuing up. I captured CPU profiles with pprof and noticed that most of the time is spent in ipdiffset. Suspecting that the large network segment was the cause, I narrowed the pool's `ips` from the /10 range to a /20 range, and allocation became significantly faster.

Only about 10 Pods were created, so concurrency was quite low. Please help identify the reason for the slow allocation. Thank you.
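To put the two configurations in perspective, here is a minimal Go sketch that only counts how many addresses each `ips` range covers. It does not call or reproduce Spiderpool's code; `ipv4ToUint32` and `rangeSize` are hypothetical helpers written for this illustration, and the /20 range 100.64.0.1 to 100.64.15.254 is an assumed example, since the exact /20 used is not stated above.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
)

// ipv4ToUint32 converts a dotted-quad IPv4 string to its 32-bit integer form.
// Hypothetical helper for this illustration, not part of Spiderpool.
func ipv4ToUint32(s string) (uint32, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return 0, err
	}
	if !addr.Is4() {
		return 0, fmt.Errorf("%s is not an IPv4 address", s)
	}
	b := addr.As4()
	return binary.BigEndian.Uint32(b[:]), nil
}

// rangeSize returns the number of addresses in the inclusive range [first, last].
func rangeSize(first, last string) (uint64, error) {
	lo, err := ipv4ToUint32(first)
	if err != nil {
		return 0, err
	}
	hi, err := ipv4ToUint32(last)
	if err != nil {
		return 0, err
	}
	if hi < lo {
		return 0, fmt.Errorf("range end %s is before range start %s", last, first)
	}
	return uint64(hi-lo) + 1, nil
}

func main() {
	// The /10 range from the issue.
	large, err := rangeSize("100.64.0.1", "100.127.255.254")
	if err != nil {
		panic(err)
	}
	// An assumed /20 range; the actual one used is not given in the issue.
	small, err := rangeSize("100.64.0.1", "100.64.15.254")
	if err != nil {
		panic(err)
	}
	fmt.Println(large) // 4194302
	fmt.Println(small) // 4094
}
```

If the allocation path expands the whole `ips` range into individual addresses before computing a difference (an assumption based only on the ipdiffset hotspot in the profile), the /10 pool would mean roughly a thousand times more entries to process per request than the /20 pool.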
Here is the IPAM duration with the /10 subnet:![image](https://github.com/spidernet-io/spiderpool/assets/1765402/c7ecce95-ebc8-41b6-9451-81f57e21641a)
Here is the IPAM duration with the /20 subnet:![image](https://github.com/spidernet-io/spiderpool/assets/1765402/31209894-8adb-46c7-b756-dc4559b22593)
Here are the CPU performance metrics:![image](https://github.com/spidernet-io/spiderpool/assets/1765402/e6928e7c-2a4a-432b-9de1-1e87b3cda2f1)
Here are the memory performance metrics:![image](https://github.com/spidernet-io/spiderpool/assets/1765402/5a1f8c3a-100d-429b-ba1a-0fd6ae76075e)