Closed sunya-ch closed 1 year ago
The idea is to label each IPPool resource with its hostname and network name. The daemon can then list IPPools with list options instead of fetching every pool. Steps to live migrate from the previous version are
Should be fixed by ed3847b5eec787b16be70624e982ec78dd42cea8 (for pod listing at initial state), 6a63faa68989183ea6639cf46775d2e25021ce47 (for ippool listing by daemon), 56143875dbe95c0c07cad7b3895bcd073ee663bc (avoid hard error when daemon failed).
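A minimal sketch of the labeling idea, in plain Go with no Kubernetes dependencies. The `IPPool` struct here is a simplified stand-in, and the label keys `hostname` and `netname` are hypothetical, not the keys used in the actual commits. It shows how a daemon could build an equality-based label selector for its own host and network so that only matching pools are returned.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// IPPool is a simplified stand-in for the real IPPool resource (assumption).
type IPPool struct {
	Name   string
	Labels map[string]string
}

// Hypothetical label keys; the real project may use different ones.
const (
	hostLabel = "hostname"
	netLabel  = "netname"
)

// buildSelector renders an equality-based label selector string
// ("k1=v1,k2=v2"), with keys sorted so the output is deterministic.
func buildSelector(want map[string]string) string {
	keys := make([]string, 0, len(want))
	for k := range want {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+want[k])
	}
	return strings.Join(parts, ",")
}

// filterPools mimics in memory what a label selector does on the API
// server: it keeps only the pools whose labels match every wanted pair.
func filterPools(pools []IPPool, want map[string]string) []IPPool {
	var out []IPPool
	for _, p := range pools {
		match := true
		for k, v := range want {
			if p.Labels[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pools := []IPPool{
		{Name: "pool-a-1", Labels: map[string]string{hostLabel: "node-1", netLabel: "net-a"}},
		{Name: "pool-a-2", Labels: map[string]string{hostLabel: "node-2", netLabel: "net-a"}},
		{Name: "pool-b-1", Labels: map[string]string{hostLabel: "node-1", netLabel: "net-b"}},
	}
	want := map[string]string{hostLabel: "node-1", netLabel: "net-a"}

	fmt.Println("selector:", buildSelector(want))
	for _, p := range filterPools(pools, want) {
		fmt.Println("matched:", p.Name)
	}
}
```

With a real Kubernetes client, a selector string of this form would typically be passed in the `LabelSelector` field of `metav1.ListOptions`, so the filtering happens on the API server rather than in the daemon.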
Describe the bug
Because the number of pods and IPPools can be very large at scale, the CNI components (controller and daemon) hang for extended periods when calling the List API.
For example,
https://github.com/foundation-model-stack/multi-nic-cni/blob/0879ec42963726ab10214f532ca5c30c787a30f4/controllers/cidr_handler.go#L703
https://github.com/foundation-model-stack/multi-nic-cni/blob/0879ec42963726ab10214f532ca5c30c787a30f4/daemon/src/backend/ippool.go#L76-L87
To Reproduce
Expected behavior
Screenshots
log of failed multi-nicd pod:
Environment (please complete the following information):
Additional context