When creating hundreds of pods that use a secondary network through ovs-cni, some pods fail with the error "The OF port xxxxx state is not up".
CmdDel is then invoked to delete the pod, but it fails with "Error deallocating IP: Did not find reserved IP for container". This error puts sandbox deletion into an endless loop: CmdDel is invoked repeatedly and the kubelet log fills up with these errors. A fix is needed to avoid this loop.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 16m default-scheduler Successfully assigned bat-t1/cnf-complex-t1-1-net-947d44f4b-92d54 to pool08-n108-wk08-n055
Normal AddedInterface 16m multus Add eth0 [192.168.146.114/32]
Warning FailedCreatePodSandBox 15m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83" network for pod "cnf-complex-t1-1-net-947d44f4b-92d54": networkPlugin cni failed to set up pod "cnf-complex-t1-1-net-947d44f4b-92d54_bat-t1" network: [bat-t1/cnf-complex-t1-1-net-947d44f4b-92d54:net-traf-1-1]: error adding container to network "net-traf-1-1": The OF port veth8ba8c86e state is not up, failed to clean up sandbox container "9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83" network for pod "cnf-complex-t1-1-net-947d44f4b-92d54": networkPlugin cni failed to teardown pod "cnf-complex-t1-1-net-947d44f4b-92d54_bat-t1" network: delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: Error deallocating IP: Did not find reserved IP for container 9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83 / delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: Error deallocating IP: Did not find reserved IP for container 9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83 / delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: Error deallocating IP: Did not find reserved IP for container 9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83]
Normal SandboxChanged 48s (x69 over 15m) kubelet Pod sandbox changed, it will be killed and re-created.
The configurable retry fix is in #158. I will raise another PR to address the sandbox deletion loop.