k8snetworkplumbingwg / ovs-cni

Open vSwitch CNI plugin
Apache License 2.0

pod creation fails with OF port not up error #159

Closed pperiyasamy closed 3 years ago

pperiyasamy commented 3 years ago

When creating hundreds of pods that use a secondary network through ovs-cni, some pods fail with the following error: The OF port xxxxx state is not up.

CmdDel is subsequently invoked to delete the pod, but it fails with Error deallocating IP: Did not find reserved IP for container. This error puts sandbox deletion into a dead loop: CmdDel is invoked repeatedly and the kubelet log fills up with these errors. A fix is needed to avoid this dead loop.

Events:
  Type     Reason                  Age                 From               Message
  ----     ------                  ----                ----               -------
  Normal   Scheduled               16m                 default-scheduler  Successfully assigned bat-t1/cnf-complex-t1-1-net-947d44f4b-92d54 to pool08-n108-wk08-n055
  Normal   AddedInterface          16m                 multus             Add eth0 [192.168.146.114/32]
  Warning  FailedCreatePodSandBox  15m                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83" network for pod "cnf-complex-t1-1-net-947d44f4b-92d54": networkPlugin cni failed to set up pod "cnf-complex-t1-1-net-947d44f4b-92d54_bat-t1" network: [bat-t1/cnf-complex-t1-1-net-947d44f4b-92d54:net-traf-1-1]: error adding container to network "net-traf-1-1": The OF port veth8ba8c86e state is not up, failed to clean up sandbox container "9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83" network for pod "cnf-complex-t1-1-net-947d44f4b-92d54": networkPlugin cni failed to teardown pod "cnf-complex-t1-1-net-947d44f4b-92d54_bat-t1" network: delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: Error deallocating IP: Did not find reserved IP for container 9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83 / delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: Error deallocating IP: Did not find reserved IP for container 9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83 / delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: Error deallocating IP: Did not find reserved IP for container 9e268d66cf446f1d8cc8ade7d6beb5f196488509b773de20e2d2c3b436bd6d83]
  Normal   SandboxChanged          48s (x69 over 15m)  kubelet            Pod sandbox changed, it will be killed and re-created.

The configurable retry fix is in #158; I will raise another PR to address the dead loop issue.