k8snetworkplumbingwg / ovs-cni

Open vSwitch CNI plugin
Apache License 2.0

OVS-CNI doesn't free IPs after container deleted #143

Open · dpronyaev opened this issue 3 years ago

dpronyaev commented 3 years ago

I have defined devnet with the range 10.71.11.1-10.71.11.253 in my test environment. I've been creating and deleting various deployments whose pods had IPs assigned from that range. At some point I got an error while creating a pod:

Warning FailedCreatePodSandBox 52s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "86c8de27be9761bb120710364470dcf3f13db7de6b604f3a7e42000c004a43a4" network for pod "delme-slb-7fd58cb956-b7trg": networkPlugin cni failed to set up pod "delme-slb-7fd58cb956-b7trg_default" network: Multus: [default/delme-slb-7fd58cb956-b7trg]: error adding container to network "devnet": delegateAdd: error invoking DelegateAdd - "ovs": error in getting result from AddNetwork: failed to set up IPAM plugin type "host-local": failed to allocate for range 0: no IP addresses available in range set: 10.71.11.1-10.71.11.253, failed to clean up sandbox container "86c8de27be9761bb120710364470dcf3f13db7de6b604f3a7e42000c004a43a4" network for pod "delme-slb-7fd58cb956-b7trg": networkPlugin cni failed to teardown pod "delme-slb-7fd58cb956-b7trg_default" network: delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: Failed to obtain OVS port for given connection: failed to find object from table Port / delegateDel: error invoking ConflistDel - "cbr0": conflistDel: error converting the raw bytes into a conflist: error parsing configuration list: no 'plugins' key]

In /var/lib/cni/networks/devnet I've found:

root@k8s-1:/var/lib/cni/networks/devnet# ls -alF
итого 420
drwxr-xr-x 2 root root 4096 дек 11 10:14 ./
drwxr-xr-x 4 root root   32 окт 23 10:10 ../
-rw-r--r-- 1 root root   70 дек  4 10:50 10.71.11.1
-rw-r--r-- 1 root root   70 дек  4 10:55 10.71.11.10
-rw-r--r-- 1 root root   70 ноя 17 09:43 10.71.11.109
-rw-r--r-- 1 root root   70 дек  4 10:55 10.71.11.11
-rw-r--r-- 1 root root   70 ноя 17 09:43 10.71.11.113
-rw-r--r-- 1 root root   70 дек  4 10:55 10.71.11.12
-rw-r--r-- 1 root root   70 дек  4 10:55 10.71.11.13
-rw-r--r-- 1 root root   70 дек  4 10:56 10.71.11.14
-rw-r--r-- 1 root root   70 дек  4 10:56 10.71.11.15
-rw-r--r-- 1 root root   70 дек  4 10:56 10.71.11.16
-rw-r--r-- 1 root root   70 дек  1 10:18 10.71.11.164
-rw-r--r-- 1 root root   70 дек  1 10:18 10.71.11.165
-rw-r--r-- 1 root root   70 дек  1 10:19 10.71.11.166
-rw-r--r-- 1 root root   70 дек  1 10:21 10.71.11.168
-rw-r--r-- 1 root root   70 дек  1 11:06 10.71.11.169
-rw-r--r-- 1 root root   70 дек  4 10:57 10.71.11.17
-rw-r--r-- 1 root root   70 дек  1 11:33 10.71.11.170
-rw-r--r-- 1 root root   70 дек  1 11:33 10.71.11.171
-rw-r--r-- 1 root root   70 дек  1 11:33 10.71.11.172
-rw-r--r-- 1 root root   70 дек  1 11:33 10.71.11.173
-rw-r--r-- 1 root root   70 дек  1 11:34 10.71.11.174
-rw-r--r-- 1 root root   70 дек  1 11:34 10.71.11.175
-rw-r--r-- 1 root root   70 дек  1 11:34 10.71.11.176
-rw-r--r-- 1 root root   70 дек  1 11:35 10.71.11.177
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.178
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.179
-rw-r--r-- 1 root root   70 дек  4 10:57 10.71.11.18
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.180
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.181
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.182
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.183
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.184
-rw-r--r-- 1 root root   70 дек  1 11:37 10.71.11.185
-rw-r--r-- 1 root root   70 дек  1 11:39 10.71.11.186
-rw-r--r-- 1 root root   70 дек  1 11:39 10.71.11.187
-rw-r--r-- 1 root root   70 дек  1 11:39 10.71.11.188
-rw-r--r-- 1 root root   70 дек  1 11:40 10.71.11.189
-rw-r--r-- 1 root root   70 дек  4 10:57 10.71.11.19
-rw-r--r-- 1 root root   70 дек  1 11:40 10.71.11.190
-rw-r--r-- 1 root root   70 дек  1 11:40 10.71.11.191
-rw-r--r-- 1 root root   70 дек  1 11:40 10.71.11.192
-rw-r--r-- 1 root root   70 дек  1 11:40 10.71.11.193
-rw-r--r-- 1 root root   70 дек  4 10:57 10.71.11.20
-rw-r--r-- 1 root root   70 дек  4 10:57 10.71.11.21
-rw-r--r-- 1 root root   70 дек  4 12:30 10.71.11.22
-rw-r--r-- 1 root root   70 дек  1 12:29 10.71.11.220
-rw-r--r-- 1 root root   70 дек  1 12:29 10.71.11.221
-rw-r--r-- 1 root root   70 дек  1 12:44 10.71.11.222
-rw-r--r-- 1 root root   70 дек  1 12:44 10.71.11.223
-rw-r--r-- 1 root root   70 дек  1 12:44 10.71.11.224
-rw-r--r-- 1 root root   70 дек  1 12:44 10.71.11.225
-rw-r--r-- 1 root root   70 дек  1 12:44 10.71.11.226
-rw-r--r-- 1 root root   70 дек  1 12:44 10.71.11.227
-rw-r--r-- 1 root root   70 дек  4 12:30 10.71.11.23
-rw-r--r-- 1 root root   70 дек  1 12:47 10.71.11.230
-rw-r--r-- 1 root root   70 дек  1 12:47 10.71.11.231
-rw-r--r-- 1 root root   70 дек  1 12:47 10.71.11.232
-rw-r--r-- 1 root root   70 дек  1 12:47 10.71.11.233
-rw-r--r-- 1 root root   70 дек  1 12:48 10.71.11.234
-rw-r--r-- 1 root root   70 дек  1 12:48 10.71.11.235
-rw-r--r-- 1 root root   70 дек  1 12:48 10.71.11.236
-rw-r--r-- 1 root root   70 дек  1 12:48 10.71.11.237
-rw-r--r-- 1 root root   70 дек  1 12:51 10.71.11.238
-rw-r--r-- 1 root root   70 дек  4 12:30 10.71.11.24
-rw-r--r-- 1 root root   70 дек  2 17:41 10.71.11.243
-rw-r--r-- 1 root root   70 дек  2 17:41 10.71.11.244
-rw-r--r-- 1 root root   70 дек  2 17:41 10.71.11.245
-rw-r--r-- 1 root root   70 дек  2 17:41 10.71.11.246
-rw-r--r-- 1 root root   70 дек  2 17:44 10.71.11.247
-rw-r--r-- 1 root root   70 дек  2 17:44 10.71.11.248
-rw-r--r-- 1 root root   70 дек  4 12:30 10.71.11.25
-rw-r--r-- 1 root root   70 дек  4 10:50 10.71.11.253
-rw-r--r-- 1 root root   70 дек  4 12:30 10.71.11.26
-rw-r--r-- 1 root root   70 дек  4 12:30 10.71.11.27
-rw-r--r-- 1 root root   70 дек  7 12:20 10.71.11.28
-rw-r--r-- 1 root root   70 дек  7 12:20 10.71.11.29
-rw-r--r-- 1 root root   70 дек  7 12:20 10.71.11.30
-rw-r--r-- 1 root root   70 дек  7 12:20 10.71.11.31
-rw-r--r-- 1 root root   70 дек  7 12:30 10.71.11.32
-rw-r--r-- 1 root root   70 дек  7 12:30 10.71.11.33
-rw-r--r-- 1 root root   70 дек  7 12:30 10.71.11.34
-rw-r--r-- 1 root root   70 дек  7 12:30 10.71.11.35
-rw-r--r-- 1 root root   70 дек  7 12:31 10.71.11.36
-rw-r--r-- 1 root root   70 дек  7 12:31 10.71.11.37
-rw-r--r-- 1 root root   70 дек  7 12:31 10.71.11.38
-rw-r--r-- 1 root root   70 дек  7 12:31 10.71.11.39
-rw-r--r-- 1 root root   70 дек  7 12:32 10.71.11.40
-rw-r--r-- 1 root root   70 дек  7 12:32 10.71.11.41
-rw-r--r-- 1 root root   70 дек  7 13:15 10.71.11.42
-rw-r--r-- 1 root root   70 дек  7 13:15 10.71.11.43
-rw-r--r-- 1 root root   70 дек  7 13:15 10.71.11.44
-rw-r--r-- 1 root root   70 дек  7 13:15 10.71.11.45
-rw-r--r-- 1 root root   70 дек 11 10:14 10.71.11.46
-rw-r--r-- 1 root root   70 дек 11 10:14 10.71.11.47
-rw-r--r-- 1 root root   70 дек 11 10:14 10.71.11.48
-rw-r--r-- 1 root root   70 окт 28 11:12 10.71.11.49
-rw-r--r-- 1 root root   70 дек 11 10:14 10.71.11.50
-rw-r--r-- 1 root root   70 дек 11 10:14 10.71.11.51
-rw-r--r-- 1 root root   70 дек 11 10:14 10.71.11.52
-rw-r--r-- 1 root root   70 дек  4 10:53 10.71.11.6
-rw-r--r-- 1 root root   70 дек  4 10:53 10.71.11.7
-rw-r--r-- 1 root root   70 дек  4 10:53 10.71.11.8
-rw-r--r-- 1 root root   70 дек  4 10:53 10.71.11.9
-rw-r--r-- 1 root root   11 дек 11 10:14 last_reserved_ip.0
-rwxr-x--- 1 root root    0 окт 23 10:10 lock*

So I think the IPAM plugin doesn't delete the corresponding file in /var/lib/cni/networks/ when a pod is deleted, and the address pool eventually gets exhausted.
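
For reference, host-local keeps one file per allocated IP under /var/lib/cni/networks/<network>, and the first line of each file is the ID of the container holding the reservation. A minimal sketch for spotting reservations whose container no longer exists, assuming a Docker-based runtime and that nothing else touches this directory, could look like:

cd /var/lib/cni/networks/devnet || exit 1
# IDs of all containers the runtime still knows about (running or stopped).
live_ids=$(docker ps -aq --no-trunc)
for f in 10.71.11.*; do
    cid=$(head -n 1 "$f")              # first line of the file is the container ID
    if ! grep -q "$cid" <<<"$live_ids"; then
        echo "stale reservation: $f (container $cid is gone)"
        # rm "$f"                       # uncomment only after double-checking
    fi
done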

pperiyasamy commented 3 years ago

@dpronyaev This looks like a host-local IPAM issue. Are those IP files empty? That is a known issue with this plugin, and this PR addresses it. Can you confirm?
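
A quick way to check for the empty-file symptom, assuming the state directory from the listing above, is:

# List any zero-byte IP reservation files left behind by host-local.
find /var/lib/cni/networks/devnet -maxdepth 1 -type f -size 0 -name '10.71.11.*'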

dpronyaev commented 3 years ago

> @dpronyaev This looks like a host-local IPAM issue. Are those IP files empty? That is a known issue with this plugin, and this PR addresses it. Can you confirm?

No, none of these files is empty. For example:

root@k8s-2:/var/lib/cni/networks/devnet# tail -n +1 *
==> 10.71.11.11 <==
0a1e0d027cae251b66df6d75dfbcf1db60d8962551751610272326c76c295ef9
eth0
==> 10.71.11.14 <==
900d4afadd284e38b8b60d1583c5f838466a02ba183efaedc5490e3960a10e29
eth0
==> 10.71.11.2 <==
f16eb1fdae3ca6e6c377d36f4873373cef63ff2ff78a02541013ae2441cbba84
eth0
==> 10.71.11.22 <==
58510d919c0208156806a65d57ff2cd58e7fcfba1cd2c2a26bb4648ae114bacd
eth0
==> 10.71.11.25 <==
b6a1397dc586debde7c57f40200126ec844b935859461b0027ddc408488210c8
eth0
==> 10.71.11.27 <==
d05b687de370d10529b076cfc177c65acc4acaecb4c0df1cb1d7f488024d941e
eth0
==> 10.71.11.31 <==
3df1a9c65c6d5e6d88410760a546145604556f08bc6e63931c216c5917829aa6
eth0
==> 10.71.11.35 <==
0bd2ab5d14cb00c2e53baeffc85b2937f1104a2937c5314139839ecd86fd3ebf
eth0
==> 10.71.11.45 <==
9dabe5900cb0cedc121935c10885d7ccf186e882fe210d6d0b64a9c0d9e555b7
eth0
==> 10.71.11.46 <==
522124e72e6d8263470542a9f605c24efa7b56910e44ae123cb3e859a7a08910
eth0
==> 10.71.11.47 <==
5ad3eb0cf2f8da42a850f0025740d41aa0d905def8b8c3a2256a06efcccc105a
eth0
==> 10.71.11.49 <==
dde10ec4e7a40890e7f697162ba5fab2088536a0e6008531bc21cefed4846972
eth0
==> 10.71.11.52 <==
667c0c0a9ac2ef0a6b229b8805cc40c5872bca886be4a32b3c25c1086843c6a7
eth0
==> 10.71.11.54 <==
4bb28abdc39897b13ccfc1cc1a78a7c510c51a0a5634a56aa932a6581539757f
eth0
==> 10.71.11.55 <==
4195f2aab0e8bc566b3553be7cab0c37c2faedac5b73147c4b52e46655c47470
eth0
==> 10.71.11.56 <==
e654b45d8525557dd2dc631b8cade1fa58d38a0541d46c43f131911b45c2fa31
eth0
==> 10.71.11.57 <==
9f5f9e7334730f5e06478bc0a0627fc152d969da54659f10d97a0f90a600765e
eth0
==> 10.71.11.58 <==
205472bccaff5ea4123991c3d56fc2c8ccc8e2a26664be5d778b38539ec7b5da
eth0
==> 10.71.11.59 <==
697c5257803d1ec5fbcccd5eb69a2ad98ad0226c25d6c8b74f022c96114d9b83
eth0
==> 10.71.11.60 <==
b71464a7145e3fa702d3f5a634cec3b07074bbeeb157ba17a9c8117d2fae6672
eth0
==> last_reserved_ip.0 <==
10.71.11.60
==> lock <==

pperiyasamy commented 3 years ago

@dpronyaev ah! Yes, there is a regression introduced by a previous commit. I hope you're seeing errors like the one below in the kubelet logs during pod delete:

Dec 17 09:00:36 dl380-006-ECCD-SUT kubelet[24746]: E1217 09:00:36.214057   24746 remote_runtime.go:140] StopPodSandbox "c64a9c227d1371707c2e8e9b8d78ce8850dd17a0d075e3a1561c47dc6d0cc27c" from runtime service failed: rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod "pod-1_default" network: delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: failed to connect to ovsdb socket /var/run/openvswitch/db.sock: error: Invalid socket file

Let me raise another PR to fix this.
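
To check whether the plugin can actually reach ovsdb through that socket, assuming the default path from the error above, something like this can be run on the affected node:

# Confirm the socket exists and is a socket, then query ovsdb through it explicitly.
ls -l /var/run/openvswitch/db.sock
ovs-vsctl --db=unix:/var/run/openvswitch/db.sock show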

dpronyaev commented 3 years ago

@pperiyasamy errors in the kubelet log look like this:

Dec 17 09:37:24 k8s-2 kubelet[796]: E1217 09:37:24.343894     796 pod_workers.go:191] Error syncing pod 121058c3-9dd8-4d89-94dc-882d7d92619b ("delme-centrex-2-7949cdb9bd-cd28m_default(121058c3-9dd8-4d89-94dc-882d7d92619b)"), skipping: failed to "CreatePodSandbox" for "delme-centrex-2-7949cdb9bd-cd28m_default(121058c3-9dd8-4d89-94dc-882d7d92619b)" with CreatePodSandboxError: "CreatePodSandbox for pod \"delme-centrex-2-7949cdb9bd-cd28m_default(121058c3-9dd8-4d89-94dc-882d7d92619b)\" failed: rpc error: code = Unknown desc = [failed to set up sandbox container \"72a12f853fa251556c3ac0307c1606b65fa89b0ab41f2e70e7f7e7971f35d96b\" network for pod \"delme-centrex-2-7949cdb9bd-cd28m\": networkPlugin cni failed to set up pod \"delme-centrex-2-7949cdb9bd-cd28m_default\" network: Multus: [default/delme-centrex-2-7949cdb9bd-cd28m]: error adding container to network \"devnet\": delegateAdd: error invoking DelegateAdd - \"ovs\": error in getting result from AddNetwork: failed to set up IPAM plugin type \"host-local\": failed to allocate for range 0: no IP addresses available in range set: 10.71.11.1-10.71.11.253, failed to clean up sandbox container \"72a12f853fa251556c3ac0307c1606b65fa89b0ab41f2e70e7f7e7971f35d96b\" network for pod \"delme-centrex-2-7949cdb9bd-cd28m\": networkPlugin cni failed to teardown pod \"delme-centrex-2-7949cdb9bd-cd28m_default\" network: delegateDel: error invoking DelegateDel - \"ovs\": error in getting result from DelNetwork: Failed to obtain OVS port for given connection: failed to find object from table Port / delegateDel: error invoking ConflistDel - \"cbr0\": conflistDel: error converting the raw bytes into a conflist: error parsing configuration list: no 'plugins' key]"

pperiyasamy commented 3 years ago

@dpronyaev I assume these errors occurred during pod creation and are a consequence of IP addresses not being cleaned up by previous pod delete invocations. Do you have kubelet logs from a pod deletion? Where does the ovsdb socket file exist? Is it in a location other than /var/run/openvswitch/db.sock? Can you share the NAD and/or any flat file configuration you have?
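
The information asked for above can be gathered along these lines (the NAD name is taken from this thread; journalctl assumes kubelet runs under systemd):

# Kubelet messages around a pod delete (run on the node that hosted the pod).
journalctl -u kubelet --since "10 minutes ago" | grep -E 'StopPodSandbox|DelNetwork|teardown'

# Location of the ovsdb socket.
ls -l /var/run/openvswitch/db.sock

# The NetworkAttachmentDefinition (NAD) used by the pod.
kubectl get net-attach-def devnet -n default -o yaml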

dpronyaev commented 3 years ago

Here is a fresh log of deleting a pod:

Dec 18 14:44:27 k8s-2 kubelet[796]: E1218 14:44:27.331347     796 remote_runtime.go:140] StopPodSandbox "43957e00aabee6d0a0ebcd4edc3b0d109f7689212a0d5df09d266a05dd3971b2" from runtime service failed: rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod "delme-slb-7fd58cb956-fn9zt_default" network: delegateDel: error invoking ConflistDel - "cbr0": conflistDel: error converting the raw bytes into a conflist: error parsing configuration list: no 'plugins' key
Dec 18 14:44:27 k8s-2 kubelet[796]: E1218 14:44:27.332528     796 kuberuntime_manager.go:898] Failed to stop sandbox {"docker" "43957e00aabee6d0a0ebcd4edc3b0d109f7689212a0d5df09d266a05dd3971b2"}
Dec 18 14:44:27 k8s-2 kubelet[796]: E1218 14:44:27.332942     796 kubelet_pods.go:1250] Failed killing the pod "delme-slb-7fd58cb956-fn9zt": failed to "KillPodSandbox" for "3c7337e3-c846-4ba7-a35b-e50d04ba9ee4" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"delme-slb-7fd58cb956-fn9zt_default\" network: delegateDel: error invoking ConflistDel - \"cbr0\": conflistDel: error converting the raw bytes into a conflist: error parsing configuration list: no 'plugins' key"
Dec 18 14:44:27 k8s-2 kubelet[796]: E1218 14:44:27.512128     796 remote_runtime.go:140] StopPodSandbox "127c7c34c89a21d4feb6d9fba4356b72cde0539d6c3b2d52c9af49ac2800abb6" from runtime service failed: rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod "delme-test-8548f5976c-fwj4c_default" network: delegateDel: error invoking DelegateDel - "ovs": error in getting result from DelNetwork: failed to find bridge delme / delegateDel: error invoking ConflistDel - "cbr0": conflistDel: error converting the raw bytes into a conflist: error parsing configuration list: no 'plugins' key
Dec 18 14:44:27 k8s-2 kubelet[796]: E1218 14:44:27.512184     796 kuberuntime_manager.go:898] Failed to stop sandbox {"docker" "127c7c34c89a21d4feb6d9fba4356b72cde0539d6c3b2d52c9af49ac2800abb6"}
Dec 18 14:44:27 k8s-2 kubelet[796]: E1218 14:44:27.512226     796 kubelet_pods.go:1250] Failed killing the pod "delme-test-8548f5976c-fwj4c": failed to "KillPodSandbox" for "76481c00-36de-482a-afad-d52ecd532613" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"delme-test-8548f5976c-fwj4c_default\" network: delegateDel: error invoking DelegateDel - \"ovs\": error in getting result from DelNetwork: failed to find bridge delme / delegateDel: error invoking ConflistDel - \"cbr0\": conflistDel: error converting the raw bytes into a conflist: error parsing configuration list: no 'plugins' key"
Dec 18 14:44:27 k8s-2 kubelet[796]: W1218 14:44:27.549530     796 docker_sandbox.go:402] failed to read pod IP from plugin/docker: networkPlugin cni failed on the status hook for pod "delme-test-8548f5976c-fwj4c_default": CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "127c7c34c89a21d4feb6d9fba4356b72cde0539d6c3b2d52c9af49ac2800abb6"

The ovsdb socket file exists at /var/run/openvswitch/db.sock.

Please explain what you mean by "NAD and/or any flat file configuration"? I wasn't able to google it, so I'm confused.

pperiyasamy commented 3 years ago

@dpronyaev As per the log, I think the error "error in getting result from DelNetwork: failed to find bridge delme" is what causes the IPAM cleanup not to be executed and CmdDel to return prematurely. Does the delme OVS bridge still exist? Or was it removed by some other process after the pod was created? Can you share your NAD (kubectl get net-attach-def <name> -o yaml) and pod yaml configuration?

I thought your ovsdb socket file was in a different location and its path was not configured via the flat file configuration (see this), but we can safely ignore that, as the socket file exists in the default location /var/run/openvswitch/db.sock.
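
Whether the bridge is still present at teardown time can be verified on the node with, for example:

# br-exists returns 0 if the bridge exists and 2 if it does not.
ovs-vsctl br-exists delme && echo "bridge delme exists" || echo "bridge delme is gone"
ovs-vsctl list-br    # list all bridges known to this ovsdb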

dpronyaev commented 3 years ago

@pperiyasamy at the moment of helm delete, the delme OVS bridge exists, and I delete it AFTER all containers have been deleted.

Here is the NAD of the existing bridge:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/ovs-bridge
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"k8s.cni.cncf.io/v1","kind":"NetworkAttachmentDefinition","metadata":{"annotations":{"k8s.v1.cni.cncf.io/resourceName":"ovs-cni.network.kubevirt.io/ovs-bridge"},"name":"devnet","namespace":"default"},"spec":{"config":"{ \"cniVersion\": \"0.3.1\", \"type\": \"ovs\", \"bridge\": \"ovs-bridge\", \"vlan\": 101, \"ipam\": { \"type\": \"host-local\", \"subnet\": \"10.71.8.0/22\", \"rangeStart\": \"10.71.11.1\", \"rangeEnd\": \"10.71.11.253\", \"routes\": [ { \"dst\": \"0.0.0.0/0\" } ], \"gateway\": \"10.71.11.254\" } }"}}
  creationTimestamp: "2020-10-26T07:22:41Z"
  generation: 1
  managedFields:
  - apiVersion: k8s.cni.cncf.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:k8s.v1.cni.cncf.io/resourceName: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
      f:spec:
        .: {}
        f:config: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2020-10-26T07:22:41Z"
  name: devnet
  namespace: default
  resourceVersion: "786144"
  selfLink: /apis/k8s.cni.cncf.io/v1/namespaces/default/network-attachment-definitions/devnet
  uid: 858501f8-156d-455f-b5fb-09a3d37f9a5c
spec:
  config: '{ "cniVersion": "0.3.1", "type": "ovs", "bridge": "ovs-bridge", "vlan":
    101, "ipam": { "type": "host-local", "subnet": "10.71.8.0/22", "rangeStart": "10.71.11.1",
    "rangeEnd": "10.71.11.253", "routes": [ { "dst": "0.0.0.0/0" } ], "gateway": "10.71.11.254"
    } }'

dpronyaev commented 3 years ago

Here is the pod yaml:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "devnet",
          "interface": "eth0",
          "ips": [
              "10.71.11.102"
          ],
          "mac": "02:00:00:b3:96:bf",
          "default": true,
          "dns": {}
      },{
          "name": "nason8-int",
          "interface": "eno0",
          "ips": [
              "10.42.42.251"
          ],
          "mac": "02:00:00:c9:c0:a5",
          "dns": {}
      }]
    k8s.v1.cni.cncf.io/networks: '[{ "name": "nason8-int", "interface": "eno0", "ips":
      ["10.42.42.251/24"] }]'
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
          "name": "devnet",
          "interface": "eth0",
          "ips": [
              "10.71.11.102"
          ],
          "mac": "02:00:00:b3:96:bf",
          "default": true,
          "dns": {}
      },{
          "name": "nason8-int",
          "interface": "eno0",
          "ips": [
              "10.42.42.251"
          ],
          "mac": "02:00:00:c9:c0:a5",
          "dns": {}
      }]
    v1.multus-cni.io/default-network: default/devnet
  creationTimestamp: "2020-12-18T07:56:13Z"
  generateName: nason8-test-546fbcfc6-
  labels:
    app: nason8-test
    pod-template-hash: 546fbcfc6
    sandbox: nason8
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:k8s.v1.cni.cncf.io/networks: {}
          f:v1.multus-cni.io/default-network: {}
        f:generateName: {}
        f:labels:
          .: {}
          f:app: {}
          f:pod-template-hash: {}
          f:sandbox: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"bfc5b343-242f-4c3a-8e3a-0bf88390dbb1"}:
            .: {}
            f:apiVersion: {}
            f:blockOwnerDeletion: {}
            f:controller: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
      f:spec:
        f:affinity:
          .: {}
          f:podAffinity:
            .: {}
            f:requiredDuringSchedulingIgnoredDuringExecution: {}
        f:containers:
          k:{"name":"nason8-test"}:
            .: {}
            f:args: {}
            f:command: {}
            f:image: {}
            f:imagePullPolicy: {}
            f:name: {}
            f:resources: {}
            f:securityContext:
              .: {}
              f:capabilities:
                .: {}
                f:add: {}
            f:terminationMessagePath: {}
            f:terminationMessagePolicy: {}
            f:volumeMounts:
              .: {}
              k:{"mountPath":"/root"}:
                .: {}
                f:mountPath: {}
                f:name: {}
              k:{"mountPath":"/shared"}:
                .: {}
                f:mountPath: {}
                f:name: {}
        f:dnsConfig:
          .: {}
          f:nameservers: {}
          f:options: {}
        f:dnsPolicy: {}
        f:enableServiceLinks: {}
        f:hostname: {}
        f:restartPolicy: {}
        f:schedulerName: {}
        f:securityContext: {}
        f:terminationGracePeriodSeconds: {}
        f:volumes:
          .: {}
          k:{"name":"root"}:
            .: {}
            f:name: {}
            f:persistentVolumeClaim:
              .: {}
              f:claimName: {}
          k:{"name":"shared"}:
            .: {}
            f:name: {}
            f:persistentVolumeClaim:
              .: {}
              f:claimName: {}
    manager: kube-controller-manager
    operation: Update
    time: "2020-12-18T07:56:13Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          .: {}
          k:{"type":"PodScheduled"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
    manager: kube-scheduler
    operation: Update
    time: "2020-12-18T07:56:13Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:k8s.v1.cni.cncf.io/network-status: {}
          f:k8s.v1.cni.cncf.io/networks-status: {}
    manager: multus
    operation: Update
    time: "2020-12-18T07:56:19Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
          k:{"type":"ContainersReady"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
          k:{"type":"Initialized"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
          k:{"type":"Ready"}:
            .: {}
            f:lastProbeTime: {}
            f:lastTransitionTime: {}
            f:status: {}
            f:type: {}
        f:containerStatuses: {}
        f:hostIP: {}
        f:phase: {}
        f:podIP: {}
        f:podIPs:
          .: {}
          k:{"ip":"10.71.11.102"}:
            .: {}
            f:ip: {}
        f:startTime: {}
    manager: kubelet
    operation: Update
    time: "2020-12-18T07:56:20Z"
  name: nason8-test-546fbcfc6-j6lzk
  namespace: default
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: nason8-test-546fbcfc6
    uid: bfc5b343-242f-4c3a-8e3a-0bf88390dbb1
  resourceVersion: "11868114"
  selfLink: /api/v1/namespaces/default/pods/nason8-test-546fbcfc6-j6lzk
  uid: ced94fd2-2320-4cd4-bd24-40535d89961a
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: sandbox
            operator: In
            values:
            - nason8
        topologyKey: kubernetes.io/hostname
  containers:
  - args:
    - while true; do sleep 30; done;
    command:
    - /bin/bash
    - -c
    - --
    image: docker.company/application-dev:latest
    imagePullPolicy: Always
    name: nason8-test
    resources: {}
    securityContext:
      capabilities:
        add:
        - SYS_BOOT
        - NET_ADMIN
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /root
      name: root
    - mountPath: /shared
      name: shared
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-pp2tj
      readOnly: true
  dnsConfig:
    nameservers:
    - 10.71.0.41
    - 10.71.11.254
    options:
    - name: timeout
      value: "1"
    - name: attempts
      value: "1"
  dnsPolicy: None
  enableServiceLinks: true
  hostname: nason8-test
  nodeName: k8s-2
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: root
    persistentVolumeClaim:
      claimName: inasonov
  - name: shared
    persistentVolumeClaim:
      claimName: nason8-shared
  - name: default-token-pp2tj
    secret:
      defaultMode: 420
      secretName: default-token-pp2tj
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-12-18T07:56:13Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-12-18T07:56:19Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-12-18T07:56:19Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-12-18T07:56:13Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://1d2706e4953fc64899c93d3d0109709363e1fd5e2713d3959cfa0bf82194a470
    image: docker.company/application-dev:latest
    imageID: docker-pullable://docker.company/application-dev@sha256:8fdd8241cc83c82a775e1a04e752377ce72d5adfbf916479fa7ee49c80195501
    lastState: {}
    name: nason8-test
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2020-12-18T07:56:19Z"
  hostIP: 10.0.1.36
  phase: Running
  podIP: 10.71.11.102
  podIPs:
  - ip: 10.71.11.102
  qosClass: BestEffort
  startTime: "2020-12-18T07:56:13Z"

pperiyasamy commented 3 years ago

The NAD devnet that you've listed is not used by the pod for Multus secondary networking, as the networks annotation contains this:

    k8s.v1.cni.cncf.io/networks: '[{ "name": "nason8-int", "interface": "eno0", "ips":
      ["10.42.42.251/24"] }]'

Can you paste the NAD for the nason8-int network? Is there any chance the OVS bridge was deleted before the container delete during your helm delete?
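
As an aside, which networks a pod actually got attached to can be read back from the annotations Multus writes, e.g. (pod name taken from the yaml above):

# Dump the pod annotations; k8s.v1.cni.cncf.io/network-status lists every attached network.
kubectl get pod nason8-test-546fbcfc6-j6lzk -n default -o jsonpath='{.metadata.annotations}'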

pperiyasamy commented 3 years ago

Can you check whether /etc/cni/net.d/00-multus.conf is also configured properly for the default network devnet? I ask because the error also includes: conflistDel: error converting the raw bytes into a conflist: error parsing configuration list: no 'plugins' key

Here is a sample multus config:

# cat /etc/cni/net.d/00-multus.conf
{ "cniVersion": "0.3.1", "name": "multus-cni-network", "type": "multus", "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig", "delegates": [ { "name": "k8s-pod-network", "cniVersion": "0.3.1", "plugins": [ { "type": "calico", "datastore_type": "kubernetes", "mtu": 1410, "nodename_file_optional": false, "log_file_path": "/var/log/calico/cni/cni.log", "ipam": { "type": "calico-ipam", "assign_ipv4" : "true", "assign_ipv6" : "false" }, "container_settings": { "allow_ip_forwarding": false }, "policy": { "type": "k8s" }, "kubernetes": { "kubeconfig": "/etc/cni/net.d/calico-kubeconfig" } }, {"type": "portmap", "snat": true, "capabilities": {"portMappings": true}} ] } ] }

dpronyaev commented 3 years ago

@pperiyasamy

root@k8s-1:~# kubectl get networkattachmentdefinition.k8s.cni.cncf.io nason8-int -o yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  annotations:
    k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/nason8-int
  creationTimestamp: "2020-12-18T07:54:23Z"
  generation: 1
  managedFields:
  - apiVersion: k8s.cni.cncf.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:k8s.v1.cni.cncf.io/resourceName: {}
      f:spec:
        .: {}
        f:config: {}
    manager: kubectl-create
    operation: Update
    time: "2020-12-18T07:54:23Z"
  name: nason8-int
  namespace: default
  resourceVersion: "11867680"
  selfLink: /apis/k8s.cni.cncf.io/v1/namespaces/default/network-attachment-definitions/nason8-int
  uid: 49c5fd34-9eba-4178-ae66-4045efee8c9d
spec:
  config: '{ "cniVersion": "0.3.1", "type": "ovs", "bridge": "nason8", "ipam": { "type":
    "static" } }'
root@k8s-1:~# cat /etc/cni/net.d/00-multus.conf
{ "cniVersion": "0.3.1", "name": "multus-cni-network", "type": "multus", "kubeconfig": "/etc/cni/net.d/multus.d/multus.kubeconfig", "delegates": [ { "name": "cbr0", "cniVersion": "0.3.1", "plugins": [ { "type": "flannel", "delegate": { "hairpinMode": true, "isDefaultGateway": true } }, { "type": "portmap", "capabilities": { "portMappings": true } } ] } ] }

> Is there any chance the OVS bridge was deleted before the container delete during your helm delete?

I think not. My script does helm delete and waits for all pods to disappear. Only after that does it delete the networkattachmentdefinition.k8s.cni.cncf.io, and only after that does it delete the corresponding OVS bridge.

pperiyasamy commented 3 years ago

@dpronyaev It now looks like your configuration has changed to use flannel as the primary plugin, and you are trying to use nason8-int for the secondary network. Can you correct the resourceName in the NAD to the following: k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/nason8. Is the IP address passed from the pod spec, since you're using static ipam in nason8-int? What is the issue now?
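
A minimal sketch of that resourceName correction (names taken from this thread; verify against your setup before applying):

# Point the NAD's resourceName annotation at the nason8 bridge resource.
kubectl annotate net-attach-def nason8-int -n default \
  k8s.v1.cni.cncf.io/resourceName=ovs-cni.network.kubevirt.io/nason8 --overwrite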

kubevirt-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot commented 3 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten