k8snetworkplumbingwg / ovs-cni

Open vSwitch CNI plugin
Apache License 2.0

error adding container to network "work-network": failed to find bridge ovs-br1 on Microk8s #223

Open yockgen opened 2 years ago

yockgen commented 2 years ago

I am using MicroK8s for the cluster setup and getting the error "error adding container to network "work-network": failed to find bridge ovs-br1", even though ovs-br1 has been created on all nodes, as shown below:

Kubernetes Distribution: Microk8s

Error Log

Events:
  Normal   AddedInterface          51s                 multus             Add eth0 [10.1.231.211/32]  
  Warning  FailedCreatePodSandBox  51s                 kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a2ffab9564353e1e97c6049d027ac72f01290ff4db1ccd07e0bfd87acd12d9e5": [default/multus-deployment-55c7b5f4fb-2gth5:work-network]: error adding container to network "work-network": failed to find bridge ovs-br1

ovs-cni installed

yockgenm@tgl02:~$ kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS              RESTARTS           AGE
kube-system   kube-multus-ds-amd64-wfvfb                 1/1     Running             9 (7d ago)         28d
kube-system   ovs-cni-amd64-jrsmv                        1/1     Running             0                  19h
kube-system   calico-node-vvcj8                          1/1     Running             5 (2d23h ago)      7d
kube-system   ovs-cni-amd64-qf4v4                        1/1     Running             0                  19h
kube-system   kube-multus-ds-amd64-w8sjn                 1/1     Running             5 (2d23h ago)      7d
kube-system   coredns-64c6478b6c-dt4t4                   1/1     Running             8 (7d ago)         26d
kube-system   calico-kube-controllers-59c6c9c94c-prhqb   1/1     Running             7 (7d ago)         24d
kube-system   calico-node-s6g77                          1/1     Running             7 (7d ago)         24d
kube-system   kube-multus-ds-fplck                       0/1     CrashLoopBackOff    1289 (55s ago)     7d
kube-system   kube-multus-ds-fl55x                       1/1     Running             2610 (5m47s ago)   24d
default       multus-deployment-55c7b5f4fb-wjq4d         0/1     ContainerCreating   0                  11m
default       multus-deployment-55c7b5f4fb-kqtkj         0/1     ContainerCreating   0                  11m
default       multus-deployment-55c7b5f4fb-2gth5         0/1     ContainerCreating   0                  11m

OvS Bridge

//node 1
yockgenm@tgl02:~$ sudo ovs-vsctl show
c8d62735-42a6-4326-823a-f4ed39ee4722
    Bridge ovs-br1
        Port ovs-br1
            Interface ovs-br1
                type: internal
    ovs_version: "2.16.2"

//node 2
root@yockgenm-tgl01:~# ovs-vsctl show
1d00c8e0-309c-4b80-aa96-498089c59a83
    Bridge ovs-br1
        datapath_type: netdev
        Port ovs-br1
            Interface ovs-br1
                type: internal
        Port dpdk-p0
            Interface dpdk-p0
                type: dpdk
                options: {dpdk-devargs="0000:58:00.0", flow-ctrl-autoneg="true"}
    ovs_version: "2.16.90"

Network Definition

cat <<EOF | kubectl create -f -
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: work-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/br1
spec:
  config: '{
      "cniVersion": "0.4.0",
      "type": "ovs",
      "bridge": "ovs-br1",
      "vlan": 100,
      "ipam": {
        "type": "host-local",
        "subnet": "192.168.1.0/24",
        "rangeStart": "192.168.1.201",
        "rangeEnd": "192.168.1.250",
        "routes": [
          { "dst": "0.0.0.0/0" }
        ],
        "gateway": "192.168.1.1"
      }
    }'
EOF

Pod Deployment

cat <<EOF | kubectl create -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: multus-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: multuspod
  template:
    metadata:
      annotations:
        k8s.v1.cni.cncf.io/networks: work-network
      labels:
        app: multuspod
    spec:
      containers:
      - name: multuspod
        command: ["/bin/ash", "-c", "trap : TERM INT; sleep infinity & wait"]
        image: alpine

EOF

Any help is really appreciated!

phoracek commented 2 years ago

Hello.

Your Multus DS has a suspiciously high number of restarts (2610). Could you check its logs to see what is causing that? I wonder if it may be connected.

Looking at your NetworkAttachmentDefinition, I see that it refers to bridge ovs-br1 while the resourceName annotation requests br1. This suggests that the resource injector is not running on your setup and that you did not use an explicit resource request on the Pod. This should not be causing the problem. I'm raising it just so you know that you may have issues with scheduling in case the bridge is not available on all nodes. See the second example in https://github.com/k8snetworkplumbingwg/ovs-cni#overview.
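For illustration, a minimal sketch of how the annotation and an explicit resource request could line up, assuming the marker exposes each bridge under a resource named after the bridge itself (so ovs-br1 would appear as ovs-cni.network.kubevirt.io/ovs-br1). The fragments below only show the fields that would change; everything else stays as in the manifests above.

```yaml
# NetworkAttachmentDefinition: resourceName matches the bridge name
# (assumption: the marker reports the resource as .../ovs-br1)
metadata:
  name: work-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: ovs-cni.network.kubevirt.io/ovs-br1
---
# Pod template: explicit request, needed only when the resource injector is not running
spec:
  containers:
  - name: multuspod
    image: alpine
    resources:
      limits:
        ovs-cni.network.kubevirt.io/ovs-br1: 1
```

With a request like this, the scheduler should only place the Pod on nodes that report the bridge resource.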

Could you please share your Node status (kubectl get node -o yaml), specifically the part with reported resources? I wonder if the OVS marker reports the bridge as available.
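To narrow the Node output down to the relevant lines, something like the following helper could be used. It is only a sketch: filter_ovs_resources is a hypothetical name, and it assumes the marker's resources carry the ovs-cni.network.kubevirt.io/ prefix seen in the annotation above.

```shell
# Hypothetical helper: keep only lines that mention an ovs-cni resource.
# Reads Node YAML on stdin; prints a note when nothing matches.
filter_ovs_resources() {
    grep -F 'ovs-cni.network.kubevirt.io/' || echo "no ovs-cni resources reported"
}
```

Usage: kubectl get node -o yaml | filter_ovs_resources. If the bridge is reported, it should show up under both capacity and allocatable for each node.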

Also, OVS CNI expects the OVS DB socket to be available in /var/run/openvswitch/db.sock. Is that the case on your nodes?
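A quick way to check this on each node is sketched below. check_ovs_db_socket is a hypothetical helper name; the default path is the one mentioned above, and a different path can be passed as the first argument if OVS is installed somewhere else.

```shell
# Hypothetical helper: succeed only if the given path exists and is a UNIX socket.
# Defaults to the path OVS CNI expects.
check_ovs_db_socket() {
    sock="${1:-/var/run/openvswitch/db.sock}"
    if [ -S "$sock" ]; then
        echo "OK: $sock is a socket"
    else
        echo "MISSING: $sock does not exist or is not a socket" >&2
        return 1
    fi
}
```

Usage: run check_ovs_db_socket on every node (with sudo if the directory is not readable). A non-zero exit status means the socket is not where the plugin looks for it.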