smartxworks / virtink

Lightweight Virtualization Add-on for Kubernetes
Apache License 2.0

multus network with bridge cni plugins #53

Closed cicdteam closed 1 year ago

cicdteam commented 1 year ago

Hello

I'm trying to use Multus with the bridge CNI plugin:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: overlay
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "overlay",
    "type": "bridge",
    "bridge": "my-bridge",
    "ipam": {
        "type": "host-local",
        "subnet": "10.88.0.0/16"
    }
  }'

---
apiVersion: virt.virtink.smartx.com/v1alpha1
kind: VirtualMachine
metadata:
  name: ubuntu-rootfs
spec:
  instance:
    cpu:
      sockets: 4
      coresPerSocket: 1
    memory:
      size: 4Gi
    kernel:
      image: smartxworks/virtink-kernel-5.15.12
      imagePullPolicy: IfNotPresent
      cmdline: "console=ttyS0 root=/dev/vda rw"
    disks:
      - name: ubuntu
      - name: cloud-init
    interfaces:
      - name: pod
        bridge: {}
      - name: overlay
        bridge: {}
  networks:
    - name: pod
      pod: {}
    - name: overlay
      multus:
        networkName: overlay
  volumes:
    - name: ubuntu
      containerRootfs:
        image: smartxworks/virtink-container-rootfs-ubuntu
        imagePullPolicy: IfNotPresent
        size: 16Gi
    - name: cloud-init
      cloudInit:
        userData: |-
          #cloud-config
          password: password
          chpasswd: { expire: False }
          ssh_pwauth: True

and the VM ends up in the Failed state:

% k get vm
NAME            STATUS   NODE
ubuntu-rootfs   Failed   

There are no logs left to inspect (the Pod with the VM is destroyed too), but I managed to catch the relevant output:

2022/09/15 08:54:05 Failed to build VM config: setup bridge network: start DHCP server: start dnsmasq: "/usr/sbin/dnsmasq --conf-file=/var/run/virtink/dnsmasq/br-net1.conf --pid-file=/var/run/virtink/dnsmasq/br-net1.pid": exit status 1: 
dnsmasq: bad IP address at line 6 of /var/run/virtink/dnsmasq/br-net1.conf

As I understand it, line 6 usually holds the router details (dhcp-option=option:router).

The VM starts only when the bridge has isDefaultGateway: true set:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: overlay
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "overlay",
    "type": "bridge",
    "bridge": "my-bridge",
    "isDefaultGateway": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.88.0.0/16"
    }
  }'

but then networking inside the Pod with the VM looks odd and the Pod is not reachable over the network.

scuzhanglei commented 1 year ago

In bridge mode, Virtink runs a dnsmasq server in the Pod to hand the IP and routes of the original interface to the VM. I guess this error occurs because the 'net1' interface has no routes, which results in an invalid dnsmasq config file. As a workaround, you can add "isGateway": true to your NetworkAttachmentDefinition.
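
To illustrate the failure mode guessed at above, here is a minimal Python sketch. It is not Virtink's actual code (Virtink is written in Go): the `render_dnsmasq_conf` helper, its parameters, and the 10.88.0.2 address are hypothetical, and the line layout is only assumed to follow the generated br-*.conf files quoted later in this thread. If no gateway is found for the interface, the router option would end up with an empty value on line 6, which is the line dnsmasq complains about.

```python
def render_dnsmasq_conf(iface, ip, netmask, mac, gateway, dns):
    """Hypothetical renderer that mimics the layout of the generated br-*.conf files."""
    lines = [
        "port=0",
        f"interface={iface}",
        "bind-interfaces",
        f"dhcp-range={ip},static,{netmask}",
        f"dhcp-host={mac},{ip},infinite",
        f"dhcp-option=option:router,{gateway}",  # line 6 of the file
        f"dhcp-option=option:dns-server,{dns}",
        "dhcp-authoritative",
    ]
    return "\n".join(lines) + "\n"


# An interface without any route gives an empty gateway (assumption).
conf = render_dnsmasq_conf(
    "br-net1", "10.88.0.2", "255.255.0.0", "52:54:00:77:0e:83", "", "10.43.0.10"
)
line6 = conf.splitlines()[5]
print(line6)  # dhcp-option=option:router,
```

An empty value after the comma is not a valid address, which would be consistent with the "bad IP address at line 6" message.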

cicdteam commented 1 year ago

I tried isGateway: true (it was the first thing I tried) and it does not work. Only isDefaultGateway: true lets the Pod with the VM start.

scuzhanglei commented 1 year ago

With isDefaultGateway: true, the bridge CNI adds a default route to the container, which conflicts with the default Pod interface.

With only isGateway: true, the bridge CNI does not add a route to the container, so the IPAM plugin has to return a route explicitly for this to work:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: overlay
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "overlay",
    "type": "bridge",
    "bridge": "my-bridge",
    "isGateway": true,
    "ipam": {
        "type": "host-local",
        "subnet": "10.88.0.0/16",
        "routes": [
            { "dst": "10.88.0.0/16" }
        ]
    }
  }'

cicdteam commented 1 year ago

Yes, I am able to run the VM with two interfaces when I specify the "routes" part in the CNI configuration. By the way, I have isGateway: false in my settings because the bridge was created (and configured with IP addressing) outside the k8s cluster.

Everything seems OK, but I still have issues with routing.

Network definition:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: overlay
spec:
  config: '{
    "cniVersion": "0.3.1",
    "name": "overlay",
    "type": "bridge",
    "bridge": "vm-bridge0",
    "isDefaultGateway": false,
    "isGateway": false,
    "ipMasq": false,
    "ipam": {
        "type": "host-local",
        "ranges": [
          [
            {
              "subnet":     "10.77.0.0/16",
              "rangeStart": "10.77.77.10",
              "rangeEnd":   "10.77.77.50"
            }
          ]
        ],
        "routes": [
            {"dst": "10.77.0.0/16", "gw": "10.77.0.3"}
        ]
    }
  }'
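
Because the config field is a JSON document embedded in a YAML string, it can help to parse it before applying the manifest. The sketch below is only a sanity check of my own, not something host-local or Virtink runs: the `check_ipam` helper is hypothetical. It loads the JSON (so a syntax slip such as a missing comma raises an error) and checks that the range bounds and the route gateway fall inside the configured subnet.

```python
import ipaddress
import json

CONFIG = """
{
  "cniVersion": "0.3.1",
  "name": "overlay",
  "type": "bridge",
  "bridge": "vm-bridge0",
  "isDefaultGateway": false,
  "isGateway": false,
  "ipMasq": false,
  "ipam": {
    "type": "host-local",
    "ranges": [[{"subnet": "10.77.0.0/16",
                 "rangeStart": "10.77.77.10",
                 "rangeEnd": "10.77.77.50"}]],
    "routes": [{"dst": "10.77.0.0/16", "gw": "10.77.0.3"}]
  }
}
"""


def check_ipam(config_text):
    """Return a list of problems found in the ipam block (empty if none found)."""
    ipam = json.loads(config_text)["ipam"]  # raises ValueError on malformed JSON
    subnets = []
    problems = []
    for range_set in ipam.get("ranges", []):
        for r in range_set:
            subnet = ipaddress.ip_network(r["subnet"])
            subnets.append(subnet)
            for key in ("rangeStart", "rangeEnd"):
                if key in r and ipaddress.ip_address(r[key]) not in subnet:
                    problems.append(f"{key} {r[key]} is outside {subnet}")
    for route in ipam.get("routes", []):
        gw = route.get("gw")
        if gw and not any(ipaddress.ip_address(gw) in s for s in subnets):
            problems.append(f"gateway {gw} is not on any configured subnet")
    return problems


problems = check_ipam(CONFIG)
print(problems or "no problems found")
```

A clean result only means the addressing is self-consistent; it says nothing about how the routes are later handed to the VM.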

VM networking settings:

    interfaces:
      - name: pod
        masquerade: {}
      - name: overlay
        bridge: {}

  networks:
    - name: pod
      pod: {}
    - name: overlay
      multus:
        networkName: overlay

As a result, I see duplicated routes in the VM:

ubuntu@ubuntu-rootfs:~$ ip r s
default via 10.0.2.1 dev ens4 proto dhcp src 10.0.2.2 metric 100 
default via 10.77.0.3 dev ens5 proto dhcp src 10.77.77.12 metric 100 
10.0.2.0/30 dev ens4 proto kernel scope link src 10.0.2.2 metric 100 
10.0.2.1 dev ens4 proto dhcp scope link src 10.0.2.2 metric 100 
10.43.0.10 via 10.0.2.1 dev ens4 proto dhcp src 10.0.2.2 metric 100 
10.43.0.10 via 10.77.0.3 dev ens5 proto dhcp src 10.77.77.12 metric 100 
10.77.0.0/16 dev ens5 proto kernel scope link src 10.77.77.12 metric 100 
10.77.0.3 dev ens5 proto dhcp scope link src 10.77.77.12 metric 100 

I expected the second interface to add only one route, for 10.77.0.0/16, but I also see an unwanted second default route and a second 10.43.0.10 route (10.43.0.10 is the DNS service in k3s).
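
The duplicates in the routing table above can be picked out mechanically. This is a small sketch for illustration only: the `duplicated_destinations` helper is hypothetical, and it assumes every line has the `ip route` shape shown above, with the destination first and a `dev` field.

```python
from collections import defaultdict

ROUTES = """\
default via 10.0.2.1 dev ens4 proto dhcp src 10.0.2.2 metric 100
default via 10.77.0.3 dev ens5 proto dhcp src 10.77.77.12 metric 100
10.0.2.0/30 dev ens4 proto kernel scope link src 10.0.2.2 metric 100
10.0.2.1 dev ens4 proto dhcp scope link src 10.0.2.2 metric 100
10.43.0.10 via 10.0.2.1 dev ens4 proto dhcp src 10.0.2.2 metric 100
10.43.0.10 via 10.77.0.3 dev ens5 proto dhcp src 10.77.77.12 metric 100
10.77.0.0/16 dev ens5 proto kernel scope link src 10.77.77.12 metric 100
10.77.0.3 dev ens5 proto dhcp scope link src 10.77.77.12 metric 100
"""


def duplicated_destinations(route_text):
    """Map each destination with more than one route entry to the devices of those entries."""
    devices = defaultdict(list)
    for line in route_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        dest = fields[0]
        dev = fields[fields.index("dev") + 1]
        devices[dest].append(dev)
    return {dest: devs for dest, devs in devices.items() if len(devs) > 1}


print(duplicated_destinations(ROUTES))
# {'default': ['ens4', 'ens5'], '10.43.0.10': ['ens4', 'ens5']}
```

Both duplicated destinations carry the same metric (100) on ens4 and ens5, so the metric alone does not tell the two entries apart.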

dnsmasq configs in pre-runner pod:

/ # cat /var/run/virtink/dnsmasq/br-eth0.conf 
port=0
interface=br-eth0
bind-interfaces
dhcp-range=10.0.2.2,static,255.255.255.252
dhcp-host=52:54:00:ae:4a:95,10.0.2.2,infinite
dhcp-option=option:router,10.0.2.1
dhcp-option=option:dns-server,10.43.0.10
dhcp-option=option:domain-search,default.svc.cluster.local,svc.cluster.local,cluster.local,eu-west-1.compute.internal
dhcp-authoritative
shared-network=br-eth0,10.0.2.2

/ # cat /var/run/virtink/dnsmasq/br-net1.conf
port=0
interface=br-net1
bind-interfaces
dhcp-range=10.77.77.12,static,255.255.0.0
dhcp-host=52:54:00:77:0e:83,10.77.77.12,infinite
dhcp-option=option:router,10.77.0.3
dhcp-option=option:dns-server,10.43.0.10
dhcp-option=option:domain-search,default.svc.cluster.local,svc.cluster.local,cluster.local,eu-west-1.compute.internal
dhcp-authoritative
shared-network=br-net1,10.77.77.12

dhcp-option=option:router specifies the default router, so it seems the second interface should instead get dhcp-option=option:classless-static-route,10.77.0.0/16,10.77.0.3
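
The idea above could be sketched roughly as follows. This is an assumption about how such a change might look, not Virtink's implementation (which is in Go and not shown in this thread): the `dhcp_route_options` helper and its arguments are hypothetical. The first network keeps the router option, and every other network gets a single classless-static-route option listing its destination/gateway pairs.

```python
def dhcp_route_options(index, gateway, routes):
    """Hypothetical helper: router option for the first network, static routes for the others.

    index   -- position of the network in the VM spec (0 is the first network)
    gateway -- default gateway found on the original interface
    routes  -- list of (destination CIDR, gateway) pairs for that interface
    """
    if index == 0:
        return [f"dhcp-option=option:router,{gateway}"]
    if not routes:
        return []
    # All pairs go into one option line, as comma-separated destination,gateway values.
    pairs = ",".join(f"{dst},{gw}" for dst, gw in routes)
    return [f"dhcp-option=option:classless-static-route,{pairs}"]


first = dhcp_route_options(0, "10.0.2.1", [])
second = dhcp_route_options(1, "10.77.0.3", [("10.77.0.0/16", "10.77.0.3")])
print(first)
print(second)
```

With the values from this thread, the second call yields exactly the classless-static-route line proposed above, and no router option is emitted for br-net1.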

fengye87 commented 1 year ago

@cicdteam I think you're right. Are you suggesting that the first network provide the default route while the other networks get only non-default routes? Would you care to send a PR?