Mirantis / virtlet

Kubernetes CRI implementation for running VM workloads
Apache License 2.0

Virtlet considered VFs with vfio-pci driver as a disk device #900

Closed leyao-daily closed 5 years ago

leyao-daily commented 5 years ago

On my host, I created 6 VFs and used dpdk-devbind.py to bind them to the vfio-pci driver.

root@ubuntu:~# python3 dpdk-devbind.py -s

Network devices using DPDK-compatible driver
============================================
0000:3d:06.0 'Ethernet Virtual Function 700 Series 37cd' drv=vfio-pci unused=i40evf
0000:3d:06.1 'Ethernet Virtual Function 700 Series 37cd' drv=vfio-pci unused=i40evf
0000:3d:06.2 'Ethernet Virtual Function 700 Series 37cd' drv=vfio-pci unused=i40evf
0000:3d:06.3 'Ethernet Virtual Function 700 Series 37cd' drv=vfio-pci unused=i40evf
0000:3d:06.4 'Ethernet Virtual Function 700 Series 37cd' drv=vfio-pci unused=i40evf
0000:3d:06.5 'Ethernet Virtual Function 700 Series 37cd' drv=vfio-pci unused=i40evf

Network devices using kernel driver
===================================
0000:18:00.0 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=enp24s0f0 drv=i40e unused=vfio-pci
0000:18:00.1 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=enp24s0f1 drv=i40e unused=vfio-pci
0000:3d:00.0 'Ethernet Connection X722 for 10GBASE-T 37d2' if=eno1 drv=i40e unused=vfio-pci *Active*
0000:3d:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2' if=eno2 drv=i40e unused=vfio-pci
0000:af:00.0 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=enp175s0f0 drv=i40e unused=vfio-pci
0000:af:00.1 'Ethernet Controller X710 for 10GbE SFP+ 1572' if=enp175s0f1 drv=i40e unused=vfio-pci

No 'Crypto' devices detected
============================

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

The required envs are all set correctly.
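To double-check the binding state shown in the listing above, the device lines can be filtered by driver. This is a minimal sketch that assumes the `dpdk-devbind.py -s` line layout seen in this issue (PCI address, quoted description, then `key=value` fields); the helper name `parse_devbind` is hypothetical and not part of DPDK.

```python
import re

# One device line as printed above: address, 'description', key=value fields.
LINE_RE = re.compile(
    r"^(?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s+"
    r"'(?P<desc>[^']*)'\s+(?P<rest>.*)$"
)

def parse_devbind(text):
    """Return one dict per device line; headers and separators are skipped."""
    devices = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        # Tokens without '=' (for example *Active*) are ignored.
        fields = dict(kv.split("=", 1) for kv in m.group("rest").split() if "=" in kv)
        devices.append({"addr": m.group("addr"), "desc": m.group("desc"), **fields})
    return devices

# Two lines copied from the listing above.
sample = """\
0000:3d:06.0 'Ethernet Virtual Function 700 Series 37cd' drv=vfio-pci unused=i40evf
0000:3d:00.1 'Ethernet Connection X722 for 10GBASE-T 37d2' if=eno2 drv=i40e unused=vfio-pci
"""

vfio = [d["addr"] for d in parse_devbind(sample) if d.get("drv") == "vfio-pci"]
print(vfio)
```

Feeding the full output into `parse_devbind` should list all six VFs at 0000:3d:06.0 through 0000:3d:06.5 under `vfio-pci`.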

I then set up sriov-cni, sriov-device-plugin and multus-v3.3-tp. All of them were tested with plain pods in k8s and worked well. Below is the SR-IOV device plugin configuration; when it is running, I can see the VFs as a resource in my node description.

{
    "resourceList": [
        {
            "resourceName": "intel_sriov",
            "selectors": {
                "vendors": ["8086"],
                "devices": ["37cd"],
                "drivers": ["vfio-pci"],
                "pfNames": ["eno2"]
            }
        }
    ]
}

I created a network:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriovnet
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/intel_sriov
spec:
  config: '{
    "type": "sriov",
    "cniVersion": "0.3.1",
    "ipam": {
            "type": "host-local",
            "subnet": "10.56.206.0/24",
            "routes": [
                    { "dst": "0.0.0.0/0" }
            ],
            "gateway": "10.56.206.1"
    }
  }'

Then I applied a Virtlet VM:

root@ubuntu:~# vim cirros-vm.yaml
apiVersion: v1
kind: Pod
metadata:
  name: cirros-vm
  annotations:
    k8s.v1.cni.cncf.io/networks: sriovnet
    # This tells CRI Proxy that this pod belongs to Virtlet runtime
    kubernetes.io/target-runtime: virtlet.cloud
    # inject ssh keys via cloud-init
    VirtletSSHKeys: |
      ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCaJEcFDXEK2ZbX0ZLS1EIYFZRbDAcRfuVjpstSc0De8+sV1aiu+dePxdkuDRwqFtCyk6dEZkssjOkBXtri00MECLkir6FcH3kKOJtbJ6vy3uaJc9w1ERo+wyl6SkAh/+JTJkp7QRXj8oylW5E20LsbnA/dIwWzAF51PPwF7A7FtNg9DnwPqMkxFo1Th/buOMKbP5ZA1mmNNtmzbMpMfJATvVyiv3ccsSJKOiyQr6UG+j7sc/7jMVz5Xk34Vd0l8GwcB0334MchHckmqDB142h/NCWTr8oLakDNvkfC1YneAfAO41hDkUbxPtVBG5M/o7P4fxoqiHEX+ZLfRxDtHB53 me@localhost
spec:
  # This nodeSelector specification tells Kubernetes to run this
  # pod only on the nodes that have extraRuntime=virtlet label.
  # This label is used by Virtlet DaemonSet to select nodes
  # that must have Virtlet runtime
  nodeSelector:
    extraRuntime: virtlet

  containers:
  - name: cirros-vm
    # This specifies the image to use.
    # virtlet.cloud/ prefix is used by CRI proxy, the remaining part
    # of the image name is prepended with https:// and used to download the image
    image: virtlet.cloud/cirros
    imagePullPolicy: IfNotPresent
    # tty and stdin required for `kubectl attach -t` to work
    tty: true
    stdin: true
    resources:
      requests:
        intel.com/intel_sriov: '1'
      limits:
        intel.com/intel_sriov: '1'

The pod description shows:

Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  14s   default-scheduler  Successfully assigned default/cirros-vm to ubuntu
  Warning  Failed     5s    kubelet, ubuntu    Error: "/run/virtlet.sock": rpc error: code = 2 desc = failed to create domain "bdb42e55-51d5-57a3-5648-a20f0460e785": virError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: I0925 02:15:00.333997   19563 vmwrapper.go:66] Obtaining PID of the VM container process...
nsfix reexec: pid 19563: entering the namespaces of target pid 8826
2019-09-25T02:15:00.412368Z qemu-system-x86_64: -drive file=/dev/vfio/vfio,format=raw,if=none,id=drive-scsi0-0-0-1: Could not refresh total sector count: Illegal seek')
  Normal   Pulled   4s (x2 over 7s)  kubelet, ubuntu  Container image "virtlet.cloud/cirros" already present on machine
  Normal   Created  4s (x2 over 6s)  kubelet, ubuntu  Created container cirros-vm
  Warning  Failed   4s               kubelet, ubuntu  Error: "/run/virtlet.sock": rpc error: code = 2 desc = failed to create domain "bdb42e55-51d5-57a3-5648-a20f0460e785": virError(Code=1, Domain=10, Message='internal error: process exited while connecting to monitor: I0925 02:15:01.855683   19834 vmwrapper.go:66] Obtaining PID of the VM container process...
nsfix reexec: pid 19834: entering the namespaces of target pid 8826
2019-09-25T02:15:01.912965Z qemu-system-x86_64: -drive file=/dev/vfio/vfio,format=raw,if=none,id=drive-scsi0-0-0-1: Could not refresh total sector count: Illegal seek')
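"Illegal seek" is the message for the ESPIPE error, which is what one would expect if QEMU tries to determine the size of something that is not a real disk. VFIO nodes under /dev/vfio are character devices, not block devices, so a quick check of the node type can confirm that the path handed to `-drive` cannot work as a raw disk. A minimal sketch, with a hypothetical helper `node_kind`:

```python
import os
import stat

def node_kind(path):
    """Classify a filesystem node as 'block', 'char' or 'other'."""
    mode = os.stat(path).st_mode
    if stat.S_ISBLK(mode):
        return "block"
    if stat.S_ISCHR(mode):
        return "char"
    return "other"

# /dev/null is included only as a character device that exists on any Linux host.
for path in ("/dev/null", "/dev/vfio/vfio"):
    if os.path.exists(path):
        print(path, node_kind(path))
```

On a host set up like the one in this issue, /dev/vfio/vfio should be reported as `char`, which is consistent with the failure in the events above.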

After logging in to the Virtlet pod and running virsh dumpxml on the VM, I found that /dev/vfio/vfio and /dev/vfio/86 (the VF assigned to this VM) were attached as disk devices. What is wrong here?

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>virtlet-bdb42e55-51d5-cirros-vm</name>
  <uuid>bdb42e55-51d5-57a3-5648-a20f0460e785</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <cputune>
    <shares>2</shares>
    <period>100000</period>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-bionic'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/vmwrapper</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/virtlet/volumes/virtlet_root_bdb42e55-51d5-57a3-5648-a20f0460e785'/>
      <target dev='sda' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/vfio/vfio'/>
      <target dev='sdb' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/vfio/86'/>
      <target dev='sdc' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/virtlet/config/config-bdb42e55-51d5-57a3-5648-a20f0460e785.iso'/>
      <target dev='sdd' bus='scsi'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='3'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='piix3-uhci'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <serial type='unix'>
      <source mode='connect' path='/var/lib/libvirt/streamer.sock'>
        <reconnect enabled='yes' timeout='1'/>
      </source>
      <target type='isa-serial' port='0'>
        <model name='isa-serial'/>
      </target>
    </serial>
    <console type='unix'>
      <source mode='connect' path='/var/lib/libvirt/streamer.sock'>
        <reconnect enabled='yes' timeout='1'/>
      </source>
      <target type='serial' port='0'/>
    </console>
    <input type='tablet' bus='usb'>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'>
      <listen type='address'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:env name='VIRTLET_EMULATOR' value='/usr/bin/kvm'/>
    <qemu:env name='VIRTLET_NET_KEY' value='d0307ccd-1db7-4acb-a83c-6fd9cea6c39a'/>
    <qemu:env name='VIRTLET_CONTAINER_ID' value='bdb42e55-51d5-57a3-5648-a20f0460e785'/>
    <qemu:env name='VIRTLET_CONTAINER_LOG_PATH' value='/var/log/pods/default_cirros-vm_d0307ccd-1db7-4acb-a83c-6fd9cea6c39a/cirros-vm/0.log'/>
    <qemu:env name='VMWRAPPER_KEEP_PRIVS' value='1'/>
  </qemu:commandline>
</domain>
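The problematic entries in the dump above can be picked out mechanically. The sketch below parses a domain XML and reports every `<disk>` whose source device lives under /dev/vfio. The function name `vfio_disks` is hypothetical, and the sample is an abbreviated copy of the dump above (the root volume path is shortened for illustration).

```python
import xml.etree.ElementTree as ET

def vfio_disks(domain_xml):
    """Return (target, source) pairs for disks backed by /dev/vfio nodes."""
    root = ET.fromstring(domain_xml)
    hits = []
    for disk in root.findall("./devices/disk"):
        source = disk.find("source")
        target = disk.find("target")
        dev = source.get("dev") if source is not None else None
        if dev and dev.startswith("/dev/vfio/"):
            hits.append((target.get("dev") if target is not None else None, dev))
    return hits

# Abbreviated from the virsh dumpxml output above.
sample = """<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/virtlet/volumes/root'/>
      <target dev='sda' bus='scsi'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/vfio/vfio'/>
      <target dev='sdb' bus='scsi'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/vfio/86'/>
      <target dev='sdc' bus='scsi'/>
    </disk>
  </devices>
</domain>"""

print(vfio_disks(sample))
# prints [('sdb', '/dev/vfio/vfio'), ('sdc', '/dev/vfio/86')]
```

Both VFIO nodes appear as SCSI disks `sdb` and `sdc`, matching the `-drive file=/dev/vfio/vfio` argument in the QEMU error.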