Mirantis / virtlet

Kubernetes CRI implementation for running VM workloads
Apache License 2.0

virtlet sriov only works with VFIO? #886

Closed hemanthnakkina-zz closed 5 years ago

hemanthnakkina-zz commented 5 years ago

Does Virtlet SR-IOV only work with VFIO?

I see hardcoded VFIO-related code here: https://github.com/Mirantis/virtlet/blob/master/pkg/nettools/sriov.go#L248-L259 and https://github.com/Mirantis/virtlet/blob/master/cmd/vmwrapper/vmwrapper.go#L125

jellonek commented 5 years ago

Yes. Do you have another proposal?

hemanthnakkina-zz commented 5 years ago

I had misunderstood this, hence the question. Closing this.

leyao-daily commented 4 years ago

> Yes. Do you have another proposal?

Does Virtlet support Multus master with SR-IOV?

hemanthnakkina-zz commented 4 years ago

A couple of months back, I had Virtlet working with Multus and SR-IOV, using multus-cni version v3.3-tp (not master). See this for more details: https://github.com/intel/multus-cni/pull/165

leyao-daily commented 4 years ago

> A couple of months back, I had Virtlet working with Multus and SR-IOV, using multus-cni version v3.3-tp (not master). See this for more details: intel/multus-cni#165

@hemanthnakkina Hi, do you have a step-by-step guide or configuration for it? Is the VF driver vfio-pci?

hemanthnakkina-zz commented 4 years ago

Hey, I don't have step-by-step instructions; I just followed the documentation from the respective projects, with modified Docker image versions. Yes, I used the vfio-pci driver.

leyao-daily commented 4 years ago

@hemanthnakkina Yeah, but when I used the vfio-pci driver for the VFs, the Virtlet VM treated the devices as disks and reported:

qemu-system-x86_64: -drive file=/dev/vfio/vfio,format=raw,if=none,id=drive-virtio-disk1: Could not refresh total sector count: Illegal seek

The virsh dumpxml output:

/vmwrapper
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/virtlet/volumes/virtlet_root_a01cd705-d600-501b-6bdb-15fab1431e6e'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
</disk>
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/vfio/vfio'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x02' function='0x0'/>
</disk>
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/vfio/84'/>
  <target dev='vdc' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x03' function='0x0'/>
</disk>

...

Have you run into this issue? And are there any required drivers for the PF? My PF driver is i40e.
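For anyone debugging the same symptom, a quick way to confirm it is to scan the domain XML for disks whose source is a VFIO device node. The following is only a rough sketch: the function name `vfio_disks` and the trimmed `SAMPLE` document (wrapped in hypothetical `<domain><devices>` elements so that it parses) are illustrative assumptions, not part of Virtlet.

```python
import xml.etree.ElementTree as ET


def vfio_disks(domain_xml):
    """Return (target_dev, source_path) for every <disk> backed by /dev/vfio/*."""
    root = ET.fromstring(domain_xml)
    hits = []
    for disk in root.iter("disk"):
        source = disk.find("source")
        target = disk.find("target")
        if source is None:
            continue
        # Block disks use the 'dev' attribute, file-backed disks use 'file'.
        path = source.get("dev") or source.get("file")
        if path and path.startswith("/dev/vfio/"):
            hits.append((None if target is None else target.get("dev"), path))
    return hits


# Trimmed sample modelled on the dump above; the wrapper elements are assumed.
SAMPLE = """
<domain>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/virtlet/volumes/root'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/vfio/vfio'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <source dev='/dev/vfio/84'/>
      <target dev='vdc' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    for target_dev, path in vfio_disks(SAMPLE):
        print(f"{target_dev}: {path} is a VFIO node attached as a disk")
```

Any hit means a VFIO character device was handed to QEMU as a raw block disk, which matches the "Illegal seek" error reported above.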

hemanthnakkina-zz commented 4 years ago

I didn't see these issues, and I also had i40e as the PF driver. Unfortunately, I no longer have the environment to cross-check.

leyao-daily commented 4 years ago

@hemanthnakkina Got it. I found that I was using Multus master; switching to Multus v3.3-tp solved it. However, I now hit another error when creating a VM:

apiVersion: v1
kind: Pod
metadata:
  name: scc
  annotations:
    kubernetes.io/target-runtime: virtlet.cloud
    VirtletDiskDriver: virtio
    VirtletSSHKeys: |
      ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCaJEcFDXEK2ZbX0ZLS1EIYFZRbDAcRfuVjpstSc0De8+sV1aiu+dePxdkuDRwqFtCyk6dEZkssjOkBXtri00MECLkir6FcH3kKOJtbJ6vy3uaJc9w1ERo+wyl6SkAh/+JTJkp7QRXj8oylW5E20LsbnA/dIwWzAF51PPwF7A7FtNg9DnwPqMkxFo1Th/buOMKbP5ZA1mmNNtmzbMpMfJATvVyiv3ccsSJKOiyQr6UG+j7sc/7jMVz5Xk34Vd0l8GwcB0334MchHckmqDB142h/NCWTr8oLakDNvkfC1YneAfAO41hDkUbxPtVBG5M/o7P4fxoqiHEX+ZLfRxDtHB53 me@localhost
    VirtletRootVolumeSize: 1Gi
    k8s.v1.cni.cncf.io/networks: sriov-net1
spec:
  nodeSelector:
    extraRuntime: virtlet
  containers:
  - name: scc
    image: virtlet.cloud/cirros
    imagePullPolicy: IfNotPresent
    tty: true
    stdin: true
    resources:
      requests:
        intel.com/intel_sriov_netdevice: '1'
      limits:
        intel.com/intel_sriov_netdevice: '1'

And the describe pod output:

Events:
  Type     Reason                  Age  From               Message
  Normal   Scheduled               10s  default-scheduler  Successfully assigned default/scc to ubuntu
  Warning  FailedCreatePodSandBox  3s   kubelet, ubuntu    Failed create pod sandbox: rpc error: code = Unknown desc = "/run/virtlet.sock": rpc error: code = 2 desc = Error adding pod scc (559e9b85-7e12-4aaf-a44e-14381a0868ee) to CNI network: server returned error: error getting fd: missing link #0 in the container network namespace (Virtlet pod restarted?)

leyao-daily commented 4 years ago

@hemanthnakkina Maybe I set it up in the wrong order? I'm using Multus v3.3-tp with intel/sriov-cni and the SR-IOV device plugin. I described my steps in https://github.com/Mirantis/virtlet/issues/900; if it's convenient, I hope you can take a look.

hemanthnakkina-zz commented 4 years ago

The configuration seems OK. I assume you have no issues bringing up a Virtlet VM without SR-IOV/DPDK. In my tests, I used Calico as the primary CNI and SR-IOV as the secondary CNI.

leyao-daily commented 4 years ago

@hemanthnakkina And I used dpdk-devbind.py to bind the vfio-pci driver to my VF on the node host; is that right? Which version of Virtlet did you use? Do you still have the YAMLs for Multus, Calico, SR-IOV and so on? I get the following Multus error, and the VM runs but without the VF device:

2019-09-25T23:11:54-04:00 [error] Multus: Err in getting k8s args: ARGS: unknown args ["K8S_ANNOT={\"cni\": \"calico\"}"]

hemanthnakkina-zz commented 4 years ago

Yes, I used dpdk-devbind.py to bind the VFIO device. Regarding the Multus error, a patch was merged on Aug 18 (https://github.com/Mirantis/virtlet/pull/891). You can use the Virtlet image with the latest tag.
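After binding, it can help to double-check which kernel driver a VF is actually attached to. On Linux, each PCI device directory in sysfs has a `driver` symlink pointing at the bound driver, so a minimal sketch could look like this. The helper name `bound_driver` and the example PCI address are hypothetical, and the `sysfs_root` parameter exists only so the function can be pointed at a test directory.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    pci_addr is a full address such as '0000:3b:02.0' (hypothetical example).
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    if not os.path.islink(link):
        # No 'driver' symlink: device is missing or not bound to any driver.
        return None
    # The symlink target ends with the driver name, e.g. .../drivers/vfio-pci
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    addr = "0000:3b:02.0"  # replace with the address of your VF
    print(addr, "->", bound_driver(addr))
```

For a VF that was bound as described in this thread, the expected result would be `vfio-pci`; `None` means the device is unbound or the address is wrong.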

leyao-daily commented 4 years ago

@hemanthnakkina Thanks. How do you configure SR-IOV in the VM YAML? Is it the same as for a regular pod? Can I see it?

leyao-daily commented 4 years ago

@hemanthnakkina Here is my VM pod YAML:

apiVersion: v1
kind: Pod
metadata:
  name: cirros-vm
  annotations:
    k8s.v1.cni.cncf.io/networks: sriovnet
    # This tells CRI Proxy that this pod belongs to Virtlet runtime
    kubernetes.io/target-runtime: virtlet.cloud
    # inject ssh keys via cloud-init
    VirtletSSHKeys: |
      ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCaJEcFDXEK2ZbX0ZLS1EIYFZRbDAcRfuVjpstSc0De8+sV1aiu+dePxdkuDRwqFtCyk6dEZkssjOkBXtri00MECLkir6FcH3kKOJtbJ6vy3uaJc9w1ERo+wyl6SkAh/+JTJkp7QRXj8oylW5E20LsbnA/dIwWzAF51PPwF7A7FtNg9DnwPqMkxFo1Th/buOMKbP5ZA1mmNNtmzbMpMfJATvVyiv3ccsSJKOiyQr6UG+j7sc/7jMVz5Xk34Vd0l8GwcB0334MchHckmqDB142h/NCWTr8oLakDNvkfC1YneAfAO41hDkUbxPtVBG5M/o7P4fxoqiHEX+ZLfRxDtHB53 me@localhost
spec:
  # This nodeSelector specification tells Kubernetes to run this
  # pod only on the nodes that have extraRuntime=virtlet label.
  # This label is used by Virtlet DaemonSet to select nodes
  # that must have Virtlet runtime
  nodeSelector:
    extraRuntime: virtlet

  containers:
  - name: cirros-vm
    # This specifies the image to use.
    # virtlet.cloud/ prefix is used by CRI proxy, the remaining part
    # of the image name is prepended with https:// and used to download the image
    image: virtlet.cloud/cirros
    imagePullPolicy: IfNotPresent
    # tty and stdin required for `kubectl attach -t` to work
    tty: true
    stdin: true
    resources:
      requests:
        intel.com/intel_sriov: '1'
      limits:
        intel.com/intel_sriov: '1'

hemanthnakkina-zz commented 4 years ago

Looks good. I usually keep my NAD (NetworkAttachmentDefinition) in the kube-system namespace, so my annotation looks like k8s.v1.cni.cncf.io/networks: kube-system/sriov-net1

If you have further issues, look at the Multus logs.
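To make the annotation form above concrete: in its short form, the `k8s.v1.cni.cncf.io/networks` value is a comma-separated list of `[namespace/]name[@interface]` entries. The sketch below splits such a value into its parts. It assumes that an entry without a namespace refers to the pod's own namespace, and the names `NetworkRef` and `parse_networks` are made up for illustration; this is not Multus code and it does not handle the JSON form of the annotation.

```python
from typing import List, NamedTuple, Optional


class NetworkRef(NamedTuple):
    namespace: str
    name: str
    interface: Optional[str]


def parse_networks(value: str, pod_namespace: str = "default") -> List[NetworkRef]:
    """Parse the short (non-JSON) form of the networks annotation."""
    refs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # Optional '@ifname' suffix selects the interface name inside the pod.
        item, _, iface = item.partition("@")
        namespace, sep, name = item.partition("/")
        if not sep:
            # No 'namespace/' prefix: assume the pod's own namespace.
            namespace, name = pod_namespace, namespace
        refs.append(NetworkRef(namespace, name, iface or None))
    return refs


if __name__ == "__main__":
    print(parse_networks("kube-system/sriov-net1"))
    print(parse_networks("sriovnet", pod_namespace="default"))
```

Under that assumption, `sriovnet` in a pod in the default namespace resolves to `default/sriovnet`, so a NAD created in kube-system would need the explicit `kube-system/` prefix.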