Closed amarshall closed 4 years ago
@amarshall thanks for raising this. I think this is my fault, since device cgroups are honoured when sandbox_cgroup_only=true. I have a patch; could you please help me test it?
@devimc Running 1.11.0-alpha1 with patches #2542 and #2606 gets the container started and I can see the device passed-through with lspci (though it gives me “unknown header type 7f”, but I’m fairly certain that’s a driver issue within the VM).
Thanks!
@amarshall thanks for confirming
@amarshall Are you running cgroups v2 or v1?
@amorenoz As far as I can tell, cgroups v2, which is the default in Fedora since fc31.
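For anyone answering the same question, a minimal sketch of one way to check which cgroup version a host runs. It relies on the filesystem type mounted at /sys/fs/cgroup; the function names are made up for illustration, and a hybrid setup (v1 controllers plus a separate unified mount) would be reported as v1 here.

```shell
#!/bin/sh
# Map the filesystem type mounted at the cgroup root to a cgroup version.
# cgroup2fs means the unified hierarchy (v2); tmpfs means per-controller
# v1 mounts live underneath it.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Report the cgroup version of the running host.
# $1: cgroup root, overridable for testing (defaults to /sys/fs/cgroup)
host_cgroup_version() {
    cgroup_version_from_fstype "$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null)"
}
```

Calling host_cgroup_version with no arguments inspects the real mount point.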
Using the latest kata-runtime (1.11.1), I tested an SR-IOV NIC with sandbox_cgroup_only=true. It also reports “Operation not permitted”, and the QEMU VM fails to start. With sandbox_cgroup_only=false there is no error and the VM starts successfully.
So I think this may have the same cause as this issue.
Error msg:
level=error msg="failed to launch qemu: qemu-system-x86_64: -device vfio-pci,host=0000:3b:00.2,x-pci-vendor-id=0x15b3,x-pci-device-id=0x1018,romfile=: vfio error: 0000:3b:00.2: failed to open /dev/vfio/82: Operation not permitted\n" ID=ad42d0a64a2f853b6ae57754f680ebc0bd7c8364be0a3b3045c12a9da5f350e7 error="exit status 1" source=virtcontainers subsystem=qemu
Host kernel version:
5.4.19.bsk.1-amd64 #5.4.19.bsk.1 SMP Debian 5.4.19.bsk.1 Fri Feb 21 13:20:08 UTC 20 x86_64 GNU/Linux
@yadzhang what command did you run: docker, podman, k8s? You are facing an issue with the device cgroup on the host.
Thanks for the response. I use k8s + containerd + kata-runtime + qemu, and the network mode is switchdev + SR-IOV. Kata-runtime treats the SR-IOV interface in the sandbox netns as a physical interface, so it uses VFIO to attach it to the VM.
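The /dev/vfio/82 in the error is the IOMMU group node for the PCI function 0000:3b:00.2. A rough sketch of how that mapping can be looked up from sysfs follows; the function name and the overridable sysfs root are illustrative assumptions, not anything kata-runtime provides.

```shell
#!/bin/sh
# Print the /dev/vfio node that QEMU has to open for a PCI device.
# $1: PCI address (BDF), e.g. 0000:3b:00.2
# $2: sysfs root, overridable for testing (defaults to /sys)
vfio_node_for_bdf() {
    bdf=$1
    sysfs=${2:-/sys}
    link="$sysfs/bus/pci/devices/$bdf/iommu_group"
    # iommu_group is a symlink; its last path component is the group number
    target=$(readlink "$link") || return 1
    echo "/dev/vfio/$(basename "$target")"
}
```

The printed node is the device that has to be allowed in the sandbox device cgroup for QEMU to open it.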
@yadzhang can you enable unsafe_interrupts and try
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
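Note that a file in /etc/modprobe.d only takes effect the next time the module is loaded. A small sketch for checking the value the running kernel actually uses is below; the function name is hypothetical, and it assumes the parameter is exposed under /sys/module as a Y/N boolean.

```shell
#!/bin/sh
# Report whether vfio_iommu_type1 currently allows unsafe interrupts.
# $1: parameter file, overridable for testing
unsafe_interrupts_state() {
    param=${1:-/sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts}
    if [ ! -r "$param" ]; then
        # No parameter file: the module is probably not loaded
        echo "module not loaded"
        return 1
    fi
    case "$(cat "$param")" in
        Y|1) echo "enabled" ;;
        *)   echo "disabled" ;;
    esac
}
```

If this prints "disabled" after the modprobe.d change, the module most likely has not been reloaded yet.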
The same error.
Warning FailedCreatePodSandBox 2s (x4 over 46s) kubelet, Failed create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to launch qemu: exit status 1, error messages from qemu log: qemu-system-x86_64: -device vfio-pci,host=0000:3b:00.4,x-pci-vendor-id=0x15b3,x-pci-device-id=0x1018,romfile=: vfio error: 0000:3b:00.4: failed to open /dev/vfio/84: Operation not permitted
Maybe the device "/dev/vfio/84" needs to be added to the sandbox device cgroup file "devices.allow"?
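To test that theory by hand on a cgroups v1 host, the rule written to devices.allow has the form "c MAJOR:MINOR rwm". A minimal sketch for building it from a device node is below; the function name is made up, and this is only a diagnostic aid, not what kata-runtime itself does. On cgroups v2 there is no devices.allow file, so this does not apply there.

```shell
#!/bin/sh
# Build a cgroup v1 device rule ("c MAJOR:MINOR rwm") for a character device.
# $1: path to the device node, e.g. /dev/vfio/84
device_cgroup_rule() {
    dev=$1
    [ -c "$dev" ] || { echo "not a character device: $dev" >&2; return 1; }
    # stat prints major/minor in hex; convert to decimal for the cgroup rule
    major=$(printf '%d' "0x$(stat -c '%t' "$dev")")
    minor=$(printf '%d' "0x$(stat -c '%T' "$dev")")
    echo "c $major:$minor rwm"
}
```

The output would then be written, as root, to the devices.allow file of the sandbox cgroup under /sys/fs/cgroup/devices; the exact cgroup path depends on the setup.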
@yadzhang kata-runtime should do it automatically. Could you set enable_debug in the configuration file, run the test again and paste the logs here?
I checked the cgroup manager code. It adds /dev/vfio/vfio to the cgroup but not /dev/vfio/{id}.
I used containerd-shim-kata-v2 instead of kata-runtime and set every enable_debug=true in the configuration. The full logs are below:
Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.767948982+08:00" level=info msg="loaded configuration" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 file=/data/kata/share/defaults/kata-containers/configuration-qemu.toml format=TOML source=katautils Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.768039539+08:00" level=debug ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 default-kernel-parameters="systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket" source=katautils Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.768465049+08:00" level=debug msg="container rootfs: /root/tce/containerd/run/daemon/io.containerd.runtime.v2.task/k8s.io/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/rootfs" source=virtcontainers subsystem=oci Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.770638839+08:00" level=debug msg="restore sandbox failed" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 error="open /run/vc/sbs/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/persist.json: no such file or directory" sandbox=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=sandbox Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.770685291+08:00" level=debug msg="Creating bridges" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=qemu Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.77070504+08:00" level=debug msg="Creating UUID" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=qemu Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.771238099+08:00" level=debug msg="Disable nesting environment checks" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 inside-vm=false source=virtcontainers subsystem=qemu Jun 12 11:14:40 
kata[1850343]: time="2020-06-12T11:14:40.771565033+08:00" level=info msg="adding volume" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=qemu volume-type=virtio-9p Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.772286581+08:00" level=info msg="Physical network interface found" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 interface=eth0 source=virtcontainers subsystem=network Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.772815577+08:00" level=info msg="Endpoints found after scan" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 endpoints="[0xc0000f2a00]" source=virtcontainers subsystem=network Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.7728695+08:00" level=info msg="Attaching endpoint" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 endpoint-type=physical hotplug=false source=virtcontainers subsystem=network Jun 12 11:14:40 kata[1850343]: time="2020-06-12T11:14:40.772894941+08:00" level=info msg="Unbinding device from driver" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 device-bdf="0000:3b:00.3" driver-path="/sys/bus/pci/devices/0000:3b:00.3/driver/unbind" source=virtcontainers subsystem=device Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.3118116+08:00" level=info msg="Writing vendor-device-id to vfio new-id path" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=device vendor-device-id="0x15b3 0x1018" vfio-new-id-path=/sys/bus/pci/drivers/vfio-pci/new_id Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.312110237+08:00" level=info msg="Binding device to vfio driver" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 device-bdf="0000:3b:00.3" driver-path=/sys/bus/pci/drivers/vfio-pci/bind source=virtcontainers subsystem=device Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.312243935+08:00" level=debug 
msg="Network added" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=network Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.334822937+08:00" level=info msg="Starting VM" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 sandbox=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=sandbox Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.334915029+08:00" level=debug ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 default-kernel-parameters="tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 debug systemd.show_status=true systemd.log_level=debug" source=virtcontainers subsystem=qemu Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.335072446+08:00" level=info msg="launching /data/kata/bin/qemu-system-x86_64 with: [-name sandbox-3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 -uuid 767679a8-99d6-4e64-877c-3fa806398490 -machine pc,accel=kvm,kernel_irqchip,nvdimm -cpu host -qmp unix:/run/vc/vm/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/qmp.sock,server,nowait -m 2048M,slots=10,maxmem=386179M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,disable-modern=false,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/data/kata/share/kata-containers/kata-containers.img,size=402653184 -device virtio-scsi-pci,id=scsi0,disable-modern=false,romfile= 
-object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/kata.sock,server,nowait -device virtio-9p-pci,disable-modern=false,fsdev=extra-9p-kataShared,mount_tag=kataShared,romfile= -fsdev local,id=extra-9p-kataShared,path=/run/kata-containers/shared/sandboxes/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/shared,security_model=none -device vfio-pci,host=0000:3b:00.3,x-pci-vendor-id=0x15b3,x-pci-device-id=0x1018,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -object memory-backend-ram,id=dimm1,size=2048M -numa node,memdev=dimm1 -kernel /data/kata/share/kata-containers/vmlinuz-4.19.86-60 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 debug systemd.show_status=true systemd.log_level=debug panic=1 nr_cpus=96 agent.use_vsock=false systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none rw -pidfile /run/vc/vm/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/pid -D /run/vc/vm/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/qemu.log -smp 1,cores=1,threads=1,sockets=96,maxcpus=96]" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=qmp Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.370674027+08:00" level=error msg="Unable to launch /data/kata/bin/qemu-system-x86_64: exit status 1" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers 
subsystem=qmp Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.370727817+08:00" level=error ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=qmp Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.370769312+08:00" level=error msg="failed to launch qemu: qemu-system-x86_64: -device vfio-pci,host=0000:3b:00.3,x-pci-vendor-id=0x15b3,x-pci-device-id=0x1018,romfile=: vfio error: 0000:3b:00.3: failed to open /dev/vfio/83: Operation not permitted\n" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 error="exit status 1" source=virtcontainers subsystem=qemu Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.370873008+08:00" level=info msg="Detaching endpoint" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 endpoint-type=physical source=virtcontainers subsystem=network Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.370898209+08:00" level=info msg="Unbinding device from driver" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 device-bdf="0000:3b:00.3" driver-path="/sys/bus/pci/devices/0000:3b:00.3/driver/unbind" source=virtcontainers subsystem=device Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.371167118+08:00" level=info msg="Binding back device to host driver" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 device-bdf="0000:3b:00.3" driver-path=/sys/bus/pci/drivers/mlx5_core/bind source=virtcontainers subsystem=device Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.765714726+08:00" level=debug msg="Network removed" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers subsystem=network Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.765764816+08:00" level=debug msg="Deleting sandbox cgroup" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 sandbox=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 source=virtcontainers 
subsystem=sandbox Jun 12 11:14:41 kata[1850343]: time="2020-06-12T11:14:41.7858325+08:00" level=info msg="cleanup agent" ID=3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074 path=/run/kata-containers/shared/sandboxes/3656ccc016ed2c4efca0795472f14e055c89be54fd8ef260e9c29d52576fc074/shared source=virtcontainers subsystem=kata_agent
@yadzhang yeah, I think this is a valid issue. Would you mind filing a new issue?
@yadzhang I have a patch. Once you raise the issue I will open a PR, and I'll need your help to test it, wdyt?
Thank you for the reply. I will raise a new issue.
Description of problem
Fedora 32 beta, Kernel 5.6.3-300. Using kata-runtime from the Fedora repo (1.11.0-alpha1), I encountered #2542, so I forked the upstream package spec and applied the patch for that fix. The problem occurs with that build and also with a build whose package spec pulls directly from commit af24829c2ae78c3b811c4cff6736ddaba500d37c.
Expected result
sudo podman run -it --rm --cap-add=ALL --runtime=kata-runtime --device /dev/vfio/72 fedora
Container starts and has device attached.
Actual result
sudo podman run -it --rm --cap-add=ALL --runtime=kata-runtime --device /dev/vfio/72 fedora
The device being passed through is an Intel 82599 Virtual Function from an X520; however, I also encounter the same error with other devices bound to vfio-pci. I can successfully pass this device through to a VM using qemu-kvm via virt-manager, so IOMMU, etc., are configured correctly.
I am not using a custom kernel for the VM, as it doesn't seem like it should be necessary as this is not a large BAR device. I'd also expect if missing drivers were the only problem I'd still be able to start the container.
Starting a container using kata-runtime without attempting vfio passthrough is successful.
Various troubleshooting changes that had no effect:
Adding -v /dev:/dev to podman run
setenforce 0 (unsurprising, as nothing in the audit logs indicates an SELinux problem)
hotplug_vfio_on_root_bus = true, machine_type = "pc"
hotplug_vfio_on_root_bus = true, pcie_root_port = 1 (machine_type = "q35", the default); fails (I think expectedly) with
--privileged, this fails instead with the following. Per logs, though, this seems to occur before attaching the vfio device, as there is no "Start hot-plug VFIO device" entry as before.