Closed: karlatec closed this issue 3 years ago
A soft reboot with a physical NVMe device attached will always cause some failures; there are still some issues in QEMU/libvfio-user that we haven't figured out.
Here is the error I hit when using a physical NVMe device as the backend:
qemu-system-x86_64: kvm_set_user_memory_region: KVM_SET_USER_MEMORY_REGION failed, slot=4, start=0xc0000, size=0xbff40000: File exists
kvm_set_phys_mem: error registering slot: File exists
Aborted (core dumped)
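As a quick sanity check on the failing slot, the end address of the region can be computed from the `start` and `size` values in the message. This is only arithmetic on the logged numbers. Reading `File exists` (EEXIST) as "the new slot overlaps one that is still registered" is an assumption based on how KVM usually reports overlapping memory slots; it is not confirmed in this report. `slot_end` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Values copied from the KVM_SET_USER_MEMORY_REGION error message above.
start=0xc0000
size=0xbff40000

# slot_end (hypothetical helper): print the exclusive end address of a
# region given its start address and size, both as hex strings.
slot_end() {
    printf '0x%x\n' $(( $1 + $2 ))
}

slot_end "$start" "$size"   # prints 0xc0000000
```

The slot therefore covers guest-physical addresses from 0xc0000 up to the 3 GiB boundary.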
Linking this issue to https://github.com/nutanix/libvfio-user/issues/439; we will also track it there.
Here is the patch to fix it: https://review.spdk.io/gerrit/c/spdk/spdk/+/7689. We need to add device reset support in SPDK and take care of the memory regions that are registered with SPDK.
Unregistering the DMA regions might be something libvfio-user should do; I've started a discussion on Slack.
I updated the submodule via https://review.spdk.io/gerrit/c/spdk/spdk/+/7831. I tested it, and this issue has been fixed.
The patch has been merged. @karlatec, the issues that blocked the vfio-user performance tests have been fixed, so I think we can start the performance tests now.
Current Behavior
After restarting a QEMU VM that is attached to a vfio-user socket, a number of errors are displayed:
Steps to Reproduce
i=1
rm -rf /var/run/muser
rm -rf /dev/shm/muser
mkdir -p /var/run/muser
mkdir -p /var/run/muser/iommu_group
mkdir -p /var/run/muser/domain/muser$i/$i
mkdir -p /dev/shm/muser/muser$i
sleep 1
build/bin/nvmf_tgt -m [12] &
pid=$!
echo "PID: $pid"
sleep 3
scripts/rpc.py bdev_nvme_attach_controller -b Nvme0 -t pcie -a 0000:0b:00.0
scripts/rpc.py nvmf_create_transport --trtype VFIOUSER
scripts/rpc.py nvmf_create_subsystem nqn.2019-07.io.spdk:cnode0 -s SPDK001 -a
scripts/rpc.py nvmf_subsystem_add_ns nqn.2019-07.io.spdk:cnode0 Nvme0n1
scripts/rpc.py nvmf_subsystem_add_listener nqn.2019-07.io.spdk:cnode0 -t VFIOUSER -a /var/run/muser/domain/muser$i/$i -s 0
sleep 1
ln -s /var/run/muser/domain/muser$i/$i /var/run/muser/domain/muser$i/$i/iommu_group
ln -s /var/run/muser/domain/muser$i/$i /var/run/muser/iommu_group/$i
ln -s /var/run/muser/domain/muser$i/$i/bar0 /dev/shm/muser/muser$i/bar0
taskset -a -c 1-2 /home/klateck/work/qemu-vfiouser/build/qemu-system-x86_64 -m 1024 --enable-kvm \
  -cpu host -smp 2 -vga std -vnc :100 -daemonize \
  -object memory-backend-file,id=mem,size=1024M,mem-path=/dev/hugepages,share=on,prealloc=yes,host-nodes=0,policy=bind \
  -snapshot -monitor telnet:127.0.0.1:10002,server,nowait \
  -numa node,memdev=mem \
  -pidfile /home/klateck/vhost_test/vms/0/qemu.pid \
  -serial file:/home/klateck/vhost_test/vms/0/serial.log \
  -D /home/klateck/vhost_test/vms/0/qemu.log \
  -chardev file,path=/home/klateck/vhost_test/vms/0/seabios.log,id=seabios \
  -device isa-debugcon,iobase=0x402,chardev=seabios \
  -net user,hostfwd=tcp::10000-:22,hostfwd=tcp::10001-:8765 \
  -net nic -drive file=/home/sys_sgci/spdk_dependencies/spdk_test_image.qcow2,if=none,id=os_disk \
  -device ide-hd,drive=os_disk,bootindex=0 \
  -device vfio-user-pci,socket=/var/run/muser/domain/muser1/1/cntrl
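The fixed `sleep` delays in the steps above could, as a rough sketch, be replaced by polling for the path that the next step depends on. `wait_for_path` is a hypothetical helper, not part of SPDK or QEMU. The sketch assumes that the vfio-user socket shows up as a filesystem entry at the `cntrl` path used in the QEMU command once the listener has been added.

```shell
#!/usr/bin/env bash
# wait_for_path (hypothetical helper): poll once per second until the given
# path exists, or give up after the timeout (in seconds, default 10).
# Returns 0 if the path appeared, 1 on timeout.
wait_for_path() {
    local path=$1 timeout=${2:-10} waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

For example, `wait_for_path /var/run/muser/domain/muser1/1/cntrl 10 || exit 1` before launching QEMU would stop the script early if the socket never appears.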
[root@vhost32-cloud-12806 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 3G 0 disk
└─sda1 8:1 0 3G 0 part /
nvme0n1 259:1 0 372.6G 0 disk
[root@vhost32-cloud-12806 ~]# sudo poweroff
Connection to 127.0.0.1 closed by remote host.
Connection to 127.0.0.1 closed.
[2021-04-28 12:57:43.651374] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3aae00000-0x7fe3eae00000 failed
[2021-04-28 12:57:43.668633] vfio_user.c:1177:memory_region_remove_cb: ERROR: Memory region unregister 0x7fe3aae00000-0x7fe3eae00000 failed
[2021-04-28 12:57:43.669019] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3aac00000-0x7fe3eac00000 failed
[2021-04-28 12:57:43.670388] vfio_user.c:1177:memory_region_remove_cb: ERROR: Memory region unregister 0x7fe3aac00000-0x7fe3eac00000 failed
[2021-04-28 12:57:43.670464] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3aae00000-0x7fe3eae00000 failed
[2021-04-28 12:57:43.691523] vfio_user.c:1177:memory_region_remove_cb: ERROR: Memory region unregister 0x7fe3aae00000-0x7fe3eae00000 failed
[2021-04-28 12:57:43.691602] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3ab000000-0x7fe3eb000000 failed
[2021-04-28 12:57:43.692692] vfio_user.c:1177:memory_region_remove_cb: ERROR: Memory region unregister 0x7fe3ab000000-0x7fe3eb000000 failed
[2021-04-28 12:57:43.692948] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3aae00000-0x7fe3eae00000 failed
[2021-04-28 12:57:43.750532] vfio_user.c:1177:memory_region_remove_cb: ERROR: Memory region unregister 0x7fe3aae00000-0x7fe3eae00000 failed
[2021-04-28 12:57:43.750691] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3eaa00000-0x7fe3eac00000 failed
[2021-04-28 12:57:43.751007] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3aa400000-0x7fe3ea400000 failed
[2021-04-28 12:57:43.752075] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3aa200000-0x7fe3aa400000 failed
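To make the log above easier to read, the failing ranges can be reduced to an operation, a range, and a size. The following is a minimal sketch that only parses the message format shown in this report; `region_size_mib` and `summarize_log` are hypothetical helper names, and it assumes GNU or BSD `grep` with `-o` support.

```shell
#!/usr/bin/env bash
# region_size_mib (hypothetical helper): take "START-END" with hex addresses
# and print the size of the range in MiB.
region_size_mib() {
    local start=${1%-*} end=${1#*-}
    echo $(( (end - start) / 1048576 ))
}

# summarize_log (hypothetical helper): read SPDK log lines on stdin and print
# "<register|unregister> <range> <size> MiB" for each failed memory region.
summarize_log() {
    grep -oE 'Memory region (register|unregister) 0x[0-9a-f]+-0x[0-9a-f]+ failed' |
    while read -r word1 word2 op range rest; do
        echo "$op $range $(region_size_mib "$range") MiB"
    done
}

# Two lines copied from the log above as sample input.
summarize_log <<'EOF'
[2021-04-28 12:57:43.651374] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3aae00000-0x7fe3eae00000 failed
[2021-04-28 12:57:43.750691] vfio_user.c:1104:memory_region_add_cb: ERROR: Memory region register 0x7fe3eaa00000-0x7fe3eac00000 failed
EOF
# prints:
# register 0x7fe3aae00000-0x7fe3eae00000 1024 MiB
# register 0x7fe3eaa00000-0x7fe3eac00000 2 MiB
```

The large failing ranges are 1024 MiB each, the same size as the 1024M memory backend in the QEMU command. Several of them overlap one another at 2 MiB offsets. Whether that overlap is the reason the registrations fail is not established here.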