Closed karlatec closed 3 years ago
Thanks @karlatec. I can reproduce this issue in my local environment and will provide a fix for it ASAP.
The existing vfio-user code doesn't call spdk_mem_register for the VM memory regions, so when hardware DMA is done between VM memory and physical NVMe drives, vtophys reports an error.
I will fix this.
Could we re-open the issue? After merging the fixes and doing a manual re-test this looks fine, but I'd like to keep the issues open until we can get automated tests in place, so we can get the issues covered (as in issue 1813).
Sure. Using physical NVMe devices with vfio-user still has some issues, but the error isn't the same as in this issue. Let's keep it open for now.
Closing for now. Currently we're gated on a stable QEMU w/ vfio-user which is keeping us from getting this automated in CI. Will reopen if necessary after CI is activated (but we're pretty confident this is fixed).
Current Behavior
App crashes with "Assertion `bdev_ch->io_outstanding > 0' failed." when trying to connect a QEMU VM to a vfio-user socket that points to a subsystem with an NVMe bdev namespace attached.
Steps to Reproduce
i=1
rm -rf /var/run/muser
rm -rf /dev/shm/muser

mkdir -p /var/run/muser
mkdir -p /var/run/muser/iommu_group
mkdir -p /var/run/muser/domain/muser$i/$i
mkdir -p /dev/shm/muser/muser$i

build/bin/nvmf_tgt &
pid=$!
echo "PID: $pid"
sleep 3

scripts/rpc.py bdev_nvme_attach_controller -b Nvme0 -t pcie -a 0000:0b:00.0
scripts/rpc.py nvmf_create_transport --trtype VFIOUSER
scripts/rpc.py nvmf_create_subsystem nqn.2019-07.io.spdk:cnode0 -s SPDK001 -a
scripts/rpc.py nvmf_subsystem_add_ns nqn.2019-07.io.spdk:cnode0 Nvme0n1
scripts/rpc.py nvmf_subsystem_add_listener nqn.2019-07.io.spdk:cnode0 -t VFIOUSER -a /var/run/muser/domain/muser$i/$i -s 0

sleep 1
ln -s /var/run/muser/domain/muser$i/$i /var/run/muser/domain/muser$i/$i/iommu_group
ln -s /var/run/muser/domain/muser$i/$i /var/run/muser/iommu_group/$i
ln -s /var/run/muser/domain/muser$i/$i/bar0 /dev/shm/muser/muser$i/bar0

sudo qemu-vfiouser/build/x86_64-softmmu/qemu-system-x86_64 -m 1024 --enable-kvm -cpu host -smp 2 \
  -object memory-backend-file,id=mem,size=1024M,mem-path=/dev/hugepages,prealloc=yes,share=yes,host-nodes=0,policy=bind \
  -numa node,memdev=mem \
  -vga std -vnc :100 -daemonize -snapshot \
  -monitor telnet:127.0.0.1:10002,server,nowait \
  -pidfile /home/klateck/vhost_test/vms/0/qemu.pid \
  -serial file:/home/klateck/vhost_test/vms/0/serial.log \
  -D /home/klateck/vhost_test/vms/0/qemu.log \
  -chardev file,path=/home/klateck/vhost_test/vms/0/seabios.log,id=seabios \
  -device isa-debugcon,iobase=0x402,chardev=seabios \
  -net user,hostfwd=tcp::10000-:22,hostfwd=tcp::10001-:8765 -net nic \
  -drive file=/home/klateck/spdk_test_image.qcow2,if=none,id=os_disk -device ide-hd,drive=os_disk,bootindex=0 \
  -device vfio-user-pci,socket=/var/run/muser/domain/muser1/1/cntrl
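The fixed `sleep` calls in the steps above could be replaced by polling for the socket that QEMU connects to. This is a minimal sketch, assuming the target creates the `cntrl` socket inside the listener directory once the listener is added (the QEMU command line above points at that path). The function name `wait_for_socket` and the default timeout are hypothetical and not part of SPDK.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SPDK): poll until a UNIX socket exists
# at the given path, or give up after a timeout in seconds (default 10).
wait_for_socket() {
    local path="$1" timeout="${2:-10}" waited=0
    until [ -S "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $path" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

With `i=1` as above, it would be called before starting QEMU as `wait_for_socket /var/run/muser/domain/muser$i/$i/cntrl 10 || exit 1`.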
[2021-03-04 15:00:35.306368] vfio_user.c: 510:acq_map: ERROR: Map ACQ failed, ACQ 3ffde000, errno -1
[2021-03-04 15:00:35.306412] vfio_user.c:1043:map_admin_queue: ERROR: /var/run/muser/domain/muser1/1: failed to map CQ0: -1
[2021-03-04 15:00:35.306441] vfio_user.c:1103:memory_region_add_cb: NOTICE: Failed to map SQID 1 0x3ffd8000-0x3ffdc000, will try again in next poll
[2021-03-04 15:00:35.306512] vfio_user.c: 510:acq_map: ERROR: Map ACQ failed, ACQ 3ffde000, errno -1
[2021-03-04 15:00:35.306528] vfio_user.c:1043:map_admin_queue: ERROR: /var/run/muser/domain/muser1/1: failed to map CQ0: -1
[2021-03-04 15:00:35.306539] vfio_user.c:1103:memory_region_add_cb: NOTICE: Failed to map SQID 1 0x3ffd8000-0x3ffdc000, will try again in next poll
[2021-03-04 15:00:35.306613] vfio_user.c: 510:acq_map: ERROR: Map ACQ failed, ACQ 3ffde000, errno -1
[2021-03-04 15:00:35.306629] vfio_user.c:1043:map_admin_queue: ERROR: /var/run/muser/domain/muser1/1: failed to map CQ0: -1
[2021-03-04 15:00:35.306641] vfio_user.c:1103:memory_region_add_cb: NOTICE: Failed to map SQID 1 0x3ffd8000-0x3ffdc000, will try again in next poll
[2021-03-04 15:00:35.306705] vfio_user.c: 510:acq_map: ERROR: Map ACQ failed, ACQ 3ffde000, errno -1
[2021-03-04 15:00:35.306720] vfio_user.c:1043:map_admin_queue: ERROR: /var/run/muser/domain/muser1/1: failed to map CQ0: -1
[2021-03-04 15:00:35.306731] vfio_user.c:1103:memory_region_add_cb: NOTICE: Failed to map SQID 1 0x3ffd8000-0x3ffdc000, will try again in next poll
[2021-03-04 15:00:35.306787] vfio_user.c: 510:acq_map: ERROR: Map ACQ failed, ACQ 3ffde000, errno -1
[2021-03-04 15:00:35.306802] vfio_user.c:1043:map_admin_queue: ERROR: /var/run/muser/domain/muser1/1: failed to map CQ0: -1
[2021-03-04 15:00:35.306814] vfio_user.c:1103:memory_region_add_cb: NOTICE: Failed to map SQID 1 0x3ffd8000-0x3ffdc000, will try again in next poll
[2021-03-04 15:00:44.130209] nvme_pcie.c: 833:nvme_pcie_prp_list_append: ERROR: vtophys(0x7f3df9a0a000) failed
[2021-03-04 15:00:44.130268] nvme_qpair.c: 268:nvme_io_qpair_print_command: NOTICE: READ sqid:1 cid:895 nsid:1 lba:0 len:8 PRP1 0x0 PRP2 0x0
[2021-03-04 15:00:44.130283] nvme_qpair.c: 452:spdk_nvme_print_completion: NOTICE: INVALID FIELD (00/02) qid:1 cid:895 cdw0:0 sqhd:0000 p:0 m:0 dnr:1
[2021-03-04 15:00:44.130297] bdev_nvme.c:2571:bdev_nvme_readv: ERROR: readv failed: rc = -14
nvmf_tgt: bdev.c:5203: spdk_bdev_io_complete: Assertion `bdev_ch->io_outstanding > 0' failed.
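The log above shows a three-step failure signature: a `vtophys` failure, then `readv failed: rc = -14`, then the assertion. Once automated tests exist, a regression check could scan the target's output for it. This is a minimal sketch, assuming the output of `nvmf_tgt` was saved to a file. The function name `check_tgt_log` is hypothetical, and the patterns are taken only from the log lines in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scan a saved nvmf_tgt log for the symptoms reported
# in this issue. Prints one line per symptom found and returns non-zero
# if any of them is present.
check_tgt_log() {
    local log="$1" rc=0
    # vtophys could not translate the buffer address
    if grep -q 'vtophys(0x[0-9a-f]*) failed' "$log"; then
        echo "vtophys failure"
        rc=1
    fi
    # the read was rejected by the NVMe bdev layer
    if grep -q 'readv failed: rc = -14' "$log"; then
        echo "readv failed with -14"
        rc=1
    fi
    # the assertion that aborts the target
    if grep -q "Assertion \`bdev_ch->io_outstanding > 0' failed" "$log"; then
        echo "io_outstanding assertion"
        rc=1
    fi
    return "$rc"
}
```

For example, `check_tgt_log nvmf_tgt.log || echo "regression detected"`, where `nvmf_tgt.log` is an assumed capture file. On Linux, errno 14 is `EFAULT`, so `rc = -14` is `-EFAULT` (bad address).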