spdk / spdk

Storage Performance Development Kit
https://spdk.io/

Error when using SPDK with QEMU NVMe #531

Closed xxks-kkk closed 5 years ago

xxks-kkk commented 5 years ago

Expected Behavior

I am trying to run SPDK in a QEMU VM with an emulated NVMe device. I can successfully build SPDK and rebind the driver. However, when I run the hello_bdev example via sudo ./hello_bdev, I hit the following error:

Starting SPDK v18.10 / DPDK 18.08.0 initialization...
[ DPDK EAL parameters: hello_bdev --no-shconf -c 0x1 --file-prefix=spdk_pid3361 ]
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
app.c: 602:spdk_app_start: *NOTICE*: Total cores available: 1
reactor.c: 703:spdk_reactors_init: *NOTICE*: Occupied cpu socket mask is 0x1
reactor.c: 490:_spdk_reactor_run: *NOTICE*: Reactor started on core 0 on socket 0
bdev.c: 797:spdk_bdev_initialize: *ERROR*: could not allocate spdk_bdev_io pool
subsystem.c: 120:spdk_subsystem_init_next: *ERROR*: Init subsystem bdev failed
app.c: 689:spdk_app_stop: *WARNING*: spdk_app_stop'd on non-zero
hello_bdev.c: 285:main: *ERROR*: ERROR starting application

Possible Solution

Steps to Reproduce

sudo qemu-system-x86_64 -enable-kvm -curses -m 512 -smp 4 -redir tcp:4444::22 -hda my-disk.img -hdb my-seed.img -drive file=f100M,if=none,id=D22 -device nvme,drive=D22,serial=foo -cpu host -kernel kbuild2/arch/x86_64/boot/bzImage -append "root=/dev/sda1"
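
For completeness, a sketch of the surrounding steps (how the f100M backing file might be created, plus the usual in-VM build/rebind sequence; these exact commands are assumed, not taken from the original report):

# on the host: create the 100 MB backing file for the emulated NVMe drive
truncate -s 100M f100M

# inside the VM: build SPDK, rebind the NVMe device to a userspace driver,
# then run the example (path per the SPDK tree layout)
./configure && make
sudo ./scripts/setup.sh
sudo ./examples/bdev/hello_world/hello_bdev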

Context (Environment including OS version, SPDK version, etc.)

jimharris commented 5 years ago

Hi @xxks-kkk,

How much hugepage memory did you allocate in your VM? I see "-m 512" on the QEMU command line, so I'm guessing you had pretty limited hugepage memory.
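
You can check inside the VM with something like this (a sketch; how many pages you can reserve depends on how much free guest RAM you have):

# show how many hugepages are reserved and free
grep Huge /proc/meminfo
# reserve 128 2MB pages (256 MB) by hand, for example
sudo sh -c 'echo 128 > /proc/sys/vm/nr_hugepages'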

Could you try allocating more hugepage memory? Or override the size of the bdev_io_pool by putting the following in a bdev.conf file in the same directory where you are running the hello_bdev app:

[Bdev]
BdevIoPoolSize 1024

This would reduce the number of bdev_ios in the global pool from the default 64K to only 1K.

Note: it might be worth overriding this default in the hello_bdev application itself. Even if the submitter confirms that increasing the hugepage memory works, let's keep this issue open for discussion.
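
A rough sketch of what that in-app override could look like, assuming the spdk_bdev_opts / spdk_bdev_set_opts API (field names may differ between SPDK versions, so treat this as illustrative, not the actual patch):

#include "spdk/bdev.h"

/* Shrink the global bdev_io pool before the bdev subsystem initializes,
 * so the app can start in low-memory VMs. Call this before spdk_app_start().
 */
static void
shrink_bdev_io_pool(void)
{
	struct spdk_bdev_opts opts;

	spdk_bdev_get_opts(&opts);          /* start from the current defaults */
	opts.bdev_io_pool_size = 1024;      /* down from the 64K default */
	if (opts.bdev_io_cache_size > opts.bdev_io_pool_size) {
		opts.bdev_io_cache_size = opts.bdev_io_pool_size;
	}
	spdk_bdev_set_opts(&opts);
}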

-Jim

xxks-kkk commented 5 years ago

@jimharris

Thanks for the quick response. Tried

[Bdev]
BdevIoPoolSize 1024

with the default QEMU setup, but it doesn't work. However, after I increased the QEMU -m value to 4096 and ran sudo scripts/setup.sh without HUGEPAGES= set, everything works.
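
For later readers: the amount of hugepage memory setup.sh reserves can also be set explicitly. A sketch, assuming your setup.sh version supports the HUGEMEM environment variable (size in MB):

# reserve roughly 1 GB of hugepage memory instead of the script's default
sudo HUGEMEM=1024 scripts/setup.sh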

fshenx commented 5 years ago

@xxks-kkk We can't reproduce this issue. I tested with both "-m 512" and "-m 256"; both succeed, albeit with some error messages. It is recommended to allocate a bit more memory at initialization time; maybe you didn't have enough memory left. Thanks.

./qemu-system-x86_64 -cpu host -smp 8 -m 512 -object memory-backend-file,id=mem,size=512m,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -drive file=/home/shenfurong/Fedora26.qcow2,if=none,id=disk -device ide-hd,drive=disk,bootindex=0 -net user,hostfwd=tcp::10000-:22 -net nic --enable-kvm -drive format=raw,file=/root/test.img,if=none,id=nvmedrive -device nvme,drive=nvmedrive,serial=1234

[root@localhost spdk]# ./scripts/setup.sh
0000:00:04.0 (8086 5845): nvme -> uio_pci_generic
[root@localhost spdk]# ./examples/nvme/hello_world/hello_world
Starting DPDK 17.05.0 initialization...
[ DPDK EAL parameters: hello_world -c 0x1 --file-prefix=spdk0 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 8 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
Initializing NVMe Controllers
EAL: PCI device 0000:00:04.0 on NUMA socket 0
EAL:   probe driver: 8086:5845 spdk_nvme
Attaching to 0000:00:04.0
nvme_qpair.c: 112:nvme_admin_qpair_print_command: *NOTICE*: SET FEATURES (09) sqid:0 cid:63 nsid:0 cdw10:0000000b cdw11:0000001f
nvme_qpair.c: 284:nvme_qpair_print_completion: *NOTICE*: INVALID FIELD (00/02) sqid:0 cid:63 cdw0:0 sqhd:0005 p:1 m:0 dnr:1
nvme_ctrlr.c: 952:nvme_ctrlr_configure_aer: *ERROR*: nvme_ctrlr_cmd_set_async_event_config failed!
nvme_qpair.c: 112:nvme_admin_qpair_print_command: *NOTICE*: GET LOG PAGE (02) sqid:0 cid:63 nsid:ffffffff cdw10:007f00c0 cdw11:00000000
nvme_qpair.c: 284:nvme_qpair_print_completion: *NOTICE*: INVALID OPCODE (00/01) sqid:0 cid:63 cdw0:0 sqhd:0006 p:1 m:0 dnr:1
nvme_ctrlr.c: 352:nvme_ctrlr_set_intel_support_log_pages: *ERROR*: nvme_ctrlr_cmd_get_log_page failed!
Attached to 0000:00:04.0
Using controller QEMU NVMe Ctrl       (1234                ) with 1 namespaces.
  Namespace ID: 1 size: 1GB
Initialization complete.
Hello world!

jimharris commented 5 years ago

Closing this issue - VM needed more memory. Submitter confirmed that adding more memory to the VM fixed the issue.

I did look at the possibility of reducing the size of the bdev_io pool, but that does little to reduce the memory consumption.