spdk / spdk

Storage Performance Development Kit
https://spdk.io/

Initialization timed out in state 22 (wait for identify controller)-RISC-V #3475

Open AssOfCat opened 1 month ago

AssOfCat commented 1 month ago

Sighting report

I am building the latest SPDK release (v24.05.x) in a RISC-V environment. When I run ./build/bin/spdk_nvme_identify, I get the following error:

root@ubuntu:# ./build/bin/spdk_nvme_identify
EAL: WARNING! Base virtual address hint (0x200000005000 != 0x3f97ce3000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x20000000b000 != 0x3f97066000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200000a0c000 != 0x3b97000000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200000c11000 != 0x3f97005000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200001612000 != 0x3796e00000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200001817000 != 0x3b96f9f000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200002218000 != 0x3396c00000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x20000241d000 != 0x3b96f3e000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200002e1e000 != 0x2f96a00000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200003023000 != 0x3b96edd000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200003a24000 != 0x2b96800000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200003c29000 != 0x3b96e7c000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x20000462a000 != 0x26d2400000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x20000482f000 != 0x3b96e1b000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200005230000 != 0x22d2200000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200005435000 != 0x3796d9f000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x200005e36000 != 0x1ed2000000) not respected!
EAL:    This may cause issues with mapping memory into secondary processes
EAL: TSC using RISC-V rdtime.
[2024-08-09 13:44:33.230847] nvme_ctrlr.c:4095:nvme_ctrlr_process_init: *ERROR*: [0003:c3:00.0] Initialization timed out in state 22 (wait for identify controller)
[2024-08-09 13:44:33.231503] nvme.c: 708:nvme_ctrlr_poll_internal: *ERROR*: Failed to initialize SSD: 0003:c3:00.0
[2024-08-09 13:44:33.231597] nvme_ctrlr.c:1043:nvme_ctrlr_fail: *ERROR*: [0003:c3:00.0] in failed state.
[2024-08-09 13:44:33.288557] nvme_ctrlr.c:4095:nvme_ctrlr_process_init: *ERROR*: [0001:41:00.0] Initialization timed out in state 22 (wait for identify controller)
[2024-08-09 13:44:33.288713] nvme.c: 708:nvme_ctrlr_poll_internal: *ERROR*: Failed to initialize SSD: 0001:41:00.0
[2024-08-09 13:44:33.288799] nvme_ctrlr.c:1043:nvme_ctrlr_fail: *ERROR*: [0001:41:00.0] in failed state.
No NVMe controllers found.
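
The EAL warnings make the relevant lines easy to miss. As a reading aid, here is a minimal sketch of a filter that keeps only the controllers that timed out during initialization. The function name `timed_out_ctrlrs` is hypothetical and not part of SPDK; the pattern is written against the exact message format shown in the log above and may need adjusting for other SPDK versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPDK): read an SPDK log on stdin and print
# one line per controller whose initialization timed out, in the form
#   <BDF> state=<number> (<state name>)
# The pattern assumes the message format seen in the log above.
timed_out_ctrlrs() {
    sed -nE 's/.*\*ERROR\*: \[([0-9a-fA-F:.]+)\] Initialization timed out in state ([0-9]+) \(([^)]*)\).*/\1 state=\2 (\3)/p'
}
```

Usage, assuming the log is written to stderr: `./build/bin/spdk_nvme_identify 2>&1 | timed_out_ctrlrs`. For the log above this would report 0003:c3:00.0 and 0001:41:00.0, both in state 22.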

Expected Behavior

The SSD should be identified.

Current Behavior

Initialization of both controllers times out in state 22 (wait for identify controller).

Possible Solution

I tried the following:

  1. Ran fio with the kernel NVMe driver; it works properly.
  2. Manually invalidated the CPU cache for the doorbell write: spdk_mmio_write_4(pqpair->sq_tdbl, pqpair->sq_tail)
  3. Set the machine's IOMMU to passthrough mode by editing the GRUB configuration.
  4. Set the timeout to infinity.

None of these resolved the problem.
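
To confirm that attempt 3 actually took effect after reboot, the running kernel's command line can be inspected. The sketch below is only an illustration: the helper name `has_iommu_passthrough` is hypothetical, and it assumes passthrough was requested with `iommu.passthrough=1` or `iommu=pt`. The report does not say which parameter was used, and the kernel in question may honor a different one.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel command line requests IOMMU
# passthrough. Assumption: the request is spelled iommu.passthrough=1 or
# iommu=pt; the report does not state which parameter was set in GRUB.
has_iommu_passthrough() {
    # $1: command line text; defaults to the running kernel's command line
    cmdline=${1:-$(cat /proc/cmdline)}
    for arg in $cmdline; do
        case "$arg" in
            iommu.passthrough=1|iommu=pt) return 0 ;;
        esac
    done
    return 1
}
```

Usage: `has_iommu_passthrough && echo "passthrough requested"`. A missing parameter here would suggest the GRUB change was not applied, which is worth ruling out before looking deeper.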

Steps to Reproduce

  1. Check the NVMe disks
    
    root@ubuntu:# ./scripts/setup.sh status
    Hugepages
    node     hugesize     free /  total
    node0   1048576kB        0 /      0
    node0      2048kB        0 /      0
    node1   1048576kB        0 /      0
    node1      2048kB        0 /      0
    node2   1048576kB        0 /      0
    node2      2048kB        0 /      0
    node3   1048576kB        0 /      0
    node3      2048kB        0 /      0

    Type     BDF             Vendor Device NUMA    Driver           Device     Block devices
    NVMe     0001:41:00.0    144d   a80a   0       nvme             nvme0      nvme0n1
    NVMe     0003:c3:00.0    1e49   0021   0       nvme             nvme1      nvme1n1

  2. Run setup.sh

    root@ubuntu:# HUGE_EVEN_ALLOC="yes" ./scripts/setup.sh
    0003:c3:00.0 (1e49 0021): nvme -> uio_pci_generic
    0001:41:00.0 (144d a80a): nvme -> uio_pci_generic

  3. Check the status

    root@ubuntu:# ./scripts/setup.sh status
    Hugepages
    node     hugesize     free /  total
    node0   1048576kB        0 /      0
    node0      2048kB      256 /    256
    node1   1048576kB        0 /      0
    node1      2048kB      256 /    256
    node2   1048576kB        0 /      0
    node2      2048kB      256 /    256
    node3   1048576kB        0 /      0
    node3      2048kB      256 /    256

    Type     BDF             Vendor Device NUMA    Driver           Device     Block devices
    NVMe     0001:41:00.0    144d   a80a   0       uio_pci_generic  -          -
    NVMe     0003:c3:00.0    1e49   0021   0       uio_pci_generic  -          -

  4. Run `identify`

    ./build/bin/spdk_nvme_identify
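
The four steps above can be collected into one script for anyone trying to reproduce the problem. This is a sketch, assuming it is run as root from the top of the SPDK source tree. The `run` and `reproduce` functions and the `DRY_RUN` switch are hypothetical additions for illustration, not SPDK features; the commands themselves are the ones from the report.

```shell
#!/bin/sh
# Hypothetical wrapper around the reproduction steps in this report.
# DRY_RUN=1 only prints each command instead of executing it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # 1. Show hugepages and NVMe devices while still bound to the kernel driver
    run ./scripts/setup.sh status &&
    # 2. Allocate hugepages evenly across NUMA nodes and rebind the devices
    run env HUGE_EVEN_ALLOC=yes ./scripts/setup.sh &&
    # 3. Confirm the devices are now bound to uio_pci_generic
    run ./scripts/setup.sh status &&
    # 4. Run the identify tool, which is where the timeout appears
    run ./build/bin/spdk_nvme_identify
}
```

Calling `DRY_RUN=1 reproduce` lists the commands without touching the devices; calling `reproduce` executes them. Afterwards, `./scripts/setup.sh reset` should rebind the devices to the kernel nvme driver.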


## Context

Linux ubuntu 6.5.0-rc1+ #2 SMP Sun Dec 10 16:31:34 CST 2023 riscv64 riscv64 riscv64 GNU/Linux

    root@ubuntu:# lscpu
    Architecture:          riscv64
      Byte Order:          Little Endian
    CPU(s):                64
      On-line CPU(s) list: 0-63
    NUMA:
      NUMA node(s):        4
      NUMA node0 CPU(s):   0-7,16-23
      NUMA node1 CPU(s):   8-15,24-31
      NUMA node2 CPU(s):   32-39,48-55
      NUMA node3 CPU(s):   40-47,56-63


SPDK version: [v24.05](https://github.com/spdk/spdk/releases/tag/v24.05)

Ref: [issue #2254](https://github.com/spdk/spdk/issues/2254)

ksztyber commented 4 weeks ago

[Bug scrub] RISC-V support is not fully functional. Marking the issue as an enhancement request to make it work.