spdk / spdk

Storage Performance Development Kit
https://spdk.io/

Failed to restore FTL bdev using uuid #1144

Closed by GaoBo1997 4 years ago

GaoBo1997 commented 4 years ago

Expected Behavior

I opened two terminal windows. One runs ./scripts/setup.sh and ./app/spdk_tgt/spdk_tgt, and the other runs the following commands:

./scripts/rpc.py bdev_ftl_create -b nvme0 -a 00:04.0
{
  "name": "nvme0",
  "uuid": "fb40a328-b0d7-4368-a640-f762d729c0f3"
}
./scripts/rpc.py bdev_ftl_delete -b nvme0
true
./scripts/rpc.py bdev_ftl_create -b nvme0  -a 00:04.0 -u fb40a328-b0d7-4368-a640-f762d729c0f3

The last command should restore the metadata from the device and create the FTL bdev successfully.
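
The sequence above can be scripted so the UUID does not have to be copied by hand. This is a minimal sketch, assuming the first bdev_ftl_create reply is the JSON object shown above. The helper name restore_argv is hypothetical, and only the -b, -a and -u options that appear in this report are used.

```python
import json

def restore_argv(create_reply, traddr, rpc="./scripts/rpc.py"):
    """Build the argv for the restoring bdev_ftl_create call from the first reply."""
    reply = json.loads(create_reply)
    return [rpc, "bdev_ftl_create",
            "-b", reply["name"],   # reuse the bdev name
            "-a", traddr,          # same PCIe address as the first call
            "-u", reply["uuid"]]   # UUID returned by the first create

# Reply printed by the first create call in this report.
first_reply = '{"name": "nvme0", "uuid": "fb40a328-b0d7-4368-a640-f762d729c0f3"}'
argv = restore_argv(first_reply, "00:04.0")
print(" ".join(argv))
```

The resulting list could be handed to subprocess.run once spdk_tgt is up and the bdev has been deleted.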

Current Behavior

Instead, the second bdev_ftl_create call fails:

request:
{
  "name": "nvme0",
  "trtype": "pcie",
  "traddr": "00:04.0",
  "uuid": "fb40a328-b0d7-4368-a640-f762d729c0f3",
  "allow_open_bands": false,
  "method": "bdev_ftl_create",
  "req_id": 1
}
Got JSON-RPC error response
response:
{
  "code": -32603,
  "message": "Failed to create FTL bdev: No such device"
}
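
A note on reading this response: -32603 is the generic "Internal error" code defined by JSON-RPC 2.0, so the useful detail is in the message. "No such device" is the usual text for errno ENODEV (19 on Linux), which lines up with the -19 reported by bdev_ftl_create_cb in the target log. A small sketch of decoding such a response follows; the describe function is hypothetical.

```python
import errno
import json

JSONRPC_INTERNAL_ERROR = -32603  # "Internal error" in JSON-RPC 2.0

def describe(response_text):
    """Summarize a JSON-RPC error object like the one in this report."""
    err = json.loads(response_text)
    kind = "internal error" if err["code"] == JSONRPC_INTERNAL_ERROR else "other error"
    return f"{kind}: {err['message']}"

response = '{"code": -32603, "message": "Failed to create FTL bdev: No such device"}'
summary = describe(response)
print(summary)

# ENODEV is the errno behind "No such device"; the target log shows it negated.
print(errno.errorcode[errno.ENODEV], -errno.ENODEV)
```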

More information:

The window running ./app/spdk_tgt/spdk_tgt shows:

root@virspdk:~/spdk_1_12/spdk# ./app/spdk_tgt/spdk_tgt 
Starting SPDK v20.01-pre git sha1 159703c9d / DPDK 19.11.0 initialization...
[ DPDK EAL parameters: spdk_tgt --no-shconf -c 0x1 --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --iova-mode=pa --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk_pid32235 ]
app.c: 642:spdk_app_start: *NOTICE*: Total cores available: 1
reactor.c: 316:_spdk_reactor_run: *NOTICE*: Reactor started on core 0
Bands validity:
 Band   1:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed

device UUID:         fb40a328-b0d7-4368-a640-f762d729c0f3
total valid LBAs:    0
total writes:        32688
user writes:         0
WAF:                 inf
limits:
  crit: 0
  high: 0
   low: 0
 start: 0
ftl_restore.c: 299:ftl_restore_head_complete: *ERROR*: Band sequence consistency failed
ftl_init.c: 928:ftl_restore_md_cb: *ERROR*: Failed to restore the metadata from the SSD
Bands validity:
 Band   1:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   2:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   3:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   4:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   5:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   6:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   7:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   8:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band   9:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  10:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  11:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  12:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  13:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  14:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  15:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  16:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  17:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  18:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  19:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  20:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  21:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  22:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  23:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  24:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  25:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  26:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  27:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  28:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  29:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  30:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  31:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  32:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  33:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  34:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  35:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  36:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  37:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  38:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  39:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  40:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  41:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  42:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  43:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  44:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  45:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  46:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  47:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  48:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  49:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  50:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  51:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  52:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  53:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  54:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  55:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  56:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  57:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  58:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  59:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed
 Band  60:        0 / 32688     num_zones: 8    wr_cnt: 1   merit:     0.000    state: closed

device UUID:         fb40a328-b0d7-4368-a640-f762d729c0f3
total valid LBAs:    0
total writes:        0
user writes:         0
WAF:                 -nan
limits:
  crit: 0
  high: 0
   low: 0
 start: 0
bdev_ftl.c: 653:bdev_ftl_create_cb: *ERROR*: Failed to create FTL device (-19)
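
The "WAF: inf" and "WAF: -nan" lines in the two dumps above are consistent with a write amplification factor computed as total writes divided by user writes in floating point: 32688 / 0 gives infinity and 0 / 0 gives NaN. That formula is an assumption about how the dump is produced and is not confirmed in this thread. Python raises an exception on float division by zero, so the sketch reproduces the IEEE 754 results explicitly.

```python
import math

def waf(total_writes, user_writes):
    """Write amplification as total / user, following IEEE 754 for a zero divisor."""
    if user_writes == 0:
        return math.nan if total_writes == 0 else math.inf
    return total_writes / user_writes

first_dump = waf(32688, 0)   # counters from the first dump
second_dump = waf(0, 0)      # counters from the second dump
print(first_dump, second_dump)
```

Under that reading, both values only reflect that no user data had been written; the actual failure is the "Band sequence consistency failed" error.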

Run ./examples/nvme/identify/identify:

root@virspdk:~/spdk_1_12/spdk# ./examples/nvme/identify/identify 
Starting SPDK v20.01-pre git sha1 159703c9d / DPDK 19.11.0 initialization...
[ DPDK EAL parameters: identify --no-shconf -c 0x1 -n 1 -m 0 --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --iova-mode=pa --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk_pid32241 ]
nvme_qpair.c: 116:nvme_admin_qpair_print_command: *NOTICE*: GET FEATURES (0a) sqid:0 cid:86 nsid:0 cdw10:000000ca cdw11:00000000
nvme_qpair.c: 304:spdk_nvme_qpair_print_completion: *NOTICE*: INVALID FIELD (00/02) sqid:0 cid:86 cdw0:0 sqhd:000f p:1 m:0 dnr:1
get_feature(0xCA) failed
=====================================================
NVMe Controller at 0000:00:04.0 [1d1d:1f1f]
=====================================================
Controller Capabilities/Features
================================
Vendor ID:                             1d1d
Subsystem Vendor ID:                   1af4
Serial Number:                         deadbeef
Model Number:                          QEMU NVMe OCSSD Ctrl
Firmware Version:                      2.0
Recommended Arb Burst:                 6
IEEE OUI Identifier:                   00 02 b3
Multi-path I/O
  May have multiple subsystem ports:   No
  May be connected to multiple hosts:  No
  Associated with SR-IOV VF:           No
Max Data Transfer Size:                524288
Max Number of Namespaces:              1
Error Recovery Timeout:                Unlimited
NVMe Specification Version (VS):       1.2
NVMe Specification Version (Identify): 1.2
Maximum Queue Entries:                 2048
Contiguous Queues Required:            Yes
Arbitration Mechanisms Supported
  Weighted Round Robin:                Supported
  Vendor Specific:                     Not Supported
Reset Timeout:                         7500 ms
Doorbell Stride:                       4 bytes
NVM Subsystem Reset:                   Not Supported
Command Sets Supported
  NVM Command Set:                     Supported
Boot Partition:                        Not Supported
Memory Page Size Minimum:              4096 bytes
Memory Page Size Maximum:              4096 bytes
Optional Asynchronous Events Supported
  Namespace Attribute Notices:         Not Supported
  Firmware Activation Notices:         Not Supported
128-bit Host Identifier:               Not Supported

Controller Memory Buffer Support
================================
Supported:                             No

Admin Command Set Attributes
============================
Security Send/Receive:                 Not Supported
Format NVM:                            Supported
Firmware Activate/Download:            Not Supported
Namespace Management:                  Not Supported
Device Self-Test:                      Not Supported
Directives:                            Not Supported
NVMe-MI:                               Not Supported
Virtualization Management:             Not Supported
Doorbell Buffer Config:                Not Supported
Abort Command Limit:                   4
Async Event Request Limit:             4
Number of Firmware Slots:              N/A
Firmware Slot 1 Read-Only:             N/A
Firmware Update Granularity:           No Information Provided
Per-Namespace SMART Log:               No
Command Effects Log Page:              Not Supported
Get Log Page Extended Data:            Supported
Telemetry Log Pages:                   Not Supported
Error Log Page Entries Supported:      4
Keep Alive:                            Not Supported

NVM Command Set Attributes
==========================
Submission Queue Entry Size
  Max:                       64
  Min:                       64
Completion Queue Entry Size
  Max:                       16
  Min:                       16
Number of Namespaces:        1
Compare Command:             Not Supported
Write Uncorrectable Command: Not Supported
Dataset Management Command:  Supported
Write Zeroes Command:        Not Supported
Set Features Save Field:     Not Supported
Reservations:                Not Supported
Timestamp:                   Not Supported
Volatile Write Cache:        Present
Atomic Write Unit (Normal):  1
Atomic Write Unit (PFail):   1
Atomic Compare & Write Unit: 1
Fused Compare & Write:       Not Supported
Scatter-Gather List
  SGL Command Set:           Supported
  SGL Keyed:                 Not Supported
  SGL Bit Bucket Descriptor: Not Supported
  SGL Metadata Pointer:      Not Supported
  Oversized SGL:             Not Supported
  SGL Metadata Address:      Supported
  SGL Offset:                Not Supported
  Transport SGL Data Block:  Not Supported
Replay Protected Memory Block:  Not Supported

Firmware Slot Information
=========================
Active slot:                 1
Slot 4 Firmware Revision:    .j......
Slot 6 Firmware Revision:    ........

Error Log
=========

Arbitration
===========
Arbitration Burst:           64
Low Priority Weight:         8
Medium Priority Weight:      16
High Priority Weight:        32

Power Management
================
Number of Power States:      1
Current Power State:         Power State #0
Power State #0:  Max Power:  25.00 W
Non-Operational Permissive Mode: Not Supported

Health Information
==================
Critical Warnings:
  Available Spare Space:     WARNING
  Temperature:               OK
  Device Reliability:        OK
  Read Only:                 No
  Volatile Memory Backup:    OK
Current Temperature:         323 Kelvin (50 Celsius)
Temperature Threshold:       333 Kelvin (60 Celsius)
Available Spare:             0%
Available Spare Threshold:   20%
Life Percentage Used:        0%
Data Units Read:             17001184
Data Units Written:          404615424
Host Read Commands:          4980
Host Write Commands:         36834
Controller Busy Time:        0 minutes
Power Cycles:                0
Power On Hours:              1 hours
Unsafe Shutdowns:            0
Unrecoverable Media Errors:  0
Lifetime Error Log Entries:  0
Warning Temperature Time:    0 minutes
Critical Temperature Time:   0 minutes

Number of Queues
================
Number of I/O Submission Queues:      63
Number of I/O Completion Queues:      63

Active Namespaces
=================
Namespace ID:1
Deallocate:                            Supported
Deallocated/Unwritten Error:           Supported
Deallocated Read Value:                All 0x00
Deallocate in Write Zeroes:            Not Supported
Deallocated Guard Field:               0xFFFF
Flush:                                 Supported
Reservation:                           Not Supported
Metadata Transferred as:               Separate Metadata Buffer
Namespace Sharing Capabilities:        Private
Size (in LBAs):                        2097152 (2M)
Capacity (in LBAs):                    2097152 (2M)
Utilization (in LBAs):                 2097152 (2M)
Thin Provisioning:                     Not Supported
Per-NS Atomic Units:                   No
NGUID/EUI64 Never Reused:              No
Number of LBA Formats:                 1
Current LBA Format:                    LBA Format #00
LBA Format #00: Data Size:  4096  Metadata Size:    16

Namespace OCSSD Geometry
=======================
OC version:                     maj:2 min:0
LBA format:
  Group bits:                   1
  PU bits:                      2
  Chunk bits:                   6
  Logical block bits:           12
Media and Controller Capabilities:
  Namespace supports Vector Chunk Copy:                 Not Supported
  Namespace supports multiple resets a free chunk:      Not Supported
Wear-level Index Delta Threshold:                       0
Groups (channels):              2
PUs (LUNs) per group:           4
Chunks per LUN:                 60
Logical blks per chunk:         4096
MIN write size:                 4
OPT write size:                 8
Cache min write size:           24
Max open chunks:                0
Max open chunks per PU:         0
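
The geometry above can be cross-checked against the band dump in the spdk_tgt log with a little arithmetic. The sketch assumes that one band spans one chunk from every parallel unit, which is consistent with "num_zones: 8" and the 60 bands in the log. Reading the remaining gap of 80 blocks per band as space reserved for band metadata is an assumption.

```python
# Values copied from the "Namespace OCSSD Geometry" output.
groups, pus_per_group = 2, 4
chunks_per_pu = 60
blocks_per_chunk = 4096
lba_bits = 1 + 2 + 6 + 12          # group + PU + chunk + logical block bits

parallel_units = groups * pus_per_group             # compare with "num_zones: 8"
raw_band_blocks = parallel_units * blocks_per_chunk
overhead_blocks = raw_band_blocks - 32688           # 32688 is the per-band figure in the log
num_bands = chunks_per_pu                           # compare with Band 1..60
address_space = 2 ** lba_bits                       # compare with "Size (in LBAs)"

print(parallel_units, raw_band_blocks, overhead_blocks, num_bands, address_space)
```

The address space is larger than 2 * 4 * 60 * 4096 because the 6 chunk bits can address 64 chunks per PU, of which only 60 exist.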

OCSSD Chunk Info Glance
======================
------------
Chunk index:                    0
Chunk state:                    Closed(0x2)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               3
Starting LBA:                   0
Number of blocks in chunk:      4096
Write Pointer:                  4096
------------
Chunk index:                    1
Chunk state:                    Free(0x1)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               0
Starting LBA:                   4096
Number of blocks in chunk:      4096
Write Pointer:                  0
------------
Chunk index:                    2
Chunk state:                    Free(0x1)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               0
Starting LBA:                   8192
Number of blocks in chunk:      4096
Write Pointer:                  0
------------
Chunk index:                    3
Chunk state:                    Free(0x1)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               0
Starting LBA:                   12288
Number of blocks in chunk:      4096
Write Pointer:                  0
------------
Chunk index:                    4
Chunk state:                    Free(0x1)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               0
Starting LBA:                   16384
Number of blocks in chunk:      4096
Write Pointer:                  0
------------
Chunk index:                    5
Chunk state:                    Free(0x1)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               0
Starting LBA:                   20480
Number of blocks in chunk:      4096
Write Pointer:                  0
------------
Chunk index:                    6
Chunk state:                    Free(0x1)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               0
Starting LBA:                   24576
Number of blocks in chunk:      4096
Write Pointer:                  0
------------
Chunk index:                    7
Chunk state:                    Free(0x1)
Chunk type (write mode):        Sequential Write
Chunk type (size_deviate):      No
Wear-level Index:               0
Starting LBA:                   28672
Number of blocks in chunk:      4096
Write Pointer:                  0

I used qemu-nvme to create an OCSSD image and attach it to the VM:

qemu-img create -f ocssd -o num_grp=2,num_pu=4,num_chk=60 ocssd.img

sudo /home/gao/Desktop/hdd/SPDK_Lab/qemu-nvme-cmd/bin/qemu-system-x86_64 -m 4G -enable-kvm \
-cpu qemu64,+ssse3,+sse4.1,+sse4.2 -smp 2 \
-drive file=/home/gao/Desktop/hdd/SPDK_Lab/ubuntu.raw,format=raw  \
-drive file=/home/gao/Desktop/hdd/SPDK_Lab/ocssd.img,id=myocssd,format=raw,if=none \
-device nvme,drive=myocssd,serial=deadbeef  \
-net user,hostfwd=tcp::10022-:22 \
-net nic

Possible Solution

Maybe my method is wrong...

Steps to Reproduce

1. qemu-nvme

git clone https://github.com/OpenChannelSSD/qemu-nvme.git

cd qemu-nvme
./configure --target-list=x86_64-softmmu --prefix=$HOME/qemu-nvme
make
make install

qemu-img create -f ocssd -o num_grp=2,num_pu=4,num_chk=60 ocssd.img
sudo /home/gao/Desktop/hdd/SPDK_Lab/qemu-nvme-cmd/bin/qemu-system-x86_64 -m 4G -enable-kvm \
-cpu qemu64,+ssse3,+sse4.1,+sse4.2 -smp 2 \
-drive file=/home/gao/Desktop/hdd/SPDK_Lab/ubuntu.raw,format=raw  \
-drive file=/home/gao/Desktop/hdd/SPDK_Lab/ocssd.img,id=myocssd,format=raw,if=none \
-device nvme,drive=myocssd,serial=deadbeef  \
-net user,hostfwd=tcp::10022-:22 \
-net nic

2. spdk

./scripts/setup.sh

./app/spdk_tgt/spdk_tgt

./scripts/rpc.py bdev_ftl_create -b nvme0 -a 00:04.0
(uuid xxx_xxx)
./scripts/rpc.py bdev_ftl_delete -b nvme0

./scripts/rpc.py bdev_ftl_create -b nvme0 -a 00:04.0 -u (xxx_xxx, same as last create)

Context (Environment including OS version, SPDK version, etc.)

The SPDK version is the latest source, downloaded today. According to the startup banner, it is SPDK v20.01-pre git sha1 159703c9d / DPDK 19.11.0.

mmkayPL commented 4 years ago

Hey, could you post information about the exact QEMU version you're using? FTL officially works on the OCSSD surfaced by this version of QEMU: https://github.com/spdk/qemu/tree/spdk-3.0.0, and some of the settings in your output suggest this isn't what's being used here. In the meantime, could you also check with the following geometry settings: num_grp=1,num_pu=8 and Metadata Size: 0?

GaoBo1997 commented 4 years ago

I use this QEMU: https://github.com/OpenChannelSSD/qemu-nvme. I read the FTL guide (https://spdk.io/doc/ftl.html), but I don't know what the setting lmetadata=/path/to/md/file means. I will try https://github.com/spdk/qemu/tree/spdk-3.0.0 later and report back.

GaoBo1997 commented 4 years ago

OK. With this version of QEMU (https://github.com/spdk/qemu/tree/spdk-3.0.0), everything works. The FTL guide at https://spdk.io/doc/ftl.html needs to be updated, though: lmetadata=/path/to/md/file is not recognized, while metadata=/path/to/md/file is accepted. Also, the -l option of the bdev_ftl_create command is no longer recognized.

One more question: why is the metadata file separate from the OCSSD file? In pblk, the metadata is stored in the OCSSD file itself. And how do I calculate the size of the metadata file? I would appreciate it if you could answer my question. Thank you!

mmkayPL commented 4 years ago

Thanks for pointing out the documentation mismatch. The version of QEMU that SPDK's fork was based on didn't initially have any sector metadata support. The separate file was at first used only to save information about sector state in OCSSD drives (i.e. erased, written, etc.) and was eventually used for sector metadata as well.

GaoBo1997 commented 4 years ago

Thank you for your help! My problem has been solved, so I will close the issue.