openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

multipath nvme does not populate 'devid' property #16709

Open JKDingwall opened 3 weeks ago

System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  22.04.5
Kernel Version        6.8.0-48-generic
Architecture          amd64
OpenZFS Version       2.2.6

Describe the problem you're observing

We have two Dell T550 systems, each with a different model of NVMe drive. On one system the NVMe devices present as multipath; on the other they do not.

The multipath NVMe system:

# nvme list
Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev  
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          S6XXXXXXXXXXX97       Dell Ent NVMe v2 AGN RI U.2 3.84TB       1           3.84  TB /   3.84  TB    512   B +  0 B   2.3.0   
/dev/nvme1n1          S6XXXXXXXXXXX94       Dell Ent NVMe v2 AGN RI U.2 3.84TB       1           3.84  TB /   3.84  TB    512   B +  0 B   2.3.0   
/dev/nvme2n1          S6XXXXXXXXXXX92       Dell Ent NVMe v2 AGN RI U.2 3.84TB       1           3.84  TB /   3.84  TB    512   B +  0 B   2.3.0   
/dev/nvme3n1          S6XXXXXXXXXXX89       Dell Ent NVMe v2 AGN RI U.2 3.84TB       1           3.84  TB /   3.84  TB    512   B +  0 B   2.3.0   

The non-multipath NVMe system:

# nvme list
Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev  
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          SDB3XXXXXXXXXXX03    Dell DC NVMe PE8010 RI U.2 3.84TB        1           3.84  TB /   3.84  TB    512   B +  0 B   1.2.0   
/dev/nvme1n1          SDB3XXXXXXXXXXX0K    Dell DC NVMe PE8010 RI U.2 3.84TB        1           3.84  TB /   3.84  TB    512   B +  0 B   1.2.0   
/dev/nvme2n1          SDB3XXXXXXXXXXX0J    Dell DC NVMe PE8010 RI U.2 3.84TB        1           3.84  TB /   3.84  TB    512   B +  0 B   1.2.0   
/dev/nvme3n1          SDB3XXXXXXXXXXX0N    Dell DC NVMe PE8010 RI U.2 3.84TB        1           3.84  TB /   3.84  TB    512   B +  0 B   1.2.0   

Both systems have an identical operating system installation:

# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy

# uname -a
Linux zhapcon0000-007f-00-00 6.8.0-48-generic #48~22.04.1 SMP PREEMPT_DYNAMIC Fri Oct 25 14:37:37 UTC  x86_64 x86_64 x86_64 GNU/Linux

# zfs version
zfs-2.2.6-8_g8786ac8453
zfs-kmod-2.2.6-8_g8786ac8453

This manifests most obviously as a missing 'devid' field in the zdb -C output for the whole-disk mirror in the zpool. The impact in our environment is that an automated disk replacement tool cannot match the failed or removed unit in the zpool with the replacement device, because we use the devid property as a key. I believe the following code is supposed to generate that property, and it already seems to have some special handling for multipath dm devices: https://github.com/openzfs/zfs/blob/baa50314567afd986a00838f0fa65fdacbd12daf/lib/libzutil/os/linux/zutil_import_os.c#L385

We have experienced other slight oddities with udev for this configuration that we have managed to work around, but we are a bit stuck here.
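
As a diagnostic aid, the following Python sketch lists which symlinks under /dev/disk/by-id resolve to a given device node. It rests on an assumption: that the devid is related to those udev-created links, which is suggested only by the fact that the devid values reported on the non-multipath system look like by-id names. The function name by_id_links and its by_id_dir parameter are hypothetical and are not part of OpenZFS.

```python
import os


def by_id_links(device, by_id_dir="/dev/disk/by-id"):
    """Return the sorted names of symlinks in by_id_dir that resolve to device.

    Hypothetical helper for illustration only; it inspects the filesystem
    and does not call into ZFS or udev.
    """
    if not os.path.isdir(by_id_dir):
        return []
    target = os.path.realpath(device)
    matches = []
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        # realpath follows the symlink chain to the final device node
        if os.path.realpath(link) == target:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # /dev/nvme0n1p1 is an example partition node; adjust for the system.
    for name in by_id_links("/dev/nvme0n1p1"):
        print(name)
```

Running this on both systems and comparing the results would show whether the multipath system exposes the same kind of nvme-* link for its partitions as the non-multipath one.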

The multipath NVMe system:

# zdb -C
diskconvm:
    version: 5000
    name: 'zpool'
    state: 0
    txg: 11692
    pool_guid: 6843043406111341747
    errata: 0
    hostid: 1633771873
    hostname: 'multipath'
    com.delphix:has_per_vdev_zaps
    vdev_children: 2
    vdev_tree:
        type: 'root'
        id: 0
        guid: 6843043406111341747
        create_txg: 4
        com.klarasystems:vdev_zap_root: 344
        children[0]:
            type: 'mirror'
            id: 0
            guid: 14146151563094489910
            metaslab_array: 132
            metaslab_shift: 29
            ashift: 12
            asize: 3834843234304
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 129
            children[0]:
                type: 'disk'
                id: 0
                guid: 15486488221411976041
                path: '/dev/disk/by-partuuid/9346790e-7c9f-4d0b-959b-816b0db8bd6e'
                whole_disk: 0
                create_txg: 4
                com.delphix:vdev_zap_leaf: 130
            children[1]:
                type: 'disk'
                id: 1
                guid: 2039583289570950783
                path: '/dev/disk/by-partuuid/6b81e35e-35e5-4223-a38a-a76178c38c89'
                whole_disk: 0
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
        children[1]:
            type: 'mirror'
            id: 1
            guid: 17107054174393118353
            metaslab_array: 518
            metaslab_shift: 34
            ashift: 12
            asize: 3840741474304
            is_log: 0
            create_txg: 9752
            com.delphix:vdev_zap_top: 238
            children[0]:
                type: 'disk'
                id: 0
                guid: 7664684068309311742
                path: '/dev/disk/by-partuuid/70e45513-4a10-764e-adc7-821ab2f0c0cc'
                whole_disk: 1
                create_txg: 9752
                com.delphix:vdev_zap_leaf: 239
            children[1]:
                type: 'disk'
                id: 1
                guid: 12386411936378184772
                path: '/dev/disk/by-partuuid/b319938d-c323-f844-abaa-89a80c6769e0'
                whole_disk: 1
                create_txg: 9752
                com.delphix:vdev_zap_leaf: 240
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.klarasystems:vdev_zaps_v2

The non-multipath NVMe system:

# zdb -C
diskconvm:
    version: 5000
    name: 'zpool'
    state: 0
    txg: 11825
    pool_guid: 459534711329581208
    errata: 0
    hostid: 1633771873
    hostname: 'nonmultipath'
    com.delphix:has_per_vdev_zaps
    vdev_children: 2
    vdev_tree:
        type: 'root'
        id: 0
        guid: 459534711329581208
        create_txg: 4
        com.klarasystems:vdev_zap_root: 286
        children[0]:
            type: 'mirror'
            id: 0
            guid: 14229864630524350197
            metaslab_array: 256
            metaslab_shift: 29
            ashift: 12
            asize: 3834843234304
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 129
            children[0]:
                type: 'disk'
                id: 0
                guid: 4931286175406302527
                path: '/dev/disk/by-partuuid/ee246422-fa3b-4427-9882-3f3d2ca54e15'
                whole_disk: 0
                create_txg: 4
                com.delphix:vdev_zap_leaf: 130
            children[1]:
                type: 'disk'
                id: 1
                guid: 10676830712763383564
                path: '/dev/disk/by-partuuid/7165ff99-fed2-4392-a999-e5cb695577ab'
                whole_disk: 0
                create_txg: 4
                com.delphix:vdev_zap_leaf: 131
        children[1]:
            type: 'mirror'
            id: 1
            guid: 347403132298270685
            metaslab_array: 569
            metaslab_shift: 34
            ashift: 12
            asize: 3840741474304
            is_log: 0
            create_txg: 9870
            com.delphix:vdev_zap_top: 561
            children[0]:
                type: 'disk'
                id: 0
                guid: 11674655218606386584
                path: '/dev/disk/by-partuuid/597ff055-b345-2f4b-88a6-6cbb85914a1d'
                devid: 'nvme-Dell_DC_NVMe_PE8010_RI_U.2_3.84TB_SDB3XXXXXXXXXXX0N-part1'
                phys_path: 'pci-0000:e6:00.0-nvme-1'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/161'
                whole_disk: 1
                create_txg: 9870
                com.delphix:vdev_zap_leaf: 564
            children[1]:
                type: 'disk'
                id: 1
                guid: 5209775201699158177
                path: '/dev/disk/by-partuuid/09d6cc9c-8a3b-e446-a637-7b2aaf7f7f2b'
                devid: 'nvme-Dell_DC_NVMe_PE8010_RI_U.2_3.84TB_SDB3XXXXXXXXXXX0J-part1'
                phys_path: 'pci-0000:e5:00.0-nvme-1'
                vdev_enc_sysfs_path: '/sys/bus/pci/slots/160'
                whole_disk: 1
                create_txg: 9870
                com.delphix:vdev_zap_leaf: 567
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
        com.klarasystems:vdev_zaps_v2
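
The difference between the two listings can be checked mechanically. The Python sketch below parses text in the `zdb -C` layout shown above and reports the 'disk' vdevs that have no devid entry. It is a minimal illustration that assumes the indented "key: value" layout seen in these listings; the function names leaf_vdevs and missing_devid are hypothetical, and this is not how our replacement tool or OpenZFS itself reads the configuration.

```python
import re

# Matches lines such as "guid: 123" or "com.delphix:vdev_zap_leaf: 130".
KEY_VALUE = re.compile(r"([\w.:]+): (.*)$")


def leaf_vdevs(zdb_text):
    """Return one dict of properties per 'disk' vdev in zdb -C style text."""
    leaves = []
    current = None
    for line in zdb_text.splitlines():
        stripped = line.strip()
        # Lines ending in ':' (children[0]:, vdev_tree:, ...) open a new block.
        if stripped.endswith(":"):
            current = None
            continue
        match = KEY_VALUE.match(stripped)
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip("'")
        if key == "type":
            # Only collect properties for leaf disks, not mirrors or the root.
            current = {"type": value} if value == "disk" else None
            if current is not None:
                leaves.append(current)
        elif current is not None:
            current[key] = value
    return leaves


def missing_devid(zdb_text):
    """Return the disk vdevs that carry no 'devid' property."""
    return [v for v in leaf_vdevs(zdb_text) if "devid" not in v]


if __name__ == "__main__":
    import subprocess
    # Assumes zdb is installed and the caller has sufficient privileges.
    text = subprocess.run(["zdb", "-C"], capture_output=True, text=True).stdout
    for vdev in missing_devid(text):
        print(vdev.get("guid"), vdev.get("path"))
```

Applied to the listings above, the whole_disk: 1 members on the multipath system have no devid, phys_path, or vdev_enc_sysfs_path, while the corresponding members on the non-multipath system have all three.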

Describe how to reproduce the problem

N/A

Include any warning/errors/backtraces from the system logs

N/A