openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Persistent device name for cache device lost across export/import #13170

Open nabijaczleweli opened 2 years ago

nabijaczleweli commented 2 years ago

System information

Type Version/Name
Distribution Name Debian
Distribution Version Bullseye
Kernel Version Linux tarta 5.10.0-11-amd64 #1 SMP Debian 5.10.92-1 (2022-01-18) x86_64 Linux
Architecture amd64
OpenZFS Version 2.1.2-1.1, 2.1.4-1.1

Describe the problem you're observing

After exporting and importing a pool whose cache device was added under a /dev/disk/by-*-style name, that name is lost and the bare /dev basename is used instead. zpool replace-ing it back works, until the next re-import.

The cache device here was added as filling-cache (a partlabel: /dev/disk/by-partlabel/filling-cache -> ../../nvme0n1p4). On import, I once again got:

$ zpool status filling
  pool: filling
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 03:48:24 with 0 errors on Tue Mar  1 10:40:24 2022
config:

        NAME                                   STATE     READ WRITE CKSUM
        filling                                ONLINE       0     0     0
          mirror-0                             ONLINE       0     0     0
            ata-HGST_HUS726T4TALE6L4_V6K2L4RR  ONLINE       0     0     0
            ata-HGST_HUS726T4TALE6L4_V6K2MHYR  ONLINE       0     0     0
          raidz1-1                             ONLINE       0     0     0
            ata-HGST_HUS728T8TALE6L4_VDKT237K  ONLINE       0     0     0
            ata-HGST_HUS728T8TALE6L4_VDGY075D  ONLINE       0     0     0
            ata-HGST_HUS728T8TALE6L4_VDKVRRJK  ONLINE       0     0     0
        cache
          nvme0n1p4                            ONLINE       0     0     0

errors: No known data errors

$ zpool list -v filling
NAME                                    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
filling                                25.5T  6.53T  18.9T        -       64M     1%    25%  1.00x    ONLINE  -
  mirror                               3.64T   444G  3.20T        -       64M     9%  11.9%      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2L4RR      -      -      -        -       64M      -      -      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2MHYR      -      -      -        -       64M      -      -      -    ONLINE
  raidz1                               21.8T  6.10T  15.7T        -         -     0%  27.9%      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKT237K      -      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDGY075D      -      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKVRRJK      -      -      -        -         -      -      -      -    ONLINE
cache                                      -      -      -        -         -      -      -      -  -
  nvme0n1p4                            63.0G  62.9G   104M        -         -     0%  99.8%      -    ONLINE

What's worse is that zpool replace filling nvme0n1p4 filling-cache errors with

invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-partlabel/filling-cache is part of unknown pool 'filling'

Forcing the matter yields

cannot open '/dev/disk/by-partlabel/filling-cache': Device or resource busy
cannot replace nvme0n1p4 with filling-cache: device is in use as a cache

The disks are, well, disks. The cache device is part of

Disk /dev/nvme0n1: 119.24 GiB, 128035676160 bytes, 250069680 sectors
Disk model: SAMSUNG MZVLW128HEGR-000L2              
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 672C9FEA-8D24-3541-B93A-730CE7AE75EF
First LBA: 2048
Last LBA: 250069646
Alternative LBA: 250069679
Partition entries LBA: 2
Allocated partition entries: 128

Device             Start       End   Sectors Type-UUID                            UUID                                 Name                 Attrs
/dev/nvme0n1p1      2048      4095      2048 21686148-6449-6E6F-744E-656564454649 EB475644-9671-CE4A-AA98-F9F9C73684EF tarta-BIOS-boot-nvme 
/dev/nvme0n1p2      4096    528383    524288 BC13C2FF-59E6-4262-A352-B275FD6F7172 B0175EB8-2B72-5148-8422-1CF275429023 tarta-boot-nvme      
/dev/nvme0n1p3    528384 117968895 117440512 6A898CC3-1DD2-11B2-99A6-080020736631 40D8A22A-C611-9E43-B023-ABB5BE43F5EE tarta-zoot-nvme      
/dev/nvme0n1p4 117968896 250069646 132100751 6A898CC3-1DD2-11B2-99A6-080020736631 5C3A3952-F636-4147-91BE-04F26C6E6948 filling-cache        
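
Presumably zpool replace balks because filling-cache and nvme0n1p4 are the same partition under two different paths, so the "new" device is already the in-use cache vdev. If so, the only obvious way to get the by-partlabel path back is to drop the cache vdev and add it again, roughly like this (a sketch, not actually tried at this point):

# zpool remove filling nvme0n1p4
# zpool add filling cache /dev/disk/by-partlabel/filling-cache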

Describe how to reproduce the problem

Dunno; I've tried a few times and never got it to happen, except on that pool. Last time I removed the cache and added it again under the filling-cache name, and on this import it reverted again.

This isn't a race: the device, partitions, and names already exist in the initrd, while the module for the data disks is only loaded in the real root. Plus, the import ran from 20:28:19 to 20:41:45, and the import unit waits for udev settle, so timing shouldn't be a factor.
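
An untested idea (the directory list here is just a guess): steer the path search at import time so the cache vdev resolves under by-partlabel rather than /dev, either with -d or via ZPOOL_IMPORT_PATH:

# zpool export filling
# zpool import -d /dev/disk/by-partlabel -d /dev/disk/by-id filling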

nabijaczleweli commented 2 years ago

In the interest of science: I've removed the cache device and rebooted for an unrelated reason today, then added it like this:

$ l /dev/disk/by-partlabel/filling-cache
lrwxrwxrwx 1 root root 15 Mar 13 00:20 /dev/disk/by-partlabel/filling-cache -> ../../nvme0n1p4
# zpool add -nP filling cache filling-cache
would update 'filling' to the following configuration:

        filling
          mirror-0
            /dev/disk/by-id/ata-HGST_HUS726T4TALE6L4_V6K2L4RR-part1
            /dev/disk/by-id/ata-HGST_HUS726T4TALE6L4_V6K2MHYR-part1
          raidz1-1
            /dev/disk/by-id/ata-HGST_HUS728T8TALE6L4_VDKT237K-part1
            /dev/disk/by-id/ata-HGST_HUS728T8TALE6L4_VDGY075D-part1
            /dev/disk/by-id/ata-HGST_HUS728T8TALE6L4_VDKVRRJK-part1
        cache
          /dev/disk/by-partlabel/filling-cache
# zpool add filling cache filling-cache
$ zpool list -v filling
NAME                                    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
filling                                25.5T  6.58T  18.9T        -       64M     1%    25%  1.00x    ONLINE  -
  mirror                               3.64T   460G  3.19T        -       64M    10%  12.3%      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2L4RR      -      -      -        -       64M      -      -      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2MHYR      -      -      -        -       64M      -      -      -    ONLINE
  raidz1                               21.8T  6.13T  15.7T        -         -     0%  28.1%      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKT237K      -      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDGY075D      -      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKVRRJK      -      -      -        -         -      -      -      -    ONLINE
cache                                      -      -      -        -         -      -      -      -  -
  filling-cache                        63.0G   450M  62.5G        -         -     0%  0.69%      -    ONLINE
$ zpool list -vP filling
NAME                                                          SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
filling                                                      25.5T  6.58T  18.9T        -       64M     1%    25%  1.00x    ONLINE  -
  mirror                                                     3.64T   460G  3.19T        -       64M    10%  12.3%      -    ONLINE
    /dev/disk/by-id/ata-HGST_HUS726T4TALE6L4_V6K2L4RR-part1      -      -      -        -       64M      -      -      -    ONLINE
    /dev/disk/by-id/ata-HGST_HUS726T4TALE6L4_V6K2MHYR-part1      -      -      -        -       64M      -      -      -    ONLINE
  raidz1                                                     21.8T  6.13T  15.7T        -         -     0%  28.1%      -    ONLINE
    /dev/disk/by-id/ata-HGST_HUS728T8TALE6L4_VDKT237K-part1      -      -      -        -         -      -      -      -    ONLINE
    /dev/disk/by-id/ata-HGST_HUS728T8TALE6L4_VDGY075D-part1      -      -      -        -         -      -      -      -    ONLINE
    /dev/disk/by-id/ata-HGST_HUS728T8TALE6L4_VDKVRRJK-part1      -      -      -        -         -      -      -      -    ONLINE
cache                                                            -      -      -        -         -      -      -      -  -
  /dev/disk/by-partlabel/filling-cache                       63.0G   474M  62.5G        -         -     0%  0.73%      -    ONLINE

I'll update this when I next reboot if I remember.

nabijaczleweli commented 2 years ago

Happened again; I've updated to 2.1.4 in the meantime:

nabijaczleweli@tarta:~$ zpool list -v filling
NAME                                    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
filling                                25.5T  6.81T  18.7T        -       64M     4%    26%  1.00x    ONLINE  -
  mirror-0                             3.64T   495G  3.16T        -       64M    16%  13.3%      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2L4RR      -      -      -        -       64M      -      -      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2MHYR      -      -      -        -       64M      -      -      -    ONLINE
  raidz1-1                             21.8T  6.33T  15.5T        -         -     2%  29.0%      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKT237K      -      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDGY075D      -      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKVRRJK      -      -      -        -         -      -      -      -    ONLINE
cache                                      -      -      -        -         -      -      -      -  -
  nvme0n1p4                            63.0G  39.3G  23.7G        -         -     0%  62.4%      -    ONLINE

stale[bot] commented 1 year ago

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

nabijaczleweli commented 1 year ago

Last import was 2023-01-26.07:19:16 zpool import -aN -o cachefile=none:

$ zpool list -v filling
NAME                                    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
filling                                25.5T  9.21T  16.3T        -       64M     8%    36%  1.00x    ONLINE  -
  mirror-0                             3.64T   899G  2.76T        -       64M    26%  24.1%      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2L4RR  3.64T      -      -        -       64M      -      -      -    ONLINE
    ata-HGST_HUS726T4TALE6L4_V6K2MHYR  3.64T      -      -        -       64M      -      -      -    ONLINE
  raidz1-1                             21.8T  8.33T  13.5T        -         -     6%  38.2%      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKT237K  7.28T      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDGY075D  7.28T      -      -        -         -      -      -      -    ONLINE
    ata-HGST_HUS728T8TALE6L4_VDKVRRJK  7.28T      -      -        -         -      -      -      -    ONLINE
cache                                      -      -      -        -         -      -      -      -  -
  nvme0n1p4                            63.0G  59.7G  3.25G        -         -     0%  94.8%      -    ONLINE

Replacement still repros. Haven't tried the remove-and-re-add cycle; I'll do that when I next reboot, if I remember.

$ zfs version
zfs-2.1.9-1~bpo11+1
zfs-kmod-2.1.7-1
putnam commented 1 year ago

Experienced the same, including the problems replacing the device. I'm on 2.1.12. My device name got swapped at some random point back in late 2022, but the pool kept working until the block device name itself changed after a kernel upgrade.

Not only could I not replace it, just as OP...

# zpool replace tank nvme0n1 "/dev/disk/by-id/nvme-Samsung_SSD_960_PRO_512GB_XXX"
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-id/nvme-Samsung_SSD_960_PRO_512GB_XXX-part1 is part of unknown pool 'tank'

Here the "invalid vdev specification" error is nonsensical: it makes you think you need to specify a vdev, which you don't, or you'll get 'too many arguments', like this:

# zpool replace tank cache nvme0n1 "/dev/disk/by-id/nvme-Samsung_SSD_960_PRO_512GB_XXX"
too many arguments
usage:
        replace [-fsw] [-o property=value] <pool> <device> [new-device]

# zpool replace tank -f nvme0n1 "/dev/disk/by-id/nvme-Samsung_SSD_960_PRO_512GB_XXX"
cannot replace nvme0n1 with /dev/disk/by-id/nvme-Samsung_SSD_960_PRO_512GB_XXX: device is in use as a cache

... but attempting to remove it threw an error despite actually removing it:

# zpool status
--- the cache device is present, as 'nvme0n1'

# zpool remove tank nvme0n1
cannot remove nvme0n1: no such device in pool

# zpool status
# --- the cache device is now gone, it's just the vdevs

I was then able to add it afterward as usual, and for now it shows as being imported via by-id.
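
For completeness, the re-add that worked was presumably just the plain zpool add by the by-id path (same redacted id as above), something like:

# zpool add tank cache /dev/disk/by-id/nvme-Samsung_SSD_960_PRO_512GB_XXX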

Seems like multiple bugs all over the place.