openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Writing to a pool with a high fragmented free space shows write errors #12647

Closed fcrg closed 1 year ago

fcrg commented 2 years ago

System information

Type                  Version/Name
Distribution Name     gentoo
Distribution Version  2.4.1
Linux Kernel          4.7.10
Architecture          amd64
OpenZFS Version       2.1.1

Describe the problem you're observing

I am using a pool whose free space is highly fragmented.

# zpool list
NAME                                       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  2.70T  1.53T  1.18T        -         -    77%    56%  1.00x    ONLINE  -

The pool consists of twelve 250 GB SSDs.

  pool: zfs-235cd8b3-835a-4216-a38b-b52c5c566f55
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 1.14G in 00:00:17 with 0 errors on Tue Oct 12 14:53:08 2021
config:

    NAME                                      STATE     READ WRITE CKSUM
    zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  ONLINE       0     0     0
      raidz3-0                                ONLINE       0     0     0
        zfs-0x5002538d4282b4e3                ONLINE       0     0     0
        zfs-0x5002538d4282b4e2                ONLINE       0     0     0
        zfs-0x5002538d4282b4e0                ONLINE       0     0     0
        zfs-0x5002538d4282b4df                ONLINE       0     0     0
        zfs-0x5002538d4282b4d3                ONLINE       0     0     0
        zfs-0x5002538d4282b4d4                ONLINE       0     0     0
        zfs-0x5002538d4282b4d1                ONLINE       0     0     0
        zfs-0x5002538d4282b4d2                ONLINE       0     0     0
        zfs-0x5002538d4280bd98                ONLINE       0     0     0
        zfs-0x5002538d423c659f                ONLINE       0     0     0
        zfs-0x5002538d4280c3ec                ONLINE       0     0     0
        zfs-0x5002538d423c657b                ONLINE       0     0     0

errors: No known data errors

This pool was created with version 0.8.2. Under 0.8.2 the pool shows low write performance due to high CPU load during metaslab allocation (metaslab_load), but all disks work fine and no write errors are visible.

Fragmentation:

zdb -M zfs-235cd8b3-835a-4216-a38b-b52c5c566f55
    pool zfs-235cd8b3-835a-4216-a38b-b52c5c566f55   fragmentation    78%
             11: 8659773 ****************
             12: 16373067 ******************************
             13: 22102772 ****************************************
             14: 17029560 *******************************
             15: 10903563 ********************
             16: 521022 *
             17:  27324 *
             18:      1 *

Sample space map object. Note that "zdb -mmm zfs-235cd8b3-835a-4216-a38b-b52c5c566f55" is several times slower than with version 0.8.2, while "zdb -mm zfs-235cd8b3-835a-4216-a38b-b52c5c566f55" is fast.

zdb -mmm zfs-235cd8b3-835a-4216-a38b-b52c5c566f55

space map object 222:
  smp_length = 0x5698c0
  smp_alloc = 0x23b8ba000
    metaslab      1   offset    400000000   spacemap      4   free    6.98G
                      segments     166687   maxsize    202K   freepct   43%
    In-memory histogram:
             11:  39689 **************
             12:  86763 ******************************
             13: 117519 ****************************************
             14:  97159 **********************************
             15:  57320 ********************
             16:  11649 ****
             17:    559 *
    On-disk histogram:      fragmentation 78
             11:  41162 **************
             12:  89268 ******************************
             13: 121123 ****************************************
             14: 101229 **********************************
             15:  71358 ************************
             16:   1135 *
             17:     10 *
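
To make the histogram easier to read, here is a minimal sketch that turns the bucket counts into a rough free-space estimate. It assumes (my reading of the zdb output, not something stated in this thread) that bucket N counts free segments with a size of at least 2^N bytes and less than 2^(N+1) bytes. The function names are made up for illustration; the counts are copied from the in-memory histogram of metaslab 1 above.

```python
# Bucket counts copied from the "In-memory histogram" of metaslab 1.
HISTOGRAM = {11: 39689, 12: 86763, 13: 117519, 14: 97159,
             15: 57320, 16: 11649, 17: 559}

GIB = 1 << 30


def free_space_bounds(histogram):
    """Return (lower, upper) free-space bounds in bytes.

    Assumption: bucket n holds segments with 2**n <= size < 2**(n + 1).
    """
    lower = sum(count << n for n, count in histogram.items())
    upper = sum(count * ((1 << (n + 1)) - 1) for n, count in histogram.items())
    return lower, upper


def share_below(histogram, bucket_limit):
    """Fraction of free segments that fall into buckets below bucket_limit."""
    total = sum(histogram.values())
    small = sum(count for n, count in histogram.items() if n < bucket_limit)
    return small / total


lower, upper = free_space_bounds(HISTOGRAM)
print(f"free space is between {lower / GIB:.2f} and {upper / GIB:.2f} GiB")
print(f"{share_below(HISTOGRAM, 16):.1%} of free segments are below 64 KiB")
```

Under that assumption the reported 6.98G of free space falls between the two bounds, and nearly all free segments are smaller than 64 KiB, which is what the high FRAG value reflects.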

With version 2.1.1, writing to this pool now produces write errors. Other pools on the same setup work fine.

Describe how to reproduce the problem

Write data with the fio tool:

fio --name=/opt/zfs_mount/235cd8b3-835a-4216-a38b-b52c5c566f55/voldata/tmp/write12 --rw=write --direct=0 --ioengine=libaio --bs=64k --numjobs=2 --size=100G --runtime=600 --group_reporting

Include any warning/errors/backtraces from the system logs

The kernel log shows:

2021-10-15T11:55:56.767372+02:00 controller-21 kernel: [  372.348892] blk_update_request: I/O error, dev sdf, sector 335937206
2021-10-15T11:55:56.767372+02:00 controller-21 kernel: [  372.348896] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4d3 error=5 type=2 offset=170921749504 size=9216 flags=40080c80
2021-10-15T11:56:48.734371+02:00 controller-21 kernel: [  424.310414] blk_update_request: I/O error, dev sdd, sector 335967046
2021-10-15T11:56:48.734371+02:00 controller-21 kernel: [  424.310418] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4e0 error=5 type=2 offset=170937027584 size=6656 flags=40080c80
2021-10-15T11:57:27.710369+02:00 controller-21 kernel: [  463.282832] blk_update_request: I/O error, dev sdf, sector 183934405
2021-10-15T11:57:27.710369+02:00 controller-21 kernel: [  463.282837] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4d3 error=5 type=2 offset=93096315392 size=8192 flags=40080c80
2021-10-15T11:59:22.720958+02:00 controller-21 kernel: [  578.280205] blk_update_request: I/O error, dev sdm, sector 249524636
2021-10-15T11:59:22.720958+02:00 controller-21 kernel: [  578.280210] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d423c657b error=5 type=2 offset=126678513664 size=8192 flags=40080c80
2021-10-15T12:00:02.654389+02:00 controller-21 kernel: [  618.212548] blk_update_request: I/O error, dev sdg, sector 336037547
2021-10-15T12:00:02.654389+02:00 controller-21 kernel: [  618.212552] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4d4 error=5 type=2 offset=170973124096 size=8192 flags=40080c80
2021-10-15T12:00:39.710381+02:00 controller-21 kernel: [  655.265140] blk_update_request: I/O error, dev sdi, sector 184003452
2021-10-15T12:00:39.710381+02:00 controller-21 kernel: [  655.265145] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4d2 error=5 type=2 offset=93131667456 size=11776 flags=40080c80
2021-10-15T12:01:18.686373+02:00 controller-21 kernel: [  694.237890] blk_update_request: I/O error, dev sdi, sector 336051191
2021-10-15T12:01:18.686373+02:00 controller-21 kernel: [  694.237894] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4d2 error=5 type=2 offset=170980109824 size=8192 flags=40080c80
2021-10-15T12:02:06.687818+02:00 controller-21 kernel: [  742.233127] blk_update_request: I/O error, dev sde, sector 184031306
2021-10-15T12:02:06.687818+02:00 controller-21 kernel: [  742.233131] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4df error=5 type=2 offset=93145928704 size=8704 flags=40080c80
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249653] blk_update_request: I/O error, dev sdl, sector 184058234
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249657] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4280c3ec error=5 type=2 offset=93159715840 size=11264 flags=40080c80
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229807] blk_update_request: I/O error, dev sdc, sector 336111453
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229812] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4e2 error=5 type=2 offset=171010963968 size=8704 flags=40080c80
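
A small sketch for tallying these zio errors per vdev, in case that helps when comparing the log against the WRITE column of zpool status. The regular expression only relies on the field layout visible in the lines above; the function name is made up, and the sample lines are shortened to the part starting at "zio pool=". error=5 is EIO on Linux; I am assuming type=2 denotes a write, which matches the write errors reported here.

```python
import re
from collections import Counter

# Matches the "zio pool=... vdev=... error=... type=... offset=... size=..."
# lines as they appear in the kernel log above.
ZIO_RE = re.compile(
    r"zio pool=(?P<pool>\S+) vdev=(?P<vdev>\S+) error=(?P<error>\d+) "
    r"type=(?P<type>\d+) offset=(?P<offset>\d+) size=(?P<size>\d+)"
)


def tally_zio_errors(lines):
    """Count zio error lines per vdev, keyed by the last path component."""
    counts = Counter()
    for line in lines:
        match = ZIO_RE.search(line)
        if match:
            vdev_name = match.group("vdev").rsplit("/", 1)[-1]
            counts[vdev_name] += 1
    return counts


# Three lines from the log above, with the timestamp prefix trimmed.
SAMPLE = [
    "zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4d3 error=5 type=2 offset=170921749504 size=9216 flags=40080c80",
    "zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4e0 error=5 type=2 offset=170937027584 size=6656 flags=40080c80",
    "zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4d3 error=5 type=2 offset=93096315392 size=8192 flags=40080c80",
]

for vdev, count in tally_zio_errors(SAMPLE).most_common():
    print(vdev, count)
```

For a full log, something like tally_zio_errors(open(path)) should work the same way, since the function only iterates over lines.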

2021-10-15T12:03:27.242384+02:00 controller-21 kernel: [  822.781578] sd 14:0:0:0: [sdl] tag#3 abort scheduled
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791567] sd 14:0:0:0: [sdl] tag#3 aborting command
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791570] sd 14:0:0:0: [sdl] tag#3 cmd abort failed
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791576] scsi host14: scsi_eh_14: waking up 0/1/1
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791585] ata15.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791587] ata15.00: failed command: WRITE DMA
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791591] ata15.00: cmd ca/00:16:7a:81:f8/00:00:00:00:00/ea tag 3 dma 11264 out
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791591]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791593] ata15.00: status: { DRDY }
2021-10-15T12:03:27.252367+02:00 controller-21 kernel: [  822.791597] ata15: hard resetting link
2021-10-15T12:03:27.708381+02:00 controller-21 kernel: [  823.247532] ata15: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2021-10-15T12:03:27.708381+02:00 controller-21 kernel: [  823.247738] ata15.00: supports DRM functions and may not be fully accessible
2021-10-15T12:03:27.708381+02:00 controller-21 kernel: [  823.248269] ata15.00: disabling queued TRIM support
2021-10-15T12:03:27.709366+02:00 controller-21 kernel: [  823.248822] ata15.00: supports DRM functions and may not be fully accessible
2021-10-15T12:03:27.709366+02:00 controller-21 kernel: [  823.249305] ata15.00: disabling queued TRIM support
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249633] ata15.00: configured for UDMA/133
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249637] ata15.00: device reported invalid CHS sector 0
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249641] sd 14:0:0:0: [sdl] tag#3 scsi_eh_14: flush finish cmd
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249646] sd 14:0:0:0: [sdl] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249648] sd 14:0:0:0: [sdl] tag#3 Sense Key : Illegal Request [current]
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249649] sd 14:0:0:0: [sdl] tag#3 Add. Sense: Unaligned write command
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249652] sd 14:0:0:0: [sdl] tag#3 CDB: Write(10) 2a 00 0a f8 81 7a 00 00 16 00
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249653] blk_update_request: I/O error, dev sdl, sector 184058234
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249657] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4280c3ec error=5 type=2 offset=93159715840 size=11264 flags=40080c80
2021-10-15T12:03:27.710380+02:00 controller-21 kernel: [  823.249664] ata15: EH complete
2021-10-15T12:03:27.710474+02:00 controller-21 kernel: [  823.249666] scsi host14: waking up host to restart
2021-10-15T12:03:27.710474+02:00 controller-21 kernel: [  823.249671] scsi host14: scsi_eh_14: sleeping
2021-10-15T12:04:09.226380+02:00 controller-21 kernel: [  864.761706] sd 17:0:0:0: [sdc] tag#18 abort scheduled
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771700] sd 17:0:0:0: [sdc] tag#18 aborting command
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771703] sd 17:0:0:0: [sdc] tag#18 cmd abort failed
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771708] scsi host17: scsi_eh_17: waking up 0/1/1
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771717] ata18.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771720] ata18.00: failed command: WRITE DMA EXT
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771724] ata18.00: cmd 35/00:11:5d:a7:08/00:00:14:00:00/e0 tag 18 dma 8704 out
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771724]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771726] ata18.00: status: { DRDY }
2021-10-15T12:04:09.236366+02:00 controller-21 kernel: [  864.771729] ata18: hard resetting link
2021-10-15T12:04:09.692387+02:00 controller-21 kernel: [  865.227673] ata18: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2021-10-15T12:04:09.692387+02:00 controller-21 kernel: [  865.227881] ata18.00: supports DRM functions and may not be fully accessible
2021-10-15T12:04:09.692387+02:00 controller-21 kernel: [  865.228411] ata18.00: disabling queued TRIM support
2021-10-15T12:04:09.693367+02:00 controller-21 kernel: [  865.228969] ata18.00: supports DRM functions and may not be fully accessible
2021-10-15T12:04:09.693367+02:00 controller-21 kernel: [  865.229453] ata18.00: disabling queued TRIM support
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229787] ata18.00: configured for UDMA/133
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229790] ata18.00: device reported invalid CHS sector 0
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229794] sd 17:0:0:0: [sdc] tag#18 scsi_eh_17: flush finish cmd
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229800] sd 17:0:0:0: [sdc] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229802] sd 17:0:0:0: [sdc] tag#18 Sense Key : Illegal Request [current]
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229803] sd 17:0:0:0: [sdc] tag#18 Add. Sense: Unaligned write command
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229805] sd 17:0:0:0: [sdc] tag#18 CDB: Write(10) 2a 00 14 08 a7 5d 00 00 11 00
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229807] blk_update_request: I/O error, dev sdc, sector 336111453
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229812] zio pool=zfs-235cd8b3-835a-4216-a38b-b52c5c566f55 vdev=/opt/fast/dev/bricks/e5082739-2cc0-42ca-837a-e9aedf739b58/zfs-0x5002538d4282b4e2 error=5 type=2 offset=171010963968 size=8704 flags=40080c80
2021-10-15T12:04:09.694372+02:00 controller-21 kernel: [  865.229820] ata18: EH complete
2021-10-15T12:04:09.694452+02:00 controller-21 kernel: [  865.229822] scsi host17: waking up host to restart
2021-10-15T12:04:09.694452+02:00 controller-21 kernel: [  865.229828] scsi host17: scsi_eh_17: sleeping
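
To cross-check the failed commands against the blk_update_request lines, the Write(10) CDBs above can be decoded by hand. This is a minimal sketch based on the standard SCSI WRITE(10) layout (opcode 0x2A, bytes 2-5 logical block address, bytes 7-8 transfer length, all big-endian). The function name is made up, and a 512-byte logical sector size is my assumption.

```python
SECTOR_SIZE = 512  # assumed logical sector size in bytes


def decode_write10(cdb_hex):
    """Decode a WRITE(10) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks to transfer.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: LBA
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return lba, blocks


# The two CDBs from the sdl and sdc log excerpts above.
for cdb_hex in ("2a 00 0a f8 81 7a 00 00 16 00",
                "2a 00 14 08 a7 5d 00 00 11 00"):
    lba, blocks = decode_write10(cdb_hex)
    print(f"lba={lba} blocks={blocks} bytes={blocks * SECTOR_SIZE}")
```

The decoded block addresses are the same sector numbers that appear in the blk_update_request lines for sdl and sdc, and with 512-byte sectors the lengths match the "dma 11264 out" and "dma 8704 out" byte counts.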

This results in the following pool status:

  pool: zfs-235cd8b3-835a-4216-a38b-b52c5c566f55
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 1.14G in 00:00:17 with 0 errors on Tue Oct 12 14:53:08 2021
config:

    NAME                                      STATE     READ WRITE CKSUM
    zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  ONLINE       0     0     0
      raidz3-0                                ONLINE       0     0     0
        zfs-0x5002538d4282b4e3                ONLINE       0     0     0
        zfs-0x5002538d4282b4e2                ONLINE       0    18     0
        zfs-0x5002538d4282b4e0                ONLINE       0    14     0
        zfs-0x5002538d4282b4df                ONLINE       0    18     0
        zfs-0x5002538d4282b4d3                ONLINE       0    36     0
        zfs-0x5002538d4282b4d4                ONLINE       0    17     0
        zfs-0x5002538d4282b4d1                ONLINE       0     0     0
        zfs-0x5002538d4282b4d2                ONLINE       0    41     0
        zfs-0x5002538d4280bd98                ONLINE       0     0     0
        zfs-0x5002538d423c659f                ONLINE       0     0     0
        zfs-0x5002538d4280c3ec                ONLINE       0    23     0
        zfs-0x5002538d423c657b                ONLINE       0    17     0

errors: No known data errors

rincebrain commented 2 years ago

That really seems like the disk is timing out underneath ZFS and resetting, not necessarily ZFS doing something mad, and I don't know that it has anything to do with the FRAG levels? What models are the disks?

fcrg commented 2 years ago

The disks: Samsung SSD 850 EVO 250GB

One issue caused by the FRAG levels is visible with 2.1.1: "zdb -mmm zfs-235cd8b3-835a-4216-a38b-b52c5c566f55" is several times slower than with version 0.8.2 while calling metaslab_load.

rincebrain commented 2 years ago

Do you have autotrim on, perchance? I understand the Samsung 8xx SSDs do not play nice in the sandbox if you issue them TRIMs sometimes, which is why Linux has quirks for them to try and avoid those problems.

4.7 is quite, quite old though, it's not impossible it doesn't know that...

No, that seems to be in 4.7.x and dated to 4.1-rc4. Hm.

There have been weird interactions before with the 8xx EVO drives spitting up DMA errors when you ask them to TRIM on some controllers in some ways, though.

What does zpool get all [pool] say?

That's curious - I believe 2.1 added things to try and limit how much digging through metaslabs it tries to do for fitting allocations, which should have the opposite effect...I stand by suggesting you stick your nose in perf, maybe with a FlameGraph, and see where it's spending time, possibly on 0.8 versus 2.1.

fcrg commented 2 years ago

Autotrim is on.

I will retest the pool with a newer linux version.

zpool get all

zpool get all zfs-235cd8b3-835a-4216-a38b-b52c5c566f55
NAME                                      PROPERTY                       VALUE                          SOURCE
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  size                           2.70T                          -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  capacity                       57%                            -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  altroot                        -                              default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  health                         ONLINE                         -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  guid                           4928019990347038273            -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  version                        -                              default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  bootfs                         -                              default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  delegation                     on                             default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  autoreplace                    off                            default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  cachefile                      -                              default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  failmode                       continue                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  listsnapshots                  off                            default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  autoexpand                     off                            default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  dedupratio                     1.00x                          -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  free                           1.14T                          -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  allocated                      1.56T                          -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  readonly                       off                            -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  ashift                         9                              local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  comment                        -                              default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  expandsize                     -                              -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  freeing                        0                              -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  fragmentation                  78%                            -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  leaked                         0                              -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  multihost                      off                            default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  checkpoint                     -                              -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  load_guid                      4585150976694631634            -
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  autotrim                       on                             local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  compatibility                  off                            default
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@async_destroy          enabled                        local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@empty_bpobj            active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@lz4_compress           active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@multi_vdev_crash_dump  disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@spacemap_histogram     active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@enabled_txg            active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@hole_birth             active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@extensible_dataset     active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@embedded_data          active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@bookmarks              enabled                        local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@filesystem_limits      enabled                        local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@large_blocks           active                         local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@large_dnode            enabled                        local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@sha512                 disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@skein                  disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@edonr                  disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@userobj_accounting     disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@encryption             disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@project_quota          disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@device_removal         disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@obsolete_counts        disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@zpool_checkpoint       disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@spacemap_v2            disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@allocation_classes     disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@resilver_defer         disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@bookmark_v2            disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@redaction_bookmarks    disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@redacted_datasets      disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@bookmark_written       disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@log_spacemap           disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@livelist               disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@device_rebuild         disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@zstd_compress          disabled                       local
zfs-235cd8b3-835a-4216-a38b-b52c5c566f55  feature@draid                  disabled                       local

Yes, 2.1 should improve the metaslab issue with https://www.illumos.org/issues/11971

The output of perf top, which gives an indication of where the CPU spends its time during zdb -mmm <pool>:

    42.22%  libzpool.so.5.0.0     [.] zfs_btree_verify_counts_helper
    15.66%  libzpool.so.5.0.0     [.] zfs_btree_verify_height_helper
    11.73%  libzpool.so.5.0.0     [.] zfs_btree_verify_pointers_helper
     0.34%  libzpool.so.5.0.0     [.] zfs_btree_find_in_buf.isra.8
     0.34%  libpthread-2.26.so    [.] __pthread_mutex_lock
     0.31%  libzpool.so.5.0.0     [.] range_tree_seg32_compare
     0.28%  libzpool.so.5.0.0     [.] range_tree_add_impl
     0.24%  libruby23.so.2.3.0    [.] 0x000000000008ac37
     0.24%  libpthread-2.26.so    [.] __pthread_mutex_unlock_usercnt
 74.43%     0.08%  libzpool.so.5.0.0  [.] zfs_btree_verify
            |
             --13.88%--zfs_btree_verify
                       |
                       |--16.76%--zfs_btree_verify_counts_helper
                       |          |
                       |           --42.25%--zfs_btree_verify_counts_helper
                       |                     |
                       |                      --31.50%--zfs_btree_verify_counts_helper
                       |
                       |--13.06%--zfs_btree_verify_pointers_helper
                       |          |
                       |           --12.76%--zfs_btree_verify_pointers_helper
                       |
                        --8.98%--zfs_btree_verify_height_helper
                                  |
                                   --18.47%--zfs_btree_verify_height_helper
                                             |
                                              --11.54%--zfs_btree_verify_height_helper

    69.52%     0.04%  libzpool.so.5.0.0  [.] zfs_btree_add_idx
            |
             --9.92%--zfs_btree_add_idx
                       |
                        --10.20%--zfs_btree_verify
                                  |
                                  |--14.50%--zfs_btree_verify_counts_helper
                                  |          |
                                  |           --39.38%--zfs_btree_verify_counts_helper
                                  |                     |
                                  |                      --29.32%--zfs_btree_verify_counts_helper
                                  |
                                  |--12.15%--zfs_btree_verify_pointers_helper
                                  |          |
                                  |           --11.86%--zfs_btree_verify_pointers_helper
                                  |
                                   --7.86%--zfs_btree_verify_height_helper
                                             |
                                              --17.14%--zfs_btree_verify_height_helper
                                                        |
                                                         --10.67%--zfs_btree_verify_height_helper

    66.01%     0.06%  libzpool.so.5.0.0  [.] space_map_iterate
            |
             --7.36%--space_map_iterate
                       |
                        --7.51%--space_map_load_callback
                                  |
                                  |--5.04%--range_tree_add_impl
                                  |          |
                                  |          |--3.53%--zfs_btree_add_idx
                                  |          |          |
                                  |          |           --3.80%--zfs_btree_verify
                                  |          |                     |
                                  |          |                     |--10.65%--zfs_btree_verify_pointers_helper
                                  |          |                     |          |
                                  |          |                     |           --10.47%--zfs_btree_verify_pointers_helper
                                  |          |                     |
                                  |          |                     |--9.70%--zfs_btree_verify_counts_helper
                                  |          |                     |          |
                                  |          |                     |           --33.14%--zfs_btree_verify_counts_helper
                                  |          |                     |                     |
                                  |          |                     |                      --24.85%--zfs_btree_verify_counts_helper
                                  |          |                     |
                                  |          |                      --6.34%--zfs_btree_verify_height_helper
                                  |          |                                |
                                  |          |                                 --15.27%--zfs_btree_verify_height_helper
                                  |          |                                           |
                                  |          |                                            --9.67%--zfs_btree_verify_height_helper
                                  |          |
                                  |           --1.28%--zfs_btree_verify
                                  |
                                   --2.60%--range_tree_remove_impl
                                             |
                                              --2.78%--zfs_btree_verify
                                                        |
                                                        |--2.05%--zfs_btree_verify_counts_helper
                                                        |          |
                                                        |           --2.05%--zfs_btree_verify_counts_helper
                                                        |                     |
                                                        |                      --1.56%--zfs_btree_verify_counts_helper
                                                        |
                                                         --0.95%--zfs_btree_verify_height_helper
                                                                   |
                                                                    --0.95%--zfs_btree_verify_height_helper

    65.40%     0.02%  libzpool.so.5.0.0  [.] space_map_load_callback
            |
             --7.50%--space_map_load_callback
                       |
                       |--5.04%--range_tree_add_impl
                       |          |
                       |          |--3.53%--zfs_btree_add_idx
                       |          |          |
                       |          |           --3.79%--zfs_btree_verify
                       |          |                     |
                       |          |                     |--10.61%--zfs_btree_verify_pointers_helper
                       |          |                     |          |
                       |          |                     |           --10.43%--zfs_btree_verify_pointers_helper
                       |          |                     |
                       |          |                     |--9.66%--zfs_btree_verify_counts_helper
                       |          |                     |          |
                       |          |                     |           --32.99%--zfs_btree_verify_counts_helper
                       |          |                     |                     |
                       |          |                     |                      --24.75%--zfs_btree_verify_counts_helper
                       |          |                     |
                       |          |                      --6.31%--zfs_btree_verify_height_helper
                       |          |                                |
                       |          |                                 --15.22%--zfs_btree_verify_height_helper
                       |          |                                           |
                       |          |                                            --9.64%--zfs_btree_verify_height_helper
                       |          |
                       |           --1.28%--zfs_btree_verify
                       |
                        --2.60%--range_tree_remove_impl
                                  |
                                   --2.74%--zfs_btree_verify
                                             |
                                             |--1.96%--zfs_btree_verify_counts_helper
                                             |          |
                                             |           --1.95%--zfs_btree_verify_counts_helper
                                             |                     |
                                             |                      --1.48%--zfs_btree_verify_counts_helper
                                             |
                                              --0.92%--zfs_btree_verify_height_helper
                                                        |
                                                         --0.91%--zfs_btree_verify_height_helper

    61.77%     0.19%  libzpool.so.5.0.0  [.] range_tree_add_impl
            |
             --4.86%--range_tree_add_impl
                       |
                       |--3.53%--zfs_btree_add_idx
                       |          |
                       |           --3.79%--zfs_btree_verify
                       |                     |
                       |                     |--10.61%--zfs_btree_verify_pointers_helper
                       |                     |          |
                       |                     |           --10.43%--zfs_btree_verify_pointers_helper
                       |                     |
                       |                     |--9.66%--zfs_btree_verify_counts_helper
                       |                     |          |
                       |                     |           --32.99%--zfs_btree_verify_counts_helper
                       |                     |                     |
                       |                     |                      --24.75%--zfs_btree_verify_counts_helper
                       |                     |
                       |                      --6.31%--zfs_btree_verify_height_helper
                       |                                |
                       |                                 --15.22%--zfs_btree_verify_height_helper
                       |                                           |
                       |                                            --9.64%--zfs_btree_verify_height_helper
                       |
                        --1.28%--zfs_btree_verify

    42.84%    42.72%  libzpool.so.5.0.0  [.] zfs_btree_verify_counts_helper
            |
             --8.01%--0xa026258d4c544155
                       __libc_start_main
                       main
                       dump_zpool
                       dump_metaslab
                       metaslab_load
                       |
                       |--4.83%--range_tree_walk
                       |          metaslab_size_sorted_add
                       |          zfs_btree_add
                       |          zfs_btree_add_idx
                       |          zfs_btree_verify
                       |          zfs_btree_verify_counts_helper
                       |          |
                       |           --6.39%--zfs_btree_verify_counts_helper
                       |                     |
                       |                      --4.57%--zfs_btree_verify_counts_helper
                       |
                        --3.83%--space_map_load_length
                                  space_map_iterate
                                  space_map_load_callback
                                  |
                                  |--2.56%--range_tree_add_impl
                                  |          |
                                  |           --9.70%--zfs_btree_add_idx
                                  |                     zfs_btree_verify
                                  |                     zfs_btree_verify_counts_helper
                                  |                     |
                                  |                      --33.13%--zfs_btree_verify_counts_helper
                                  |                                |
                                  |                                 --24.85%--zfs_btree_verify_counts_helper
                                  |
                                   --1.56%--range_tree_remove_impl
                                             |
                                              --2.05%--zfs_btree_verify
                                                        zfs_btree_verify_counts_helper
                                                        |
                                                         --2.05%--zfs_btree_verify_counts_helper
                                                                   |
                                                                    --1.56%--zfs_btree_verify_counts_helper

    18.74%    18.69%  libzpool.so.5.0.0  [.] zfs_btree_verify_height_helper
            |
             --3.17%--0xa026258d4c544155
                       __libc_start_main
                       main
                       dump_zpool
                       dump_metaslab
                       metaslab_load
                       |
                       |--1.99%--space_map_load_length
                       |          space_map_iterate
                       |          space_map_load_callback
                       |          |
                       |           --1.42%--range_tree_add_impl
                       |                     |
                       |                      --6.34%--zfs_btree_add_idx
                       |                                zfs_btree_verify
                       |                                zfs_btree_verify_height_helper
                       |                                |
                       |                                 --15.27%--zfs_btree_verify_height_helper
                       |                                           |
                       |                                            --9.67%--zfs_btree_verify_height_helper
                       |
                        --1.54%--range_tree_walk
                                  metaslab_size_sorted_add
                                  zfs_btree_add
                                  zfs_btree_add_idx
                                  zfs_btree_verify
                                  zfs_btree_verify_height_helper
                                  |
                                   --1.91%--zfs_btree_verify_height_helper
                                             |
                                              --1.02%--zfs_btree_verify_height_helper

stale[bot] commented 1 year ago

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.