openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Scrub never completes after rsyncing from .zfs/snapshot dir #16321

Open · brenc opened this issue 4 months ago

brenc commented 4 months ago

System information

Type                  Version/Name
Distribution Name     FreeBSD
Distribution Version  14.1-RELEASE-p1
Kernel Version        14.1-RELEASE
Architecture          x86_64
OpenZFS Version       zfs-2.2.4-FreeBSD_g256659204, zfs-kmod-2.2.4-FreeBSD_g256659204

Describe the problem you're observing

We have a remote backup process that uses rsync to back up files from this server, reading from snapshots in the .zfs/snapshot directories. I noticed that after I started using this process, subsequent scrubs (which run once per month on the 1st) never complete. I have been troubleshooting this for months and can confidently say that accessing files from these snapshot dirs is what triggers the bug. If I stop the remote backup process, reboot the server, and then run a scrub, it completes normally in < 12 hours. As soon as I start the remote backup process again, subsequent scrubs never complete, even if left running for 5+ days.
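
To illustrate the access pattern: the backup host pulls files with rsync directly from a snapshot path under .zfs/snapshot. The command below is only a sketch; the host, dataset, and snapshot names are placeholders, not our exact ones:

# Illustrative only: run on the remote backup host, reading a snapshot dir on this server.
rsync -a zfs-server:/storage/e/<dataset>/.zfs/snapshot/<snapshot>/ /backups/<dataset>/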

I have tried pausing and resuming the scrub, as well as stopping it completely and restarting it. Nothing seems to correct the issue except a reboot. The scrub always gets to around 70-80% done and then hangs.
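
For reference, the pause/resume and stop/restart attempts correspond to the standard scrub controls, along these lines:

zpool scrub -p storage   # pause the in-progress scrub
zpool scrub storage      # resume a paused scrub (or start a new one)
zpool scrub -s storage   # stop the scrub entirely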

Some additional observations: scrub speed appears to be much reduced. This is what it looks like while stuck:

  pool: storage
 state: ONLINE
  scan: scrub in progress since Mon Jul  1 00:00:00 2024
        73.3T / 102T scanned at 595M/s, 73.3T / 102T issued at 595M/s
        0B repaired, 71.81% done, 14:05:39 to go

A normal scrub runs at ~3 GB/s. There is definitely still high read I/O on the drives:

                        capacity     operations     bandwidth    syncq_read    syncq_write   asyncq_read  asyncq_write   scrubq_read   trimq_write  rebuildq_write
pool                  alloc   free   read  write   read  write   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ
--------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
storage                102T   291T  4.87K      0  52.9M      0      0      0      0      0      0      0     97     48      0      0      0      0      0      0
  raidz2-0            25.6T  72.6T  1.19K      0  12.6M      0      0      0      0      0      0      0     18     12      0      0      0      0      0      0
    DRIVE      -      -    192      0  1.88M      0      0      0      0      0      0      0      5      2      0      0      0      0      0      0
    DRIVE      -      -    213      0  2.30M    807      0      0      0      0      0      0      2      2      0      0      0      0      0      0
    DRIVE      -      -    180      0  1.72M    807      0      0      0      0      0      0      1      2      0      0      0      0      0      0
    DRIVE      -      -    203      0  1.99M    807      0      0      0      0      0      0      1      2      0      0      0      0      0      0
    DRIVE      -      -    231      0  2.48M    807      0      0      0      0      0      0      1      2      0      0      0      0      0      0
    DRIVE      -      -    194      0  2.21M    807      0      0      0      0      0      0      2      2      0      0      0      0      0      0
  raidz2-1            25.6T  72.6T  1.27K      0  13.9M  3.94K      0      0      0      0      0      0     49     12      0      0      0      0      0      0
    DRIVE      -      -    202      0  2.34M    807      0      0      0      0      0      0      9      2      0      0      0      0      0      0
    DRIVE      -      -    214      0  2.26M    807      0      0      0      0      0      0      6      2      0      0      0      0      0      0
    DRIVE      -      -    202      0  1.94M    807      0      0      0      0      0      0      5      2      0      0      0      0      0      0
    DRIVE      -      -    198      0  1.88M  1.58K      0      0      0      0      0      0      2      2      0      0      0      0      0      0
    DRIVE      -      -    253      0  2.61M  1.58K      0      0      0      0      0      0      1      2      0      0      0      0      0      0
    DRIVE      -      -    226      0  2.87M  1.58K      0      0      0      0      0      0      9      2      0      0      0      0      0      0
  raidz2-2            25.6T  72.6T  1.10K      2  12.1M  9.47K      0      0      0      0      0      0     72     12      0      0      0      0      0      0
    DRIVE      -      -    211      0  2.18M  1.58K      0      0      0      0      0      0     12      2      0      0      0      0      0      0
    DRIVE      -      -    172      0  2.06M  1.58K      0      0      0      0      0      0     12      2      0      0      0      0      0      0
    DRIVE      -      -    189      0  2.17M  5.52K      0      0      0      0      0      0     10      2      0      0      0      0      0      0
    DRIVE      -      -    207      0  2.13M  1.58K      0      0      0      0      0      0     13      2      0      0      0      0      0      0
    DRIVE      -      -    161      0  1.76M  1.58K      0      0      0      0      0      0     13      2      0      0      0      0      0      0
    DRIVE      -      -    187      0  1.84M  1.58K      0      0      0      0      0      0      9      2      0      0      0      0      0      0
  raidz2-3            25.6T  72.6T  1.31K      4  14.3M  31.6K      0      0      0      0      0      0     15     12      0      0      0      0      0      0
    DRIVE      -      -    205      0  2.01M  3.94K      0      0      0      0      0      0      2      2      0      0      0      0      0      0
    DRIVE      -      -    215      0  2.28M  3.94K      0      0      0      0      0      0      3      2      0      0      0      0      0      0
    DRIVE      -      -    235      0  2.46M  6.31K      0      0      0      0      0      0      5      2      0      0      0      0      0      0
    DRIVE      -      -    218      0  2.37M  9.47K      0      0      0      0      0      0      3      2      0      0      0      0      0      0
    DRIVE      -      -    220      0  2.49M  3.94K      0      0      0      0      0      0      1      2      0      0      0      0      0      0
    DRIVE      -      -    242      0  2.67M  3.94K      0      0      0      0      0      0      1      2      0      0      0      0      0      0
--------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
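
For reference, per-vdev queue statistics like those above can be gathered with zpool iostat's -q and -v options; a sketch of the kind of invocation, with an arbitrary 5-second interval:

zpool iostat -q -v storage 5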

The scrub appears to be running even when no progress is made. % done never advances.

ZFS replication (this server pulls from the primary) essentially grinds to a halt as well.

Pool config:

# zpool get all storage
NAME     PROPERTY                       VALUE                          SOURCE
storage  size                           393T                           -
storage  capacity                       26%                            -
storage  altroot                        -                              default
storage  health                         ONLINE                         -
storage  guid                           15571001037900998829           -
storage  version                        -                              default
storage  bootfs                         -                              default
storage  delegation                     on                             default
storage  autoreplace                    off                            default
storage  cachefile                      -                              default
storage  failmode                       wait                           default
storage  listsnapshots                  off                            default
storage  autoexpand                     off                            default
storage  dedupratio                     1.00x                          -
storage  free                           291T                           -
storage  allocated                      102T                           -
storage  readonly                       off                            -
storage  ashift                         12                             local
storage  comment                        -                              default
storage  expandsize                     -                              -
storage  freeing                        0                              -
storage  fragmentation                  3%                             -
storage  leaked                         0                              -
storage  multihost                      off                            default
storage  checkpoint                     -                              -
storage  load_guid                      4612684389734077050            -
storage  autotrim                       off                            default
storage  compatibility                  off                            default
storage  bcloneused                     0                              -
storage  bclonesaved                    0                              -
storage  bcloneratio                    1.00x                          -
storage  feature@async_destroy          enabled                        local
storage  feature@empty_bpobj            active                         local
storage  feature@lz4_compress           active                         local
storage  feature@multi_vdev_crash_dump  enabled                        local
storage  feature@spacemap_histogram     active                         local
storage  feature@enabled_txg            active                         local
storage  feature@hole_birth             active                         local
storage  feature@extensible_dataset     active                         local
storage  feature@embedded_data          active                         local
storage  feature@bookmarks              enabled                        local
storage  feature@filesystem_limits      enabled                        local
storage  feature@large_blocks           active                         local
storage  feature@large_dnode            enabled                        local
storage  feature@sha512                 enabled                        local
storage  feature@skein                  enabled                        local
storage  feature@edonr                  enabled                        local
storage  feature@userobj_accounting     active                         local
storage  feature@encryption             active                         local
storage  feature@project_quota          active                         local
storage  feature@device_removal         enabled                        local
storage  feature@obsolete_counts        enabled                        local
storage  feature@zpool_checkpoint       enabled                        local
storage  feature@spacemap_v2            active                         local
storage  feature@allocation_classes     enabled                        local
storage  feature@resilver_defer         enabled                        local
storage  feature@bookmark_v2            enabled                        local
storage  feature@redaction_bookmarks    enabled                        local
storage  feature@redacted_datasets      enabled                        local
storage  feature@bookmark_written       enabled                        local
storage  feature@log_spacemap           active                         local
storage  feature@livelist               enabled                        local
storage  feature@device_rebuild         enabled                        local
storage  feature@zstd_compress          enabled                        local
storage  feature@draid                  enabled                        local
storage  feature@zilsaxattr             active                         local
storage  feature@head_errlog            active                         local
storage  feature@blake3                 enabled                        local
storage  feature@block_cloning          enabled                        local
storage  feature@vdev_zaps_v2           active                         local

I don't know if it matters, but I'm using encryption. Here are the encryption root settings:

# zfs get all storage/e
NAME       PROPERTY              VALUE                           SOURCE
storage/e  type                  filesystem                      -
storage/e  creation              Wed Sep  6 15:10 2023           -
storage/e  used                  68.2T                           -
storage/e  available             193T                            -
storage/e  referenced            432K                            -
storage/e  compressratio         1.05x                           -
storage/e  mounted               yes                             -
storage/e  quota                 none                            default
storage/e  reservation           none                            default
storage/e  recordsize            1M                              inherited from storage
storage/e  mountpoint            /storage/e                      default
storage/e  sharenfs              off                             default
storage/e  checksum              on                              default
storage/e  compression           lz4                             inherited from storage
storage/e  atime                 off                             inherited from storage
storage/e  devices               on                              default
storage/e  exec                  off                             inherited from storage
storage/e  setuid                off                             inherited from storage
storage/e  readonly              off                             default
storage/e  jailed                off                             default
storage/e  snapdir               visible                         inherited from storage
storage/e  aclmode               discard                         default
storage/e  aclinherit            restricted                      default
storage/e  createtxg             62704                           -
storage/e  canmount              on                              default
storage/e  xattr                 on                              default
storage/e  copies                1                               default
storage/e  version               5                               -
storage/e  utf8only              off                             -
storage/e  normalization         none                            -
storage/e  casesensitivity       sensitive                       -
storage/e  vscan                 off                             default
storage/e  nbmand                off                             default
storage/e  sharesmb              off                             default
storage/e  refquota              none                            default
storage/e  refreservation        none                            default
storage/e  guid                  14413502527107633131            -
storage/e  primarycache          all                             default
storage/e  secondarycache        all                             default
storage/e  usedbysnapshots       0B                              -
storage/e  usedbydataset         432K                            -
storage/e  usedbychildren        68.2T                           -
storage/e  usedbyrefreservation  0B                              -
storage/e  logbias               latency                         default
storage/e  objsetid              282                             -
storage/e  dedup                 off                             default
storage/e  mlslabel              none                            default
storage/e  sync                  standard                        default
storage/e  dnodesize             legacy                          default
storage/e  refcompressratio      1.00x                           -
storage/e  written               432K                            -
storage/e  logicalused           70.7T                           -
storage/e  logicalreferenced     103K                            -
storage/e  volmode               default                         default
storage/e  filesystem_limit      none                            default
storage/e  snapshot_limit        none                            default
storage/e  filesystem_count      none                            default
storage/e  snapshot_count        none                            default
storage/e  snapdev               hidden                          default
storage/e  acltype               nfsv4                           default
storage/e  context               none                            default
storage/e  fscontext             none                            default
storage/e  defcontext            none                            default
storage/e  rootcontext           none                            default
storage/e  relatime              on                              default
storage/e  redundant_metadata    all                             default
storage/e  overlay               on                              default
storage/e  encryption            aes-256-gcm                     -
storage/e  keylocation           file:///etc/zfs_encryption_key  local
storage/e  keyformat             passphrase                      -
storage/e  pbkdf2iters           350000                          -
storage/e  encryptionroot        storage/e                       -
storage/e  keystatus             available                       -
storage/e  special_small_blocks  0                               default
storage/e  prefetch              all                             default

Describe how to reproduce the problem

rsync files from the .zfs/snapshot directories, then run a scrub.
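
A rough command-level sketch (dataset, snapshot, and destination names are placeholders; in our setup the reads come from a remote rsync pull, shown locally here just to keep the sketch self-contained):

zfs snapshot storage/e/<dataset>@repro
rsync -a /storage/e/<dataset>/.zfs/snapshot/repro/ /some/destination/
zpool scrub storage
zpool status storage   # progress stalls around 70-80% done and never completes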

Include any warning/errors/backtraces from the system logs

There are no relevant logs as far as I can tell.

I believe that's all the information I have available on this right now. Please let me know if you need anything else.

awused commented 2 months ago

I think I triggered this same bug but from cancelling a scrub instead, also on FreeBSD 14.1 with the exact same OpenZFS version.

I cancelled and restarted a scrub; it then got stuck at a certain point, and the symptoms seem to match. The "issued" number and the completion percentage made no progress over about 12 hours before I gave up. The "scanned" number was going up, but only by about 5-10 MB/s over those same hours, and I was seeing 1-2k scrub read IOPS on the pool but only 10-15 MB/s of reads.

Restarting the scrub just caused it to get stuck at the same point again, and pausing/unpausing seemed to make a little progress, but it got stuck again slightly later. After restarting the machine, scrubbing is back to normal.

brenc commented 1 month ago

This is still happening for us. This month I paused the scrub and rebooted, and even though the remote backup process immediately reconnected and started pulling data from the snapshot dirs, the scrub completed. I'm having to reboot once per month.

exscape commented 1 month ago

I seem to be having the same bug, but I've never figured out what triggers it. I have two similar systems with ZFS, and on one of them, the monthly scrub has gotten stuck every single month now for maybe 6-7 months in a row. On the other, it has only happened once -- today.
Both run Debian bookworm with ZFS from backports.

Currently (on both systems):
Kernel: 6.1.0-26-amd64 (Debian 6.1.112-1)
ZFS: zfs-2.2.6-1~bpo12+2

Both are kept updated, so this has happened on older kernels and ZFS versions as well.
Both run ZFS native encryption (on the pool root). One is a 5-disk RAIDZ2 and the other (the one that hung today) is a 2-disk mirror. The second system also has a 1-disk zpool that is still scrubbing properly.

It's possible that it's related to the snapshot dir for me as well, as I use it now and then... but I'm doubtful that I use it every single month, as I don't have any automated tools that use it. There hasn't been a monthly scrub on system 1 (RAIDZ2) that has finished by itself since possibly early this year.
Pausing the scrub and rebooting always solves the issue, and as far as I can remember, I've never needed more than one reboot for a scrub to finish. It never seems to get stuck again, until next month.

The zpool status output sometimes progresses a minuscule amount, like 0.01% per 15+ minutes, but typically stands still. The (average) speed displayed keeps going down towards 0 MB/s, and the remaining time keeps going up, as no real progress is being made. The disks are still working quite hard, but I've no idea what they are actually doing.

This is what it looks like when it's working (iowait <2%, r/s >500 for most disks):

serenity@hyperion:~ $ iostat -mxy 30
Linux 6.1.0-23-amd64 (hyperion)         08/11/2024      _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.49    0.00    7.35    0.96    0.00   91.19

Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda            922.60     65.89     1.27   0.14    0.86    73.13    9.43      0.08     0.00   0.00    3.24     8.51    0.00      0.00     0.00   0.00    0.00     0.00    0.33   88.10    0.85  34.25
sdb            904.43     65.88     1.30   0.14    0.89    74.59    9.27      0.08     0.00   0.00    3.48     8.83    0.00      0.00     0.00   0.00    0.00     0.00    0.33   93.00    0.87  35.00
sdc             68.33     65.98     0.00   0.00   41.71   988.76    9.57      0.08     0.00   0.00    1.77     8.56    0.00      0.00     0.00   0.00    0.00     0.00    0.33   46.20    2.88  96.53
sdd            882.97     65.90     1.50   0.17    1.01    76.43    9.43      0.08     0.00   0.00    3.08     8.71    0.00      0.00     0.00   0.00    0.00     0.00    0.33   83.30    0.94  37.19
sde            584.67     66.07     0.37   0.06    2.48   115.71    9.57      0.08     0.03   0.35    3.38     8.70    0.00      0.00     0.00   0.00    0.00     0.00    0.33   90.90    1.51  55.33

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.35    0.00    6.42    1.52    0.00   91.71

Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda            745.07     63.06     1.03   0.14    1.46    86.67   10.17      0.09     0.00   0.00    3.79     9.13    0.00      0.00     0.00   0.00    0.00     0.00    0.40   86.92    1.16  43.68
sdb            754.57     63.09     0.80   0.11    1.43    85.62   10.37      0.09     0.00   0.00    2.86     9.02    0.00      0.00     0.00   0.00    0.00     0.00    0.40   66.67    1.13  42.37
sdc             84.13     63.31     0.00   0.00   33.47   770.56   10.63      0.09     0.00   0.00    1.84     8.59    0.00      0.00     0.00   0.00    0.00     0.00    0.40   43.75    2.85  95.44
sdd            664.97     63.07     1.00   0.15    2.01    97.12   10.27      0.09     0.00   0.00    3.29     8.90    0.00      0.00     0.00   0.00    0.00     0.00    0.40   76.08    1.40  50.20
sde            440.00     62.99     0.27   0.06    3.97   146.60   10.83      0.09     0.00   0.00    4.74     8.63    0.00      0.00     0.00   0.00    0.00     0.00    0.40   77.75    1.83  64.47

Below, it has stopped (iowait >10%, r/s <200 for all disks):

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.56    0.00    3.21   11.95    0.00   84.28

Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda            161.87      5.57     0.00   0.00   17.10    35.23    4.57      0.05     0.00   0.00    4.91    10.39    0.00      0.00     0.00   0.00    0.00     0.00    0.20   89.83    2.81  93.91
sdb            159.80      5.70     0.00   0.00   17.75    36.53    4.67      0.05     0.00   0.00    4.03    10.06    0.00      0.00     0.00   0.00    0.00     0.00    0.20   77.83    2.87  95.64
sdc            150.60      5.50     0.00   0.00   15.60    37.40    5.13      0.05     0.03   0.65    1.86     9.35    0.00      0.00     0.00   0.00    0.00     0.00    0.20   41.00    2.37  78.81
sdd            162.23      5.72     0.00   0.00   17.85    36.10    5.00      0.05     0.00   0.00    4.00     9.52    0.00      0.00     0.00   0.00    0.00     0.00    0.20   92.67    2.93  97.68
sde            151.87      5.57     0.00   0.00   14.42    37.58    4.80      0.05     0.00   0.00    3.72    10.08    0.00      0.00     0.00   0.00    0.00     0.00    0.20   83.83    2.23  73.96

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.42    0.00    2.26   11.23    0.00   86.10

Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda            166.73      5.85     0.00   0.00   17.29    35.92    3.50      0.03     0.00   0.00    3.99     9.07    0.00      0.00     0.00   0.00    0.00     0.00    0.13   84.00    2.91  96.89
sdb            165.30      6.00     0.00   0.00   17.31    37.17    3.53      0.03     0.07   1.85    3.27     8.83    0.00      0.00     0.00   0.00    0.00     0.00    0.13   79.75    2.88  96.15
sdc            163.43      5.78     0.00   0.00   15.55    36.24    3.70      0.03     0.00   0.00    1.54     8.29    0.00      0.00     0.00   0.00    0.00     0.00    0.13   33.50    2.55  84.71
sdd            166.33      5.96     0.00   0.00   17.54    36.69    3.80      0.03     0.00   0.00    3.67     8.39    0.00      0.00     0.00   0.00    0.00     0.00    0.13   92.25    2.94  98.04
sde            163.93      5.72     0.00   0.00   14.48    35.75    3.63      0.03     0.00   0.00    3.55     8.55    0.00      0.00     0.00   0.00    0.00     0.00    0.13   79.50    2.40  79.45

After a reboot, iostat returns to the pre-hang numbers where most disks have 500+ r/s and the scrub proceeds normally, finishing in maybe 12 hours. Without a reboot it won't ever finish.

This is what the second system looks like when the scrub is stuck. The mirror that is stuck is sdb+sdc; sda is a 1-disk zpool that doesn't need redundancy, and the scrub keeps working properly on that one. sdd is the boot SSD.

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.48   15.50    0.00   79.02

Device            r/s     rMB/s   rrqm/s  %rrqm r_await rareq-sz     w/s     wMB/s   wrqm/s  %wrqm w_await wareq-sz     d/s     dMB/s   drqm/s  %drqm d_await dareq-sz     f/s f_await  aqu-sz  %util
sda            173.00    171.57     0.00   0.00   17.26  1015.51    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    2.99  98.60
sdb            185.00     12.57     0.00   0.00   14.65    69.56   33.50      0.46     0.00   0.00    2.12    14.03    0.00      0.00     0.00   0.00    0.00     0.00    1.00   65.50    2.85  93.00
sdc            199.00     12.15     0.00   0.00   13.13    62.51   34.00      0.46     0.00   0.00    3.06    13.82    0.00      0.00     0.00   0.00    0.00     0.00    1.00   98.00    2.82  97.00
sdd              0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00      0.00     0.00   0.00    0.00     0.00    0.00    0.00    0.00   0.00

The scrubs haven't ever found any issues while this has happened, and AFAIK the pools keep working as usual during the stuck scrubs.

Properties for the RAIDZ2 pool that has been getting stuck every month:

$ zpool get all laniakea
NAME      PROPERTY                       VALUE                          SOURCE
laniakea  size                           18.2T                          -
laniakea  capacity                       85%                            -
laniakea  altroot                        -                              default
laniakea  health                         ONLINE                         -
laniakea  guid                           7937552882459901146            -
laniakea  version                        -                              default
laniakea  bootfs                         -                              default
laniakea  delegation                     on                             default
laniakea  autoreplace                    off                            default
laniakea  cachefile                      -                              default
laniakea  failmode                       wait                           default
laniakea  listsnapshots                  off                            default
laniakea  autoexpand                     off                            default
laniakea  dedupratio                     1.00x                          -
laniakea  free                           2.64T                          -
laniakea  allocated                      15.6T                          -
laniakea  readonly                       off                            -
laniakea  ashift                         12                             local
laniakea  comment                        -                              default
laniakea  expandsize                     -                              -
laniakea  freeing                        0                              -
laniakea  fragmentation                  10%                            -
laniakea  leaked                         0                              -
laniakea  multihost                      off                            default
laniakea  checkpoint                     -                              -
laniakea  load_guid                      5213226827043222604            -
laniakea  autotrim                       off                            default
laniakea  compatibility                  off                            default
laniakea  bcloneused                     0                              -
laniakea  bclonesaved                    0                              -
laniakea  bcloneratio                    1.00x                          -
laniakea  feature@async_destroy          enabled                        local
laniakea  feature@empty_bpobj            active                         local
laniakea  feature@lz4_compress           active                         local
laniakea  feature@multi_vdev_crash_dump  enabled                        local
laniakea  feature@spacemap_histogram     active                         local
laniakea  feature@enabled_txg            active                         local
laniakea  feature@hole_birth             active                         local
laniakea  feature@extensible_dataset     active                         local
laniakea  feature@embedded_data          active                         local
laniakea  feature@bookmarks              enabled                        local
laniakea  feature@filesystem_limits      enabled                        local
laniakea  feature@large_blocks           enabled                        local
laniakea  feature@large_dnode            active                         local
laniakea  feature@sha512                 enabled                        local
laniakea  feature@skein                  enabled                        local
laniakea  feature@edonr                  enabled                        local
laniakea  feature@userobj_accounting     active                         local
laniakea  feature@encryption             active                         local
laniakea  feature@project_quota          active                         local
laniakea  feature@device_removal         enabled                        local
laniakea  feature@obsolete_counts        enabled                        local
laniakea  feature@zpool_checkpoint       enabled                        local
laniakea  feature@spacemap_v2            active                         local
laniakea  feature@allocation_classes     enabled                        local
laniakea  feature@resilver_defer         enabled                        local
laniakea  feature@bookmark_v2            enabled                        local
laniakea  feature@redaction_bookmarks    enabled                        local
laniakea  feature@redacted_datasets      enabled                        local
laniakea  feature@bookmark_written       enabled                        local
laniakea  feature@log_spacemap           active                         local
laniakea  feature@livelist               enabled                        local
laniakea  feature@device_rebuild         enabled                        local
laniakea  feature@zstd_compress          active                         local
laniakea  feature@draid                  enabled                        local
laniakea  feature@zilsaxattr             active                         local
laniakea  feature@head_errlog            active                         local
laniakea  feature@blake3                 disabled                       local
laniakea  feature@block_cloning          disabled                       local
laniakea  feature@vdev_zaps_v2           disabled                       local
$ zfs get all laniakea
NAME      PROPERTY              VALUE                     SOURCE
laniakea  type                  filesystem                -
laniakea  creation              Fri Aug 18  2:14 2023     -
laniakea  used                  9.20T                     -
laniakea  available             1.44T                     -
laniakea  referenced            746K                      -
laniakea  compressratio         1.03x                     -
laniakea  mounted               yes                       -
laniakea  quota                 none                      default
laniakea  reservation           none                      default
laniakea  recordsize            128K                      default
laniakea  mountpoint            /laniakea                 default
laniakea  sharenfs              off                       default
laniakea  checksum              on                        default
laniakea  compression           zstd                      local
laniakea  atime                 off                       local
laniakea  devices               off                       local
laniakea  exec                  off                       local
laniakea  setuid                off                       local
laniakea  readonly              off                       default
laniakea  zoned                 off                       default
laniakea  snapdir               hidden                    local
laniakea  aclmode               discard                   default
laniakea  aclinherit            restricted                default
laniakea  createtxg             1                         -
laniakea  canmount              on                        default
laniakea  xattr                 sa                        local
laniakea  copies                1                         default
laniakea  version               5                         -
laniakea  utf8only              off                       -
laniakea  normalization         none                      -
laniakea  casesensitivity       sensitive                 -
laniakea  vscan                 off                       default
laniakea  nbmand                off                       default
laniakea  sharesmb              off                       default
laniakea  refquota              none                      default
laniakea  refreservation        none                      default
laniakea  guid                  16392863579896121217      -
laniakea  primarycache          all                       default
laniakea  secondarycache        all                       default
laniakea  usedbysnapshots       0B                        -
laniakea  usedbydataset         746K                      -
laniakea  usedbychildren        9.20T                     -
laniakea  usedbyrefreservation  0B                        -
laniakea  logbias               latency                   default
laniakea  objsetid              54                        -
laniakea  dedup                 off                       default
laniakea  mlslabel              none                      default
laniakea  sync                  standard                  default
laniakea  dnodesize             auto                      local
laniakea  refcompressratio      1.23x                     -
laniakea  written               0                         -
laniakea  logicalused           9.49T                     -
laniakea  logicalreferenced     476K                      -
laniakea  volmode               default                   default
laniakea  filesystem_limit      none                      default
laniakea  snapshot_limit        none                      default
laniakea  filesystem_count      none                      default
laniakea  snapshot_count        none                      default
laniakea  snapdev               hidden                    default
laniakea  acltype               off                       default
laniakea  context               none                      default
laniakea  fscontext             none                      default
laniakea  defcontext            none                      default
laniakea  rootcontext           none                      default
laniakea  relatime              on                        default
laniakea  redundant_metadata    all                       default
laniakea  overlay               on                        default
laniakea  encryption            aes-256-gcm               -
laniakea  keylocation           prompt                    local
laniakea  keyformat             passphrase                -
laniakea  pbkdf2iters           2000000                   -
laniakea  encryptionroot        laniakea                  -
laniakea  keystatus             available                 -
laniakea  special_small_blocks  0                         default
laniakea  snapshots_changed     Sun Oct 13  9:02:30 2024  -
laniakea  prefetch              all                       default

Update: The scrub on the second machine finished after just running zpool scrub -p followed by zpool scrub to resume, so perhaps a reboot is not necessary at all.
For the time being I added a cron job that pauses and resumes any active scrubs every now and then as an ugly workaround... I guess we'll see how well it works in a month or two.
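
For example, something like this in /etc/cron.d (pool name and schedule are illustrative; chaining with && means the resume only runs if the pause succeeded, so it won't start a scrub on an idle pool):

# Hypothetical workaround: every 6 hours, pause and immediately resume an in-progress scrub.
0 */6 * * *  root  zpool scrub -p laniakea && sleep 30 && zpool scrub laniakea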