Jonathon's patch https://github.com/zfsonlinux/zfs/commit/e5db31349484e5e859c7a942eb15b98d68ce5b4d has solved the problem for me; I was able to complete a scrub in the usual 6 hours, with no exceptional dp_sync_taskq activity. I did this twice just to make sure :-)
I spoke too soon :-( The issue, while better, still remains.
Here's the latest top output, again demonstrating the dp_sync_taskq problem:
top - 16:07:09 up 8:33, 1 user, load average: 2.38, 2.55, 2.50
Tasks: 415 total, 1 running, 341 sleeping, 0 stopped, 1 zombie
%Cpu(s): 3.6 us, 4.4 sy, 0.0 ni, 80.5 id, 11.2 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem : 16323184 total, 4531884 free, 7944208 used, 3847092 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 6953072 avail Mem
...
1089 root 39 19 0 0 0 S 0.0 0.0 18:44.30 `- dp_sync_taskq
1090 root 39 19 0 0 0 S 0.0 0.0 18:15.41 `- dp_sync_taskq
1091 root 39 19 0 0 0 D 57.1 0.0 18:54.27 `- dp_sync_taskq
1092 root 39 19 0 0 0 S 0.0 0.0 19:08.11 `- dp_sync_taskq
1093 root 39 19 0 0 0 S 0.0 0.0 17:44.40 `- dp_sync_taskq
1094 root 39 19 0 0 0 S 0.0 0.0 19:46.78 `- dp_sync_taskq
The problem is weird; when there is other filesystem activity (e.g., an rsync), the scrub proceeds at a good pace: the scanned and issued rates are in the 350-500 MB/s range. But once the filesystem goes idle, scrub performance drops off and the CPU utilization of dp_sync_taskq climbs into the 50-75% range.
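A minimal way to watch this correlation over time (a sketch; the pool name main is taken from the zpool status output below, and GNU top from procps is assumed for the -b/-H flags):

# Log scrub throughput and dp_sync_taskq thread CPU once a minute.
while sleep 60; do
    date
    zpool status main | grep -E 'scanned|issued|repaired'
    top -b -H -n 1 | grep dp_sync_taskq
done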
# zpool status
pool: main
state: ONLINE
scan: scrub in progress since Sun Jul 28 07:36:33 2019
5.75T scanned at 358M/s, 4.67T issued at 290M/s, 6.89T total
0B repaired, 67.77% done, 0 days 02:13:34 to go
config:
NAME STATE READ WRITE CKSUM
main ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K1TVP9P2 ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K7KD384P ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K4SFJS7U ONLINE 0 0 0
errors: No known data errors
# zpool status
pool: main
state: ONLINE
scan: scrub in progress since Sun Jul 28 07:36:33 2019
5.85T scanned at 254M/s, 5.75T issued at 249M/s, 6.89T total
0B repaired, 83.51% done, 0 days 01:19:32 to go
config:
NAME STATE READ WRITE CKSUM
main ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K1TVP9P2 ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K7KD384P ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K4SFJS7U ONLINE 0 0 0
errors: No known data errors
# zpool status
pool: main
state: ONLINE
scan: scrub in progress since Sun Jul 28 07:36:33 2019
5.85T scanned at 200M/s, 5.85T issued at 200M/s, 6.89T total
0B repaired, 84.95% done, 0 days 01:30:25 to go
config:
NAME STATE READ WRITE CKSUM
main ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K1TVP9P2 ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K7KD384P ONLINE 0 0 0
ata-WDC_WD40EZRZ-00GXCB0_WD-WCC7K4SFJS7U ONLINE 0 0 0
errors: No known data errors
Notice that almost two hours later, the estimated completion has slipped by about 10 minutes and the reported speed has dropped by 50 MB/s.
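(Sanity-checking that estimate, assuming the usual binary units in zpool output: 6.89T − 5.85T ≈ 1.04T left to issue; at 200 MB/s that is about 1.04 × 1024 × 1024 ÷ 200 ≈ 5,450 seconds, roughly 1 hour 31 minutes, which matches the 1:30:25 shown.)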
The interesting thing about the type of filesystem activity required to boost the speed is that it doesn't have to involve the ZFS filesystem at all. My ZFS pool is mounted at /main, and the activity in question was heavy copying between a DVD drive and an NFS mount from a different machine. That's why I originally closed this bug: I figured "unrelated" filesystem activity couldn't be affecting ZFS performance, so the ZFS problem must have been fixed.
Do let me know what else I can provide.
Jonathon suggested I run haveged in the background to see if entropy was being exhausted. I did, but there was no difference in zpool scrub speed.
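A direct way to rule entropy in or out is to watch the kernel's entropy estimate while the scrub runs (a sketch; the proc path is standard on Linux):

# Values pinned near zero would suggest entropy starvation;
# a consistently healthy value (typically 1000+) rules it out.
watch -n 5 cat /proc/sys/kernel/random/entropy_avail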
Upgraded from an AMD FX-8320 to an AMD Ryzen 9 3900X (a serious boost in overall performance) but no change in scrub speed. In fact, it may have gotten worse; at one point the issued rate was down to several hundred bytes per second.
I updated a second machine to AMD 3900X and encrypted ZFS (4 x 6T RAIDZ1 configuration; the one described above is 3 x 4T RAIDZ1). This machine completes the scrub in just a little over FIVE hours, as opposed to SEVENTEEN previously! The second machine is a server, so it has more filesystem activity than the slow machine above. Just another datapoint :-) The slow machine continues to be slow...
Same thing on the latest Ubuntu 19.10 beta; eventually, after about half an hour, it slowed down to kilobytes/second.
I'm seeing the same issue here.
CPU: 2x Xeon L5640, 4x 4TB WD Red in raidz1
Ubuntu 19.10, kernel 5.3.0-24-generic, ZFS 0.8.1-1ubuntu14.2
Moved from LUKS encryption underneath ZFS to ZFS-native encryption on raidz1 over the bare drives.
Scrub used to finish in around 8 hours (depending on system usage); the current one has been running for over 18 hours. dp_sync_taskq is still very active and has accumulated 6h15m of runtime.
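To compare the fast and slow machines, the scan-related module tunables can be dumped like this (a sketch; these parameter names exist as of ZFS 0.8.x):

# zfs_scan_legacy selects the pre-0.8 scanner; zfs_scan_vdev_limit and
# zfs_vdev_scrub_max_active bound how much scan I/O is in flight per vdev.
grep . /sys/module/zfs/parameters/zfs_scan_legacy \
       /sys/module/zfs/parameters/zfs_scan_vdev_limit \
       /sys/module/zfs/parameters/zfs_vdev_scrub_max_active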
System information
Describe the problem you're observing
zpool scrub on an encrypted pool takes days; previously, the same scrub on an unencrypted pool took about 6 hours (pool info below). I tried with echo 1 > /sys/module/zfs/parameters/zfs_scan_legacy as well (that scrub is still in progress), but it too is taking a long time. Also noticed that dp_sync_taskq (distributed over 6 threads) takes 50-75% CPU for significant periods of time; this was on an otherwise idle system.
Describe how to reproduce the problem
Nothing special, just:
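(The command itself is collapsed in the original report; presumably it is just starting a scrub. A minimal sketch, using the pool name main from the status output and the zfs_scan_legacy experiment described above:)

# Start a scrub on the (encrypted) pool named "main".
zpool scrub main

# Optionally retry with the pre-0.8 scan code path; a scan that is
# already running keeps the method it started with, so restart it.
echo 1 > /sys/module/zfs/parameters/zfs_scan_legacy
zpool scrub -s main   # stop the in-progress scrub
zpool scrub main      # start again under the legacy scanner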
Include any warning/errors/backtraces from the system logs
Various logs for the "new" scrub (i.e., zfs_scan_legacy set to zero) that took several days to complete on a relatively modern 3+GHz AMD FX-8320 8-core processor are below.
zpool status
zpool status gives:
The previous version of ZFS on a non-encrypted partition would have been finished by now, whereas this one was less than 1% done, with sub-megabyte-per-second data rates.
dp_sync_taskq consuming CPU
I noticed that top shows dp_sync_taskq taking up a lot of CPU time:
Which particular thread of dp_sync_taskq takes up the CPU time alternates, as you can tell by the distribution of the "TIME+" column values.
zfs get all
zpool get all
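(Both property dumps are collapsed in the original report; they were presumably captured with something like the following, main being the pool and its top-level dataset:)

zfs get all main   > zfs-get-all.txt    # dataset properties
zpool get all main > zpool-get-all.txt  # pool properties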
Happy to supply any other info needed or run diags.