openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

zpool hangs after some load with zvols #9693

Closed: glztmf closed this issue 3 years ago

glztmf commented 4 years ago

System information

Type Version/Name
Distribution Name Debian
Distribution Version 10 (buster)
Linux Kernel 5.0.15-1-pve
Architecture x86_64
ZFS Version 0.8.1-pve1
SPL Version 0.8.1-pve1

Describe the problem you're observing

Some operations on a specific zpool and its zvols hang, and the virtual machines freeze.

Describe how to reproduce the problem

Install PVE 6.0-4. Create an additional mirror zpool. Create several VMs with virtual disks on the new zpool. Put load on the VMs' disks and wait a few days.
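The pool and zvol creation steps correspond to commands like the following (taken from the pool history shown later in this report; device names and sizes are specific to that host):

```shell
# Sketch of the reproduction setup; device names (sde..sdh) and the zvol
# size come from the reporter's pool history, adjust for actual hardware.
zpool create zpl2 mirror sde sdf mirror sdg sdh   # additional mirror pool
zfs create -V 209715200k zpl2/vm-100-disk-0       # zvol backing a VM disk
```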

Include any warning/errors/backtraces from the system logs

Inside the VMs' terminals:

INFO: task jbd2/vda1-8:478 blocked for more than 120 seconds.
INFO: task systemd-journal:563 blocked for more than 120 seconds.
...

Processes:

# ps aux | egrep ' D | D< |ST[A]T'
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       485  0.0  0.0      0     0 ?        D<   Dec05   2:17 [zvol]
root      4323  0.0  0.0      0     0 ?        D    Dec05   0:01 [txg_quiesce]
root     18842  1.2  0.0      0     0 ?        D    13:14   0:45 [md15_resync]
root     28115  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28117  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28119  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28120  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28126  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28127  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28128  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28129  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28130  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28131  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28132  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28133  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28134  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28135  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28136  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28137  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28138  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28139  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28140  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28141  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28142  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28143  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28145  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28150  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28151  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28152  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28153  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28154  0.0  0.0      0     0 ?        D<   Dec06   0:29 [zvol]
root     28155  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28156  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
root     28157  0.0  0.0      0     0 ?        D<   Dec06   0:00 [zvol]
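A quick way to keep an eye on the number of such blocked tasks is a small filter over `ps` state output (a sketch; `count_d_state` is a hypothetical helper name, demonstrated here on a captured listing rather than the live host):

```shell
# Count tasks whose state starts with D (uninterruptible sleep) in
# `ps -e -o state=,comm=` output. Hypothetical helper, not from the report.
count_d_state() {
    awk '$1 ~ /^D/ { n++ } END { print n+0 }'
}

# Demonstrated on a captured snippet; on the live host pipe in:
#   ps -e -o state=,comm= | count_d_state
printf 'D zvol\nD< zvol\nS bash\nR awk\n' | count_d_state   # prints 2
```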

from dmesg:

[Fri Dec  6 16:30:23 2019] INFO: task zvol:485 blocked for more than 120 seconds.
[Fri Dec  6 16:30:23 2019]       Tainted: P          IO      5.0.15-1-pve #1
[Fri Dec  6 16:30:23 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Dec  6 16:30:23 2019] zvol            D    0   485      2 0x80000000
[Fri Dec  6 16:30:23 2019] Call Trace:
[Fri Dec  6 16:30:23 2019]  __schedule+0x2d4/0x870
[Fri Dec  6 16:30:23 2019]  ? try_to_wake_up+0x59/0x4f0
[Fri Dec  6 16:30:23 2019]  schedule+0x2c/0x70
[Fri Dec  6 16:30:23 2019]  rwsem_down_write_failed+0x160/0x340
[Fri Dec  6 16:30:23 2019]  call_rwsem_down_write_failed+0x17/0x30
[Fri Dec  6 16:30:23 2019]  ? spl_kmem_free+0x33/0x40 [spl]
[Fri Dec  6 16:30:23 2019]  down_write+0x2d/0x40
[Fri Dec  6 16:30:23 2019]  dmu_zfetch+0x134/0x590 [zfs]
[Fri Dec  6 16:30:23 2019]  dmu_buf_hold_array_by_dnode+0x379/0x450 [zfs]
[Fri Dec  6 16:30:23 2019]  dmu_write_uio_dnode+0x4c/0x140 [zfs]
[Fri Dec  6 16:30:23 2019]  zvol_write+0x190/0x620 [zfs]
[Fri Dec  6 16:30:23 2019]  taskq_thread+0x2ec/0x4d0 [spl]
[Fri Dec  6 16:30:23 2019]  ? wake_up_q+0x80/0x80
[Fri Dec  6 16:30:23 2019]  kthread+0x120/0x140
[Fri Dec  6 16:30:23 2019]  ? task_done+0xb0/0xb0 [spl]
[Fri Dec  6 16:30:23 2019]  ? __kthread_parkme+0x70/0x70
[Fri Dec  6 16:30:23 2019]  ret_from_fork+0x35/0x40
[Fri Dec  6 16:30:23 2019] INFO: task txg_quiesce:4323 blocked for more than 120 seconds.
[Fri Dec  6 16:30:23 2019]       Tainted: P          IO      5.0.15-1-pve #1
[Fri Dec  6 16:30:23 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Dec  6 16:30:23 2019] txg_quiesce     D    0  4323      2 0x80000000
[Fri Dec  6 16:30:23 2019] Call Trace:
[Fri Dec  6 16:30:23 2019]  __schedule+0x2d4/0x870
[Fri Dec  6 16:30:23 2019]  schedule+0x2c/0x70
[Fri Dec  6 16:30:23 2019]  cv_wait_common+0x104/0x130 [spl]
[Fri Dec  6 16:30:23 2019]  ? wait_woken+0x80/0x80
[Fri Dec  6 16:30:23 2019]  __cv_wait+0x15/0x20 [spl]
[Fri Dec  6 16:30:23 2019]  txg_quiesce_thread+0x2ac/0x3a0 [zfs]
[Fri Dec  6 16:30:23 2019]  ? txg_sync_thread+0x4c0/0x4c0 [zfs]
[Fri Dec  6 16:30:23 2019]  thread_generic_wrapper+0x74/0x90 [spl]
[Fri Dec  6 16:30:23 2019]  kthread+0x120/0x140
[Fri Dec  6 16:30:23 2019]  ? __thread_exit+0x20/0x20 [spl]
[Fri Dec  6 16:30:23 2019]  ? __kthread_parkme+0x70/0x70
[Fri Dec  6 16:30:23 2019]  ret_from_fork+0x35/0x40

zfs create command hangs:

# strace zfs create zpl2/test2
stat("/sys/module/zfs/features.pool/org.zfsonlinux:allocation_classes", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
stat("/sys/module/zfs/features.pool/com.datto:resilver_defer", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 6
fstat(6, {st_mode=S_IFREG|0644, st_size=2995, ...}) = 0
read(6, "# Locale name alias data base.\n#"..., 3072) = 2995
read(6, "", 3072)                       = 0
close(6)                                = 0
openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x12, 0), 0x7fff81e9c610) = 0
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f154c79b000
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x12, 0), 0x7fff81e99010) = 0
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x5, 0), 0x7fff81e95980) = 0
brk(0x559c52bc3000)                     = 0x559c52bc3000
munmap(0x7f154c79b000, 266240)          = 0
brk(0x559c52c0c000)                     = 0x559c52c0c000
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x12, 0), 0x7fff81e9cb00) = -1 ENOENT (No such file or directory)
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x5, 0), 0x7fff81e9cac0) = 0
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x12, 0), 0x7fff81e9c690) = 0
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x27, 0), 0x7fff81e9c610) = 0
ioctl(5, _IOC(_IOC_NONE, 0x5a, 0x17, 0)

Creating a snapshot hangs the same way:

# strace zfs snapshot zpl2/vm-1414-disk-1@test
execve("/usr/sbin/zfs", ["zfs", "snapshot", "zpl2/vm-1414-disk-1@test"], 0x7fffde7df1d0 /* 21 vars */) = 0
brk(NULL)                               = 0x55e876424000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
stat("/sys/module/zfs/features.pool/com.datto:resilver_defer", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 6
fstat(6, {st_mode=S_IFREG|0644, st_size=2995, ...}) = 0
read(6, "# Locale name alias data base.\n#"..., 3072) = 2995
read(6, "", 3072)                       = 0
close(6)                                = 0
openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
mmap(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7faaa3621000
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x12, 0), 0x7ffd4c711990) = 0
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x5, 0), 0x7ffd4c70e300) = 0
brk(0x55e876466000)                     = 0x55e876466000
munmap(0x7faaa3621000, 266240)          = 0
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x5, 0), 0x7ffd4c711920) = 0
brk(0x55e87648f000)                     = 0x55e87648f000
ioctl(5, _IOC(_IOC_NONE, 0x5a, 0x23, 0)

zpool history command hangs:

stat("/sys/module/zfs/features.pool/org.zfsonlinux:project_quota", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
stat("/sys/module/zfs/features.pool/org.zfsonlinux:allocation_classes", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
stat("/sys/module/zfs/features.pool/com.datto:resilver_defer", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0x5, 0), 0x7ffe6f208270) = 0
brk(0x564e6bc5c000)                     = 0x564e6bc5c000
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 6
fstat(6, {st_mode=S_IFREG|0644, st_size=2995, ...}) = 0
read(6, "# Locale name alias data base.\n#"..., 3072) = 2995
read(6, "", 3072)                       = 0
close(6)                                = 0
openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.UTF-8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en.utf8/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en/LC_MESSAGES/zfs-linux-user.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
fstat(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0x8), ...}) = 0
write(1, "History for 'zpl2':\n", 20History for 'zpl2':
)   = 20
mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f217eed1000
ioctl(3, _IOC(_IOC_NONE, 0x5a, 0xa, 0)

zpool status works fine:

# zpool status
  pool: rpool
 state: ONLINE
  scan: resilvered 48.7M in 0 days 00:00:01 with 0 errors on Wed Nov 27 11:41:49 2019
config:

        NAME                                              STATE     READ WRITE CKSUM
        rpool                                             ONLINE       0     0     0
          mirror-0                                        ONLINE       0     0     0
            scsi-3600508b1001cd542127ea37ea7a8acf9-part3  ONLINE       0     0     0
            scsi-3600508b1001c39b7de4f60b664d58be3-part3  ONLINE       0     0     0

errors: No known data errors

  pool: zpl1
 state: ONLINE
  scan: resilvered 1007M in 0 days 00:00:09 with 0 errors on Tue Nov 26 19:49:33 2019
config:

        NAME        STATE     READ WRITE CKSUM
        zpl1        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda4    ONLINE       0     0     0
            sdb4    ONLINE       0     0     0

errors: No known data errors

  pool: zpl2
 state: ONLINE
  scan: none requested
config:

        NAME                                        STATE     READ WRITE CKSUM
        zpl2                                        ONLINE       0     0     0
          mirror-0                                  ONLINE       0     0     0
            wwn-0x600508b1001cf654d572b375c56c4532  ONLINE       0     0     0
            wwn-0x600508b1001cd152a9b5ce7a105d408f  ONLINE       0     0     0
          mirror-1                                  ONLINE       0     0     0
            wwn-0x600508b1001c1b5698104782c7f381cd  ONLINE       0     0     0
            wwn-0x600508b1001c25c5205c24bf340de661  ONLINE       0     0     0

errors: No known data errors

zdb commands work fine after zpool set cachefile=/etc/zfs/zpool.cache zpl2

# zdb -C zpl2

MOS Configuration:
        version: 5000
        name: 'zpl2'
        state: 0
        txg: 348433
        pool_guid: 8066186125374951052
        errata: 0
        hostid: 319288231
        hostname: 'docker14'
        com.delphix:has_per_vdev_zaps
        vdev_children: 2
        vdev_tree:
            type: 'root'
            id: 0
            guid: 8066186125374951052
            create_txg: 4
            children[0]:
                type: 'mirror'
                id: 0
                guid: 13861948267117057889
                metaslab_array: 144
                metaslab_shift: 33
                ashift: 12
                asize: 1000156430336
                is_log: 0
                create_txg: 4
                com.delphix:vdev_zap_top: 129
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 4319338434976575004
                    path: '/dev/disk/by-id/wwn-0x600508b1001cf654d572b375c56c4532-part1'
                    devid: 'scsi-3600508b1001cf654d572b375c56c4532-part1'
                    phys_path: 'pci-0000:05:00.0-scsi-0:1:0:4'
                    whole_disk: 1
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 130
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 14373262178903517398
                    path: '/dev/disk/by-id/wwn-0x600508b1001cd152a9b5ce7a105d408f-part1'
                    devid: 'scsi-3600508b1001cd152a9b5ce7a105d408f-part1'
                    phys_path: 'pci-0000:05:00.0-scsi-0:1:0:5'
                    whole_disk: 1
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 131
            children[1]:
                type: 'mirror'
                id: 1
                guid: 6843532063015181877
                metaslab_array: 135
                metaslab_shift: 33
                ashift: 12
                asize: 1000156430336
                is_log: 0
                create_txg: 4
                com.delphix:vdev_zap_top: 132
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 15637452578554080026
                    path: '/dev/disk/by-id/wwn-0x600508b1001c1b5698104782c7f381cd-part1'
                    devid: 'scsi-3600508b1001c1b5698104782c7f381cd-part1'
                    phys_path: 'pci-0000:05:00.0-scsi-0:1:0:6'
                    whole_disk: 1
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 133
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 995581400251175409
                    path: '/dev/disk/by-id/wwn-0x600508b1001c25c5205c24bf340de661-part1'
                    devid: 'scsi-3600508b1001c25c5205c24bf340de661-part1'
                    phys_path: 'pci-0000:05:00.0-scsi-0:1:0:7'
                    whole_disk: 1
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 134
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data

# zdb -b zpl2

Traversing all blocks to verify nothing leaked ...

loading concrete vdev 1, metaslab 115 of 116 ...
 267G completed (3891MB/s) estimated time remaining: 0hr 00min 00sec
        No leaks (block sum matches space maps exactly)

        bp count:              35053166
        ganged count:                 0
        bp logical:        292479080960      avg:   8343
        bp physical:       287418612736      avg:   8199     compression:   1.02
        bp allocated:      288112029696      avg:   8219     compression:   1.02
        bp deduped:                   0    ref>1:      0   deduplication:   1.00
        Normal class:      288112029696     used: 14.46%

        additional, non-pointer bps of type 0:        118
        Dittoed blocks on same vdev: 8616

# zdb -h zpl2

History:
2019-11-12.13:54:54 zpool create zpl2 mirror sde sdf mirror sdg sdh
2019-11-12.17:09:49 zfs create -V 209715200k zpl2/vm-100-disk-0
2019-11-14.10:10:52 zpool import -c /etc/zfs/zpool.cache -aN
2019-11-15.13:29:14 zpool import -c /etc/zfs/zpool.cache -aN
2019-11-15.14:37:57 zpool import -c /etc/zfs/zpool.cache -aN
2019-02-14.13:12:06 zpool import -c /etc/zfs/zpool.cache -aN
2019-11-15.15:28:33 zpool import -c /etc/zfs/zpool.cache -aN
2019-11-20.00:03:44 zfs snapshot zpl2/vm-100-disk-0@repos_deletion
2019-11-20.00:16:17 zfs rollback zpl2/vm-100-disk-0@repos_deletion
2019-11-21.14:00:25 zfs create -V 209715200k zpl2/vm-1410-disk-0
2019-11-22.12:43:12 zfs destroy -r zpl2/vm-100-disk-0
2019-11-26.13:59:13 zpool import -d /dev/disk/by-id/ -o cachefile=none zpl2
2019-11-26.15:29:08 zfs snapshot zpl2/vm-1410-disk-0@failed
2019-11-26.15:30:56 zpool set listsnapshots=on zpl2
2019-11-26.15:32:14 zfs clone zpl2/vm-1410-disk-0@failed zpl2/vm-1410-disk-0-clone
2019-11-26.19:33:11 zfs destroy zpl2/vm-1410-disk-0-clone
2019-11-27.12:24:23 zpool import -d /dev/disk/by-id/ -o cachefile=none zpl2
2019-11-29.14:44:12 zpool import -d /dev/disk/by-id/ -o cachefile=none zpl2
2019-12-02.10:14:15 zfs create -V 5242880k zpl2/vm-1414-disk-0
2019-12-02.10:23:46 zpool import -d /dev/disk/by-id/ -o cachefile=none zpl2
2019-12-02.10:26:55 zfs create -V 5242880k zpl2/vm-1414-disk-1
2019-12-02.10:30:24 zfs create -V 31457280k zpl2/vm-1413-disk-0
2019-12-02.10:36:06 zfs recv -ve zpl2
2019-12-02.10:37:50 zfs recv -ve zpl2
2019-12-02.10:38:27 zfs destroy -r zpl2/vm-1413-disk-0
2019-12-02.10:41:56 zfs recv -ve zpl2
2019-12-02.11:19:41 zfs create zpl2/docker
2019-12-02.11:19:52 zfs destroy zpl2/docker
2019-12-02.11:52:18 zfs create zpl2/docker -o mountpoint=/var/lib/docker
2019-12-02.11:53:28 zfs destroy zpl2/vm-1413-disk-0@migrate
2019-12-02.11:53:32 zfs destroy zpl2/vm-1412-disk-0@migrate
2019-12-02.11:53:38 zfs destroy zpl2/vm-1411-disk-0@migrate
2019-12-02.11:53:48 zfs destroy zpl2/vm-1410-disk-0@failed
2019-12-02.11:53:58 zfs snapshot zpl2/vm-1410-disk-0@norm
2019-12-05.12:40:02 zpool import -d /dev/disk/by-id/ -o cachefile=none zpl2
2019-12-05.13:15:39 zfs set atime=off zpl2
2019-12-05.13:21:14 zfs set compression=off zpl2
Fabian-Gruenbichler commented 4 years ago

On December 7, 2019 12:57 pm, glztmf wrote:

System information

Type Version/Name
Distribution Name Debian
Distribution Version 10 (buster)
Linux Kernel 5.0.15-1-pve
Architecture x86_64
ZFS Version 0.8.1-pve1
SPL Version 0.8.1-pve1

Please upgrade to the current version of PVE and report back if you still experience the issue; there have been several important fixes in ZoL 0.8.2!

glztmf commented 4 years ago

@Fabian-Gruenbichler, thanks, I will try ZoL 0.8.2 a little later. For now I'm trying to find an exact way to reproduce the problem so I can test the same way on other versions.

glztmf commented 4 years ago

Same hangs with PVE 6.1-3 and ZFS 0.8.2.

Created 2 VMs and ran the following fio test in both simultaneously:

fio --loops=100 --size=500m --filename=/mnt/fiotest.tmp --stonewall --ioengine=libaio \
  --name=1 --bs=8k --rw=randrw \
  --name=2 --bs=512k --rw=randrw \
  --name=3 --bs=512k --rw=randrw \
  --name=4 --bs=512k --rw=randrw \
  --name=5 --bs=512k --iodepth=32 --rw=randrw \
  --name=6 --bs=512k --iodepth=32 --rw=randrw

After about an hour the VMs and ZFS hang with similar symptoms.

Processes in D state:

# ps -e -o pid,state,start,command | egrep ' D | D< | Dl |STAT'
  506 D 15:13:13 [zvol]
 7014 D 15:19:04 [txg_sync]
19278 D 16:26:38 [zvol]
19279 D 16:26:38 [zvol]
19280 D 16:26:38 [zvol]
19281 D 16:26:38 [zvol]
19282 D 16:26:38 [zvol]
19283 D 16:26:38 [zvol]
19284 D 16:26:38 [zvol]
19285 D 16:26:38 [zvol]
19286 D 16:26:38 [zvol]
19287 D 16:26:38 [zvol]
19288 D 16:26:38 [zvol]
19289 D 16:26:38 [zvol]
19291 D 16:26:38 [zvol]
19293 D 16:26:38 [zvol]
19294 D 16:26:38 [zvol]
19295 D 16:26:38 [zvol]
19296 D 16:26:38 [zvol]
19297 D 16:26:38 [zvol]
19298 D 16:26:38 [zvol]
19300 D 16:26:38 [zvol]
19301 D 16:26:38 [zvol]
19303 D 16:26:38 [zvol]
19304 D 16:26:38 [zvol]
19305 D 16:26:38 [zvol]
19322 D 16:26:38 [zvol]
19348 D 16:26:38 [zvol]
19349 D 16:26:38 [zvol]
19350 D 16:26:38 [zvol]
19351 D 16:26:38 [zvol]
19352 D 16:26:38 [zvol]
19353 D 16:26:38 [zvol]

Inside the VMs' terminals, similar messages:

task sometask:xxx blocked for more than 120 seconds.

And fio I/O drops to zero, though fio itself doesn't hang:

fio-3.7
Starting 6 processes
Jobs: 1 (f=1): [m(1),P(5)][36.6%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 24m:15s]

Also found that a zfs module option doesn't take effect at startup:

# cat /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=10737418240

... because after startup the parameter still has its default value:

# cat /sys/module/zfs/parameters/zfs_arc_max
0

Am I missing something? Or should I open a separate issue?
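A plausible explanation, though it is an assumption not confirmed in this thread: on Debian/PVE the zfs module is typically loaded from the initramfs, so options added to /etc/modprobe.d/zfs.conf only take effect at boot after the initramfs is rebuilt. Until then the parameter can also be set at runtime:

```shell
# Assumption: zfs is loaded from the initramfs, so modprobe.d options
# need a rebuilt initramfs before they apply at boot.
update-initramfs -u -k all

# zfs_arc_max is also tunable at runtime (takes effect immediately):
echo 10737418240 > /sys/module/zfs/parameters/zfs_arc_max
```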

glztmf commented 4 years ago

Same thing with virtual disks using a 4k volblocksize.

  PID STAT  STARTED COMMAND
  510 D<   20:52:21 [zvol]
 3298 S+   12:57:01 grep -E  D | D< | Dl |STAT
19814 D    20:52:42 [txg_sync]
25706 D<   12:50:13 [zvol]
25707 D<   12:50:13 [zvol]
25709 D<   12:50:13 [zvol]
25710 D<   12:50:13 [zvol]
25711 D<   12:50:13 [zvol]
25712 D<   12:50:13 [zvol]
25713 D<   12:50:13 [zvol]
25714 D<   12:50:13 [zvol]
25715 D<   12:50:13 [zvol]
25716 D<   12:50:13 [zvol]
25717 D<   12:50:13 [zvol]
25718 D<   12:50:13 [zvol]
25719 D<   12:50:13 [zvol]
25720 D<   12:50:13 [zvol]
25721 D<   12:50:13 [zvol]
25722 D<   12:50:13 [zvol]
25723 D<   12:50:13 [zvol]
25724 D<   12:50:13 [zvol]
25725 D<   12:50:13 [zvol]
25726 D<   12:50:13 [zvol]
25727 D<   12:50:13 [zvol]
25728 D<   12:50:13 [zvol]
25729 D<   12:50:13 [zvol]
25730 D<   12:50:13 [zvol]
25731 D<   12:50:13 [zvol]
25732 D<   12:50:13 [zvol]
25733 D<   12:50:13 [zvol]
25734 D<   12:50:13 [zvol]
25735 D<   12:50:13 [zvol]
25736 D<   12:50:13 [zvol]
25737 D<   12:50:13 [zvol]

Interesting: there are always 31 new frozen [zvol] processes plus 1 old [zvol] process.
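The 31 + 1 pattern plausibly matches the zvol taskq size: in ZoL 0.8.x the zvol_threads module parameter defaults to 32, so 32 stuck [zvol] worker threads would look exactly like this. That is an assumption worth verifying on the affected host:

```shell
# Print the configured zvol taskq size; falls back to the documented
# 0.8.x default (32) when the zfs module isn't loaded on this machine.
p=/sys/module/zfs/parameters/zvol_threads
if [ -r "$p" ]; then cat "$p"; else echo 32; fi
```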

devZer0 commented 4 years ago

I think this one is related: https://github.com/openzfs/zfs/issues/10095

devZer0 commented 4 years ago

Just out of interest, why has this been closed without further comment/notice?

devZer0 commented 4 years ago

probably related: https://github.com/openzfs/zfs/issues/9172

glztmf commented 4 years ago

probably a duplicate of #9172

Similar errors, but the ZFS versions are quite different, and in #9172 only zvols with compression enabled hang, whereas I have no compression and my hangs also affect management commands for any ZFS volume/dataset. Also, we moved to PVE 5.4-3 with ZFS 0.7.13, and such errors have never occurred on it.

glztmf commented 4 years ago

Just out of interest, why has this been closed without further comment/notice?

It was an accident... from my smartphone :)

stale[bot] commented 3 years ago

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

devZer0 commented 3 years ago

I still think it sucks that tickets get closed automatically because of inactivity.

It's like sweeping dirt under the carpet.