openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

zfs sync hang #7038

Closed ltz3317 closed 2 years ago

ltz3317 commented 6 years ago

System information

Type Version/Name
Distribution Name CentOS 7.2
Distribution Version
Linux Kernel 3.10.0-327.13.1.el7.x86_64
Architecture
ZFS Version v0.7.5-1
SPL Version v0.7.5-1

Describe the problem you're observing

sync hangs, MySQL hangs, and a kworker thread sits at 100% CPU.

Describe how to reproduce the problem

High-frequency create/destroy/clone operations.
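For anyone trying to reproduce: the workload can be approximated with a churn loop along these lines. This is only a sketch under assumptions: the pool name "tank", the "base"/"clone" dataset names, and the iteration count are invented placeholders, and it should only ever be pointed at a scratch pool you can afford to thrash.

```shell
#!/bin/sh
# Hypothetical churn loop approximating the reported workload: rapid
# snapshot/clone/destroy cycles with an fsync-heavy writer in between.
# POOL and the dataset names are placeholders; use a scratch pool only.
POOL=${POOL:-tank}

snap_of()  { echo "$POOL/base@s$1"; }   # snapshot name for iteration $1
clone_of() { echo "$POOL/clone$1"; }    # clone name for iteration $1

churn() {
    i=0
    while [ "$i" -lt "${1:-1000}" ]; do
        zfs snapshot "$(snap_of "$i")"
        zfs clone "$(snap_of "$i")" "$(clone_of "$i")"
        # fsync pressure, mimicking the zil_commit path in the traces
        dd if=/dev/zero of="/$POOL/clone$i/f" bs=1M count=4 conv=fsync 2>/dev/null
        zfs destroy -R "$(snap_of "$i")"    # -R also tears down the clone
        i=$((i + 1))
    done
}

# Only run for real when zfs and the scratch dataset actually exist.
if command -v zfs >/dev/null 2>&1 && zfs list "$POOL/base" >/dev/null 2>&1; then
    churn 1000
fi
```

If the hang is ZIL-related, the conv=fsync writer is what should eventually wedge in zil_commit while the clone churn runs.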

Include any warning/errors/backtraces from the system logs

dmesg:

[  189.990968] Adjusting tsc more than 11% (8039035 vs 7759471)
[ 2522.644734] INFO: task mysqld:6608 blocked for more than 120 seconds.
[ 2522.644790] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2522.644854] mysqld D 0000000000000000 0 6608 5602 0x00000000
[ 2522.644860] ffff880feb913da0 0000000000000086 ffff881006eac500 ffff880feb913fd8
[ 2522.644865] ffff880feb913fd8 ffff880feb913fd8 ffff881006eac500 ffff880fffc8a180
[ 2522.644869] ffff880fffc8a000 ffff880fffc8a188 ffff880fffc8a028 0000000000000000
[ 2522.644873] Call Trace:
[ 2522.644882] [] schedule+0x29/0x70
[ 2522.644906] [] cv_wait_common+0x125/0x150 [spl]
[ 2522.644911] [] ? wake_up_atomic_t+0x30/0x30
[ 2522.644922] [] cv_wait+0x15/0x20 [spl]
[ 2522.644995] [] zil_commit.part.12+0x8b/0x830 [zfs]
[ 2522.645006] [] ? spl_kmem_alloc+0xc7/0x170 [spl]
[ 2522.645017] [] ? tsd_set+0x324/0x500 [spl]
[ 2522.645022] [] ? mutex_lock+0x12/0x2f
[ 2522.645075] [] zil_commit+0x17/0x20 [zfs]
[ 2522.645128] [] zfs_fsync+0x77/0xf0 [zfs]
[ 2522.645179] [] zpl_fsync+0x65/0x90 [zfs]
[ 2522.645186] [] do_fsync+0x65/0xa0
[ 2522.645191] [] SyS_fsync+0x10/0x20
[ 2522.645196] [] system_call_fastpath+0x16/0x1b
[ 2522.645202] INFO: task mysqld:7514 blocked for more than 120 seconds.
[ 2522.645245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2522.645296] mysqld D ffff880fb554bda0 0 7514 5602 0x00000000
[ 2522.645299] ffff880fb554bd30 0000000000000086 ffff880ff4c8f300 ffff880fb554bfd8
[ 2522.645303] ffff880fb554bfd8 ffff880fb554bfd8 ffff880ff4c8f300 ffff880fb554bdb0
[ 2522.645307] ffff88107ff8dec0 0000000000000002 ffffffff811f9420 ffff880fb554bda0
[ 2522.645312] Call Trace:
[ 2522.645316] [] ? unlock_two_nondirectories+0x60/0x60
[ 2522.645321] [] schedule+0x29/0x70
[ 2522.645324] [] inode_wait+0xe/0x20
[ 2522.645328] [] wait_on_bit+0x60/0x90
[ 2522.645335] [] inode_wait_for_writeback+0xaf/0xf0
[ 2522.645339] [] ? wake_atomic_t_function+0x40/0x40
[ 2522.645345] [] inode_wait_for_writeback+0x26/0x40
[ 2522.645348] [] evict+0x95/0x170
[ 2522.645351] [] iput+0xf5/0x180
[ 2522.645357] [] do_unlinkat+0x1ae/0x2b0
[ 2522.645362] [] ? SyS_readlinkat+0xd1/0x140
[ 2522.645367] [] SyS_unlink+0x16/0x20
[ 2522.645371] [] system_call_fastpath+0x16/0x1b
[ 2522.645402] INFO: task sync:2653 blocked for more than 120 seconds.
[ 2522.645443] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2522.645494] sync D ffffffff8120f8c0 0 2653 1455 0x00000000
[ 2522.645498] ffff880ffedd7d50 0000000000000082 ffff880f01449700 ffff880ffedd7fd8
[ 2522.645502] ffff880ffedd7fd8 ffff880ffedd7fd8 ffff880f01449700 ffff880ffedd7e80
[ 2522.645506] ffff880ffedd7e88 7fffffffffffffff ffff880f01449700 ffffffff8120f8c0
[ 2522.645509] Call Trace:
[ 2522.645515] [] ? generic_write_sync+0x60/0x60
[ 2522.645519] [] schedule+0x29/0x70
[ 2522.645523] [] schedule_timeout+0x209/0x2d0
[ 2522.645527] [] ? __queue_work+0x136/0x320
[ 2522.645530] [] ? queue_delayed_work+0xaa/0x1a0
[ 2522.645534] [] ? try_to_grab_pending+0xb1/0x160
[ 2522.645538] [] ? generic_write_sync+0x60/0x60
[ 2522.645543] [] wait_for_completion+0x116/0x170
[ 2522.645548] [] ? wake_up_state+0x20/0x20
[ 2522.645552] [] sync_inodes_sb+0xb7/0x1e0
[ 2522.645557] [] ? generic_write_sync+0x60/0x60
[ 2522.645561] [] sync_inodes_one_sb+0x19/0x20
[ 2522.645565] [] iterate_supers+0xb2/0x110
[ 2522.645570] [] sys_sync+0x44/0xb0
[ 2522.645575] [] system_call_fastpath+0x16/0x1b

[the same three backtraces for mysqld:6608, mysqld:7514, and sync:2653 repeat essentially unchanged at ~2642, ~2762, and ~2882 seconds]

[ 4802.367346] perf interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
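When a log wall like the one above repeats every 120 seconds, it helps to reduce it to a per-task tally before digging into individual traces. A small sketch (the saved-dmesg path is whatever you choose; it also reads stdin):

```shell
# Summarize kernel hung-task warnings: count how often each task:pid
# appears in a saved dmesg, so a repeating wall collapses to a short tally.
hung_tasks() {
    # expects hung-task lines like:
    #   [ 2522.644734] INFO: task mysqld:6608 blocked for more than 120 seconds.
    awk '/INFO: task .* blocked for more than/ {
        for (i = 1; i <= NF; i++)
            if ($i == "task") { print $(i + 1); break }
    }' "$@" | sort | uniq -c | sort -rn
}

# usage: hung_tasks /var/log/dmesg.saved
#    or: dmesg | hung_tasks
```

For the log above this would show mysqld:6608, mysqld:7514, and sync:2653 each blocked repeatedly, which is the quickest way to see that the same three tasks are stuck rather than new ones piling up.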

sync process stack:

[] sync_inodes_sb+0xb7/0x1e0
[] sync_inodes_one_sb+0x19/0x20
[] iterate_supers+0xb2/0x110
[] sys_sync+0x44/0xb0
[] system_call_fastpath+0x16/0x1b
[] 0xffffffffffffffff

kworker process stack:

cat /proc/3445/stack
[] dmu_zfetch+0x455/0x4f0 [zfs]
[] dbuf_read+0x8ea/0x9f0 [zfs]
[] dnode_hold_impl+0xc6/0xc30 [zfs]
[] 0xffffffffffffffff

cat /proc/3445/stack
[] 0xffffffffffffffff

[root@ifcos ~]# cat /proc/3445/stack
[] zfs_zget+0xfc/0x250 [zfs]
[] zfs_get_data+0x57/0x2d0 [zfs]
[] zil_commit.part.12+0x41c/0x830 [zfs]
[] zil_commit+0x17/0x20 [zfs]
[] zpl_writepages+0xd6/0x170 [zfs]
[] do_writepages+0x1e/0x40
[] __writeback_single_inode+0x40/0x220
[] writeback_sb_inodes+0x25e/0x420
[] wb_writeback+0xff/0x2f0
[] bdi_writeback_workfn+0x115/0x460
[] process_one_work+0x17b/0x470
[] worker_thread+0x11b/0x400
[] kthread+0xcf/0xe0
[] ret_from_fork+0x58/0x90
[] 0xffffffffffffffff

cat /proc/3445/stack
[] dmu_zfetch+0x455/0x4f0 [zfs]
[] __dbuf_hold_impl+0x135/0x5a0 [zfs]
[] 0xffffffffffffffff

cat /proc/3445/stack
[] dbuf_find+0x1c9/0x1d0 [zfs]
[] __dbuf_hold_impl+0x42/0x5a0 [zfs]
[] 0xffffffffffffffff

cat /proc/3445/stack
[] dmu_zfetch+0x455/0x4f0 [zfs]
[] dbuf_read+0x756/0x9f0 [zfs]
[] dnode_hold_impl+0xc6/0xc30 [zfs]
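Each one-shot cat of /proc/PID/stack catches a single instant; sampling the busiest kworker repeatedly gives a better picture of where it actually spins. A sketch (finding the pid via ps is an assumption about your environment, and reading the stack needs root):

```shell
# Rough sampler for a kworker pegged at 100% CPU: pick the busiest one
# and dump its kernel stack once per second.
pick_kworker() {
    # stdin: "pid comm" lines sorted busiest-first; print first kworker pid
    awk '$2 ~ /^kworker/ { print $1; exit }'
}

sample_stacks() {
    pid=$(ps -eo pid,comm --sort=-pcpu | pick_kworker)
    [ -n "$pid" ] || { echo "no kworker found"; return 1; }
    n=0
    while [ "$n" -lt "${1:-5}" ]; do
        echo "=== $(date +%T) pid $pid ==="
        cat "/proc/$pid/stack" 2>/dev/null || echo "(need root for /proc/$pid/stack)"
        sleep 1
        n=$((n + 1))
    done
}

# usage (as root):  sample_stacks 10
```

If successive samples keep landing in dmu_zfetch/dbuf_read as above, that points at a spin rather than a momentary blip.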

xTire commented 6 years ago

The same situation with me too.

imathrowback commented 6 years ago

Same issue. No reproduction steps identified, seems "random".

It most commonly appears during long-running disk-intensive tasks (in my case caused by the BURP backup software), but I am getting the same blocked stack traces at generic_write_sync, with a kworker at 100%.

Here is the output from 'echo l > /proc/sysrq-trigger':

[5490997.296870] sending NMI to all CPUs:
[5490997.297924] NMI backtrace for cpu 0
[5490997.297926] CPU: 0 PID: 27539 Comm: kworker/u8:1 Tainted: P OE ------------ 3.10.0-693.el7.x86_64 #1
[5490997.297928] Hardware name: Gigabyte Technology Co., Ltd. G33-DS3R/G33-DS3R, BIOS F4 06/29/2007
[5490997.297929] Workqueue: writeback bdi_writeback_workfn (flush-zfs-1)
[5490997.297931] task: ffff8800c7db3f40 ti: ffff880120aac000 task.ti: ffff880120aac000
[5490997.297932] RIP: 0010:[] [] read_tsc+0xa/0x20
[5490997.297933] RSP: 0018:ffff880120aaf7c8 EFLAGS: 00000246
[5490997.297935] RAX: ffff8800a5cdf0d8 RBX: ffffffff81a0e3c0 RCX: 0000000000000000
[5490997.297936] RDX: 0000000000000058 RSI: ffff8800a5cdf080 RDI: ffffffff81a0e3c0
[5490997.297937] RBP: ffff880120aaf7c8 R08: 000000000053c8f3 R09: 0000000000000000
[5490997.297939] R10: 00000000006db6e0 R11: ffffea0001935000 R12: ffff880120aaf828
[5490997.297940] R13: 000000008aaf12f6 R14: ffff8800a5cdf500 R15: ffff880127d6f3e8
[5490997.297942] FS: 0000000000000000(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[5490997.297943] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[5490997.297944] CR2: 0000000000dde6d0 CR3: 000000005a22f000 CR4: 00000000000007f0
[5490997.297946] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[5490997.297947] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[5490997.297948] Stack:
[5490997.297949] ffff880120aaf7f0 ffffffff810ea6ae 0000000000000005 ffff8800a5cdf080
[5490997.297951] ffff880127d6f3b8 ffff880120aaf868 ffffffffc0b73a7c ffff8800c7db3f40
[5490997.297952] ffff8800c7db3f40 000000000000b048 000000000000b049 ffff880120aaf828
[5490997.297954] Call Trace:
[5490997.297955] [] getrawmonotonic64+0x2e/0xc0
[5490997.297956] [] dmu_zfetch+0x39c/0x4c0 [zfs]
[5490997.297957] [] dbuf_read+0x8e2/0x9e0 [zfs]
[5490997.297959] [] ? spl_kmem_free+0x35/0x40 [spl]
[5490997.297960] [] ? dbuf_hold_impl+0xb9/0xd0 [zfs]
[5490997.297961] [] dnode_hold_impl+0xc6/0xc30 [zfs]
[5490997.297963] [] ? avl_add+0x4a/0x80 [zavl]
[5490997.297964] [] ? mutex_lock+0x12/0x2f
[5490997.297965] [] dnode_hold+0x1b/0x20 [zfs]
[5490997.297967] [] dmu_bonus_hold+0x32/0x1d0 [zfs]
[5490997.297968] [] sa_buf_hold+0xe/0x10 [zfs]
[5490997.297969] [] zfs_zget+0x123/0x250 [zfs]
[5490997.297971] [] zfs_get_data+0x57/0x2d0 [zfs]
[5490997.297972] [] zil_commit.part.12+0x41c/0x830 [zfs]
[5490997.297973] [] zil_commit+0x17/0x20 [zfs]
[5490997.297975] [] zpl_writepages+0xd6/0x170 [zfs]
[5490997.297976] [] do_writepages+0x1e/0x40
[5490997.297977] [] __writeback_single_inode+0x40/0x220
[5490997.297979] [] writeback_sb_inodes+0x1c4/0x490
[5490997.297980] [] wb_writeback+0xff/0x2f0
[5490997.297981] [] ? set_worker_desc+0x86/0xb0
[5490997.297983] [] bdi_writeback_workfn+0x115/0x460
[5490997.297984] [] process_one_work+0x17a/0x440
[5490997.297985] [] worker_thread+0x126/0x3c0
[5490997.297987] [] ? manage_workers.isra.24+0x2a0/0x2a0
[5490997.297988] [] kthread+0xcf/0xe0
[5490997.297989] [] ? insert_kthread_work+0x40/0x40
[5490997.297991] [] ret_from_fork+0x58/0x90
[5490997.297992] [] ? insert_kthread_work+0x40/0x40
[5490997.297994] Code: 00 55 8b 05 39 cc ad 00 48 89 e5 5d c3 0f 1f 40 00 55 b8 ed ff ff ff 48 89 e5 5d c3 0f 1f 44 00 00 55 48 89 e5 66 66 90 0f ae e8 <0f> 31 89 c0 48 c1 e2 20 48 09 c2 48 89 d0 5d c3 66 0f 1f 44 00

and another shortly after

[5490939.586576] SysRq : Show backtrace of all active CPUs
[5490939.587701] sending NMI to all CPUs:
[5490939.588755] NMI backtrace for cpu 0
[5490939.588757] CPU: 0 PID: 27539 Comm: kworker/u8:1 Tainted: P OE ------------ 3.10.0-693.el7.x86_64 #1
[5490939.588759] Hardware name: Gigabyte Technology Co., Ltd. G33-DS3R/G33-DS3R, BIOS F4 06/29/2007
[5490939.588760] Workqueue: writeback bdi_writeback_workfn (flush-zfs-1)
[5490939.588762] task: ffff8800c7db3f40 ti: ffff880120aac000 task.ti: ffff880120aac000
[5490939.588763] RIP: 0010:[] [] getrawmonotonic64+0x7a/0xc0
[5490939.588764] RSP: 0018:ffff880120aaf7d8 EFLAGS: 00000207
[5490939.588766] RAX: 00000000006db6e0 RBX: ffffffff81a0e3c0 RCX: 0000000000000018
[5490939.588767] RDX: 0000000021f8aba3 RSI: 00000000000cc2e8 RDI: 00000000001dc6c8
[5490939.588768] RBP: ffff880120aaf7f0 R08: 000000000053c8b9 R09: 0000000000000000
[5490939.588770] R10: 00000000006db6e0 R11: ffffea0001935000 R12: ffff880120aaf828
[5490939.588771] R13: 000000008aad501e R14: ffff8800a5cdf500 R15: ffff880127d6f3e8
[5490939.588772] FS: 0000000000000000(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
[5490939.588774] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[5490939.588775] CR2: 00007f4379aae8e0 CR3: 00000000b9a26000 CR4: 00000000000007f0
[5490939.588777] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[5490939.588778] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[5490939.588779] Stack:
[5490939.588781] 0000000000000007 0000000000000000 ffff880127d6f3b8 ffff880120aaf868
[5490939.588782] ffffffffc0b73a7c ffff8800c7db3f40 ffff8800c7db3f40 000000000000b048
[5490939.588784] 000000000000b049 ffff880120aaf828 000000000053c8b9 0000000022056e52
[5490939.588785] Call Trace:
[5490939.588786] [] dmu_zfetch+0x39c/0x4c0 [zfs]
[5490939.588787] [] dbuf_read+0x8e2/0x9e0 [zfs]
[5490939.588789] [] ? spl_kmem_free+0x35/0x40 [spl]
[5490939.588790] [] ? dbuf_hold_impl+0xb9/0xd0 [zfs]
[5490939.588791] [] dnode_hold_impl+0xc6/0xc30 [zfs]
[5490939.588793] [] ? avl_add+0x4a/0x80 [zavl]
[5490939.588794] [] ? mutex_lock+0x12/0x2f
[5490939.588795] [] dnode_hold+0x1b/0x20 [zfs]
[5490939.588797] [] dmu_bonus_hold+0x32/0x1d0 [zfs]
[5490939.588798] [] sa_buf_hold+0xe/0x10 [zfs]
[5490939.588799] [] zfs_zget+0x123/0x250 [zfs]
[5490939.588801] [] zfs_get_data+0x57/0x2d0 [zfs]
[5490939.588802] [] zil_commit.part.12+0x41c/0x830 [zfs]
[5490939.588803] [] zil_commit+0x17/0x20 [zfs]
[5490939.588804] [] zpl_writepages+0xd6/0x170 [zfs]
[5490939.588806] [] do_writepages+0x1e/0x40
[5490939.588807] [] __writeback_single_inode+0x40/0x220
[5490939.588809] [] writeback_sb_inodes+0x1c4/0x490
[5490939.588810] [] wb_writeback+0xff/0x2f0
[5490939.588811] [] ? set_worker_desc+0x86/0xb0
[5490939.588813] [] bdi_writeback_workfn+0x115/0x460
[5490939.588814] [] process_one_work+0x17a/0x440
[5490939.588815] [] worker_thread+0x126/0x3c0
[5490939.588817] [] ? manage_workers.isra.24+0x2a0/0x2a0
[5490939.588818] [] kthread+0xcf/0xe0
[5490939.588819] [] ? insert_kthread_work+0x40/0x40
[5490939.588820] [] ret_from_fork+0x58/0x90
[5490939.588822] [] ? insert_kthread_work+0x40/0x40
[5490939.588824] Code: da 00 48 8b 15 58 b1 da 00 44 3b 2d b9 b0 da 00 75 b0 48 29 f0 31 f6 48 21 c7 44 89 d0 48 0f 49 f7 48 0f af f0 4c 01 ce 48 d3 fe <48> 8d 04 16 48 3d ff c9 9a 3b 76 2e 31 d2 0f 1f 84 00 00 00 00

bud4 commented 6 years ago

Same problem here (HP Data Protector backup software; vbda process). While a full backup runs on a snapshot, zfs receive makes no progress. Attached are output files from perf record -a and perf sched. I'm using the deadline scheduler on the block devices to try to work around this problem; I had the same problem with the noop scheduler.

ZFS version: 0.6.5.6-0ubuntu16
Linux kernel: 4.4.0-101-generic #124-Ubuntu (Ubuntu 16.04.3 LTS)

perf_record-a2.data.gz perf_shed_record2.data.gz

 iostat -xd 20
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdd               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sdc               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sdf               0,00     0,00    4,25   15,30   406,50  1061,58   150,19     0,03    1,73    1,08    1,91   0,40   0,78
sde               0,00     0,00    4,35   16,60   419,43   752,77   111,90     0,03    1,30    0,83    1,42   0,29   0,60
sdg               0,00     0,00  319,35    0,00  9188,00     0,00    57,54     1,01    3,16    3,16    0,00   1,71  54,56
sdh               0,00     0,00  423,45    0,00 12698,40     0,00    59,98     0,93    2,20    2,20    0,00   1,29  54,44
sdj               0,00     0,00  371,45    0,00 10857,80     0,00    58,46     1,22    3,28    3,28    0,00   1,69  62,82
sdi               0,00     0,00  515,95    0,00 15066,20     0,00    58,40     1,08    2,10    2,10    0,00   1,16  59,78
sdk               0,00     0,00  295,80    0,00  8717,00     0,00    58,94     0,60    2,05    2,05    0,00   1,39  41,16
sdl               0,00     0,00  338,10    0,00  9420,80     0,00    55,73     0,70    2,06    2,06    0,00   1,35  45,60
sdq               0,00     0,00  226,60    0,00  6327,40     0,00    55,85     0,90    3,98    3,98    0,00   2,39  54,10
sdo               0,00     0,00  482,35    0,00 14504,60     0,00    60,14     1,36    2,81    2,81    0,00   1,30  62,94
sdn               0,00     0,00  430,60    0,00 12728,60     0,00    59,12     2,08    4,82    4,82    0,00   1,85  79,58
sdm               0,00     0,00  414,70    0,00 12191,80     0,00    58,80     1,00    2,42    2,42    0,00   1,28  53,06
sdr               0,00     0,00  256,05    0,00  7346,40     0,00    57,38     0,52    2,04    2,04    0,00   1,48  37,96
sdp               0,00     0,00  438,55    0,00 12945,60     0,00    59,04     0,99    2,25    2,25    0,00   1,26  55,08
zpool status
  pool: pool_z2_samba
 state: ONLINE
  scan: scrub canceled on Fri Dec  1 09:06:55 2017
config:

    NAME        STATE     READ WRITE CKSUM
    pool_z2_samba  ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        sdg     ONLINE       0     0     0
        sdh     ONLINE       0     0     0
        sdi     ONLINE       0     0     0
        sdj     ONLINE       0     0     0
        sdk     ONLINE       0     0     0
        sdl     ONLINE       0     0     0
      raidz2-1  ONLINE       0     0     0
        sdm     ONLINE       0     0     0
        sdn     ONLINE       0     0     0
        sdo     ONLINE       0     0     0
        sdp     ONLINE       0     0     0
        sdq     ONLINE       0     0     0
        sdr     ONLINE       0     0     0
    logs
      mirror-2  ONLINE       0     0     0
        sdc     ONLINE       0     0     0
        sdd     ONLINE       0     0     0
    cache
      sde       ONLINE       0     0     0
      sdf       ONLINE       0     0     0

errors: No known data errors

bud4 commented 6 years ago

The problem occurs very often. I attach more data, from echo 'w' > /proc/sysrq-trigger and perf record -ag: echo_w_sysrq-trigger.gz perf.data.gz

chadjoan commented 6 years ago

Hi all,

I experienced this problem as well (multiple times). Then I found this thread and tried switching from noop scheduler to deadline scheduler and still experienced the problem.

I am running a 4.16.8 Gentoo kernel, with musl libc, and using fairly recent ZFS from master branch (zfs commit 670d74b9cee4dd185a620510bffd4797a2e1066a , spl commit 1149b62d20b7ed9d8ae25d5da7a06213d79b7602).

This tends to happen after doing a lot of compilation. I merged about 1500 packages with the deadline scheduler before it hit. With the noop scheduler it would do about 500-1500 packages before dying. My data set is probably too small for any real comparison, and I didn't write down specifics (the workflow is complicated to begin with...).

Here's my dmesg just after the hang: https://pastebin.com/MayYqGaV

Note that the system has 56 threads to play with and would spend some fraction of its time with a load average around 60; perhaps that is plenty of opportunity for resource contention and deadlocks?

HTH.

vector-of-bool commented 6 years ago

I am also experiencing this issue. I don't have as many helpful logs or diagnostics, but I can also say I am similarly doing a lot of heavy compilation and small IO. I'm also using it as a Docker storage driver, but the lockups will happen with or without Docker. I'm on an older 0.6.5.6-0ubuntu18. A reboot temporarily fixes the issue.

bud4 commented 6 years ago

It happened another time (4.4.0-116-generic #140-Ubuntu, zfs 0.6.5.6-0ubuntu16), while running a full backup: zfsBlock.log.gz

AndCycle commented 6 years ago

I hit a similar issue on Gentoo during emerge --emptytree -j4 @world, which recompiles every package with lots of parallel access and file locking; kworker at 100% CPU usage.

kernel 4.14.39
ZFS 0.7.8
SPL 0.7.8

NMI backtrace https://pastebin.com/rYAhR1ri

imathrowback commented 6 years ago

As a troubleshooting step, I recommend pulling the drives and running some read/write sector tests to make sure there are no controller or bad-sector issues. I've hit this bug on a system where the drive seemed fine apart from the odd ZFS hang, until the drive became completely unresponsive a few months later and failed.
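A nondestructive version of that check might look like the sketch below. The SMARTCTL override is just an illustrative knob, and note that the read-only badblocks pass can take hours per drive:

```shell
# Hypothetical read-only health sweep along the lines suggested above.
# SMARTCTL can be overridden for testing; default is the real smartctl.
check_disk() {
    dev=$1
    : "${SMARTCTL:=smartctl}"
    if ! command -v "$SMARTCTL" >/dev/null 2>&1; then
        echo "smartctl not available; skipping $dev"
        return 0
    fi
    "$SMARTCTL" -H "$dev"        # overall SMART health verdict
    "$SMARTCTL" -t short "$dev"  # queue a short self-test
    badblocks -sv "$dev"         # read-only surface scan (slow)
}

# usage (as root): for d in /dev/sd?; do check_disk "$d"; done
```

A clean sweep does not rule the bug in or out, but it removes the "maybe it's just a dying disk" variable before filing more logs.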

stewartadam commented 6 years ago

Just chiming in to say 'me too'. Definitely seems to be a bug, disks are healthy.

bud4 commented 6 years ago

Again, sync and other processes locked up during zfs receive and a full backup (4.4.0-135-generic #161-Ubuntu): perf.data.gz kernelStateD.log

denizzzka commented 6 years ago

Have this issue too.

Not very big IO. If I use Docker and compile something inside a container, the system hangs within about 12 hours (X11 freezes, the mouse cursor stops moving). The HDD LED blinks a few times immediately afterwards and then no more.

A hardware reset clears the hang, but sometimes a subsequent scrub finds errors, and three times I have lost my pool (import fails with various kernel stack traces).

Also, maybe importantly, I have root on this ZFS pool.

I tried switching hardware to an AMD desktop and to an old Intel server without any change, so I think this is a software issue.

I am on Debian:

Linux localhost 4.14.0-3-amd64 #1 SMP Debian 4.14.17-1 (2018-02-14) x86_64 GNU/Linux
zfs-dkms 0.7.9-3
bud4 commented 5 years ago

Again, sync and other processes locked up during zfs receive and a full backup. Attached: output of echo w > /proc/sysrq-trigger and echo l > /proc/sysrq-trigger. kern.log

bud4 commented 5 years ago

Type Version/Name
Distribution Name Ubuntu
Distribution Version 16.04.5 LTS
Linux Kernel 4.4.0-141-generic
Architecture x86_64
ZFS Version 0.6.5.6-0ubuntu26
SPL Version 0.6.5.6-0ubuntu4

kernel.log.gz

It happened another time. Please help me!

AndCycle commented 5 years ago

@bud4 Currently most developers already have their hands full, so if you want them to take this problem on first, you need to provide a test script that reliably reproduces it, so that they can easily reproduce the problem on their end in an automated test environment.

It would also be preferable if you took a spare machine, upgraded it to 0.7.12, and confirmed that you can still reproduce the issue.

denizzzka commented 5 years ago

@AndCycle This problem is a real pain. If one of us could have provided a reproduction case, this issue would have been closed a long time ago, I believe.

The issue may not appear for 2 months and then begin to appear twice per day.

Perhaps we need to turn on some additional logging and read it out through RS-232? Tell us how, in simple words; we are not all developers here and do not know how to hunt this bug.
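On the RS-232 idea: two common ways to capture the kernel's last words are a serial console or netconsole. The fragments below are illustrative only; every interface, address, MAC, and device name is a placeholder for your own setup.

```shell
# Serial console: add to the kernel command line of the hanging machine,
# then log ttyS0 on another box (e.g. screen /dev/ttyUSB0 115200):
#   console=ttyS0,115200 console=tty0
#
# netconsole: stream printk output over UDP to a second machine.
# Placeholders: eth0, 192.168.1.10 (sender), 192.168.1.20 (receiver).
#   modprobe netconsole netconsole=6666@192.168.1.10/eth0,6666@192.168.1.20/aa:bb:cc:dd:ee:ff
# On the receiving machine:
#   nc -l -u 6666
#
# Either way, raise console log verbosity first so the hung-task
# backtraces actually reach the console:
#   echo 8 > /proc/sys/kernel/printk
```

The advantage over dmesg is that the messages leave the machine before it wedges, so even a hard hang that forces a reset loses nothing.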

bud4 commented 5 years ago

@AndCycle It is very difficult for me to create scripts that reproduce the event, partly because I only have a master and a slave, both in production. The bug often occurs on the slave while the full backup is running and a zfs receive operation is happening at the same time. The HP Data Protector agent reads the data from the mounted snapshots of the various filesystems, with at most 6 simultaneous reader processes. The bug also occurs on the master while there are about 800 smbd connections (many read operations and few write operations, though some of the writes are very intensive). The zpool holds about 150 filesystems with dedup and compression active; the total data is about 60 TB in raidz2 across two vdevs of 6 disks each. Please at least suggest what information to collect the next time the problem occurs. What would be useful? Do you have any scripts I can run?
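Until a reproducer exists, a canned collection script at least makes each occurrence useful to the developers. A sketch, to be run as root while the hang is in progress; the output directory and file names are invented placeholders:

```shell
#!/bin/sh
# Hypothetical evidence grabber for the next hang: dumps dmesg, process
# states, kworker kernel stacks, and triggers the sysrq 'w' backtraces.
outdir() { echo "/var/tmp/zfs-hang-$1"; }

collect() {
    d=$(outdir "$(date +%Y%m%d-%H%M%S)")
    mkdir -p "$d"
    dmesg > "$d/dmesg.txt" 2>/dev/null
    ps -eo pid,state,pcpu,wchan:32,comm > "$d/ps.txt"
    # sysrq 'w' logs backtraces of all blocked (state D) tasks to dmesg
    [ -w /proc/sysrq-trigger ] && echo w > /proc/sysrq-trigger
    for p in $(pgrep kworker); do
        cat "/proc/$p/stack" > "$d/kworker-$p.stack" 2>/dev/null
    done
    # ZFS kstat internals, if present on this version
    cp /proc/spl/kstat/zfs/dmu_tx "$d/" 2>/dev/null
    echo "$d"
}

# usage (as root, during the hang):  collect
```

Running it twice a minute apart is even better, since comparing the two snapshots shows which stacks are truly stuck rather than merely busy.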

bud4 commented 5 years ago

Hopeless. kernel.log.gz

bglod commented 5 years ago

I also have this issue, and I am running zfs on root. I also run several dockers.

When the issue occurs, I can always trace it back to a period of high I/O. The type of disk activity doesn't seem to matter because sometimes it will happen during a routine scrub, before which the disks had been fairly idle for hours. Not all periods of high I/O cause the problem though.

Seemingly random problems like this, to me, point to a hardware issue (likely disk or storage controller), but badblocks and SMART tests have been coming up clean, so I'm not sure what to think at this point. I'll keep an eye on this and try to help out with logs or test data where I can.

camaer commented 5 years ago

Same issue for us on multiple servers with high-frequency create/destroy/clone workloads.

Using Debian 9.7 and ZOL 0.7.12

Here are some logs:

execve("/usr/sbin/update-initramfs", ["update-initramfs", "-u"], [/* 16 vars */]) = 0
brk(NULL)                               = 0x56112fbcd000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f17c42f7000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=65446, ...}) = 0
mmap(NULL, 65446, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f17c42e7000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\4\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1689360, ...}) = 0
mmap(NULL, 3795296, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f17c3d38000
mprotect(0x7f17c3ecd000, 2097152, PROT_NONE) = 0
mmap(0x7f17c40cd000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x195000) = 0x7f17c40cd000
mmap(0x7f17c40d3000, 14688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f17c40d3000
close(3)                                = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f17c42e5000
arch_prctl(ARCH_SET_FS, 0x7f17c42e5700) = 0
mprotect(0x7f17c40cd000, 16384, PROT_READ) = 0
mprotect(0x56112f1dd000, 8192, PROT_READ) = 0
mprotect(0x7f17c42fa000, 4096, PROT_READ) = 0
munmap(0x7f17c42e7000, 65446)           = 0
getpid()                                = 8030
rt_sigaction(SIGCHLD, {sa_handler=0x56112efd3ef0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7f17c3d6b060}, NULL, 8) = 0
geteuid()                               = 0
brk(NULL)                               = 0x56112fbcd000
brk(0x56112fbee000)                     = 0x56112fbee000
getppid()                               = 8028
stat("/root", {st_mode=S_IFDIR|0700, st_size=19, ...}) = 0
stat(".", {st_mode=S_IFDIR|0700, st_size=19, ...}) = 0
open("/usr/sbin/update-initramfs", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10)                   = 10
close(3)                                = 0
fcntl(10, F_SETFD, FD_CLOEXEC)          = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x56112efd3ef0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7f17c3d6b060}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGQUIT, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7f17c3d6b060}, NULL, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7f17c3d6b060}, NULL, 8) = 0
read(10, "#!/bin/sh\n\nSTATEDIR=/var/lib/ini"..., 8192) = 8192
faccessat(AT_FDCWD, "/etc/initramfs-tools/update-initramfs.conf", R_OK) = 0
open("/etc/initramfs-tools/update-initramfs.conf", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10)                   = 11
close(3)                                = 0
fcntl(11, F_SETFD, FD_CLOEXEC)          = 0
read(11, "#\n# Configuration file for updat"..., 8192) = 378
read(11, "", 8192)                      = 0
close(11)                               = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8031
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 8031
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8031, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 8031
read(10, ";\n\tu)\n\t\tupdate\n\t\t;;\nesac\n", 8192) = 25
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8032
close(4)                                = 0
read(3, "4.15.18-10-pve\n4.15.18-9-pve\n4.1"..., 128) = 128
read(3, ".13.16-3-pve\n4.13.16-2-pve\n4.13."..., 128) = 128
read(3, "3.8-3-pve\n4.13.8-2-pve\n4.13.4-1-"..., 128) = 36
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8032, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 36
read(3, "", 128)                        = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 8032
faccessat(AT_FDCWD, "/proc/mounts", R_OK) = 0
stat("/usr/local/sbin/ischroot", 0x7ffe69edd4e0) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/ischroot", 0x7ffe69edd4e0) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/ischroot", 0x7ffe69edd4e0) = -1 ENOENT (No such file or directory)
stat("/usr/bin/ischroot", {st_mode=S_IFREG|0755, st_size=10552, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8035
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 8035
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8035, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 8035
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8036
close(4)                                = 0
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8036, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 8036
stat("/boot/initrd.img-4.15.18-10-pve", {st_mode=S_IFREG|0644, st_size=36184842, ...}) = 0
stat("/usr/local/sbin/sha1sum", 0x7ffe69edd340) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/sha1sum", 0x7ffe69edd340) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/sha1sum", 0x7ffe69edd340) = -1 ENOENT (No such file or directory)
stat("/usr/bin/sha1sum", {st_mode=S_IFREG|0755, st_size=47976, ...}) = 0
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8037
close(4)                                = 0
stat("/usr/local/sbin/diff", 0x7ffe69edd340) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/diff", 0x7ffe69edd340) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/diff", 0x7ffe69edd340)  = -1 ENOENT (No such file or directory)
stat("/usr/bin/diff", {st_mode=S_IFREG|0755, st_size=146824, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8038
close(3)                                = 0
close(-1)                               = -1 EBADF (Bad file descriptor)
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 8037
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8037, si_uid=0, si_status=0, si_utime=9, si_stime=1} ---
rt_sigreturn({mask=[]})                 = 8037
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 8038
faccessat(AT_FDCWD, "/boot/initrd.img-4.15.18-10-pve", R_OK) = 0
faccessat(AT_FDCWD, "/boot/initrd.img-4.15.18-10-pve.dpkg-bak", R_OK) = 0
stat("/usr/local/sbin/rm", 0x7ffe69edd520) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/rm", 0x7ffe69edd520) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/rm", 0x7ffe69edd520)    = -1 ENOENT (No such file or directory)
stat("/usr/bin/rm", 0x7ffe69edd520)     = -1 ENOENT (No such file or directory)
stat("/sbin/rm", 0x7ffe69edd520)        = -1 ENOENT (No such file or directory)
stat("/bin/rm", {st_mode=S_IFREG|0755, st_size=64424, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8039
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 8039
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8039, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 8039
stat("/usr/local/sbin/ln", 0x7ffe69edd540) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/ln", 0x7ffe69edd540) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/ln", 0x7ffe69edd540)    = -1 ENOENT (No such file or directory)
stat("/usr/bin/ln", 0x7ffe69edd540)     = -1 ENOENT (No such file or directory)
stat("/sbin/ln", 0x7ffe69edd540)        = -1 ENOENT (No such file or directory)
stat("/bin/ln", {st_mode=S_IFREG|0755, st_size=56240, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8040
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 8040
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8040, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 8040
write(1, "update-initramfs: Generating /bo"..., 61update-initramfs: Generating /boot/initrd.img-4.15.18-10-pve
) = 61
stat("/usr/local/sbin/mkinitramfs", 0x7ffe69edd580) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/mkinitramfs", 0x7ffe69edd580) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/mkinitramfs", {st_mode=S_IFREG|0755, st_size=10255, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 8041
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 8041
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=8041, si_uid=0, si_status=0, si_utime=1, si_stime=3} ---
rt_sigreturn({mask=[]})                 = 8041
stat("/usr/local/sbin/mv", 0x7ffe69edd540) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/mv", 0x7ffe69edd540) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/mv", 0x7ffe69edd540)    = -1 ENOENT (No such file or directory)
stat("/usr/bin/mv", 0x7ffe69edd540)     = -1 ENOENT (No such file or directory)
stat("/sbin/mv", 0x7ffe69edd540)        = -1 ENOENT (No such file or directory)
stat("/bin/mv", {st_mode=S_IFREG|0755, st_size=126416, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 14543
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 14543
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14543, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 14543
open("/var/lib/initramfs-tools/4.15.18-10-pve", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 11
close(1)                                = 0
fcntl(11, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 14544
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 14544
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=14544, si_uid=0, si_status=0, si_utime=8, si_stime=1} ---
rt_sigreturn({mask=[]})                 = 14544
dup2(11, 1)                             = 1
close(11)                               = 0
stat("/usr/local/sbin/sync", 0x7ffe69edd560) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/sync", 0x7ffe69edd560) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/sync", 0x7ffe69edd560)  = -1 ENOENT (No such file or directory)
stat("/usr/bin/sync", 0x7ffe69edd560)   = -1 ENOENT (No such file or directory)
stat("/sbin/sync", 0x7ffe69edd560)      = -1 ENOENT (No such file or directory)
stat("/bin/sync", {st_mode=S_IFREG|0755, st_size=31496, ...}) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f17c42e59d0) = 14545
wait4(-1,
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=74244120k,nr_inodes=18561030,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=14852988k,mode=755)
rpool/ROOT/pve-1 on / type zfs (rw,relatime,xattr,noacl)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=36,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=4157)
mqueue on /dev/mqueue type mqueue (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
configfs on /sys/kernel/config type configfs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
rpool on /rpool type zfs (rw,noatime,xattr,noacl)
rpool/ROOT on /rpool/ROOT type zfs (rw,noatime,xattr,noacl)
rpool/data on /rpool/data type zfs (rw,noatime,xattr,noacl)
rpool/data/subvol-102-disk-0 on /rpool/data/subvol-102-disk-0 type zfs (rw,noatime,xattr,posixacl)
rpool/data/subvol-110-disk-0 on /rpool/data/subvol-110-disk-0 type zfs (rw,noatime,xattr,posixacl)
rpool/data/subvol-111-disk-1 on /rpool/data/subvol-111-disk-1 type zfs (rw,noatime,xattr,posixacl)
rpool/data/subvol-120-disk-0 on /rpool/data/subvol-120-disk-0 type zfs (rw,noatime,xattr,posixacl)
rpool/data/subvol-134-disk-1 on /rpool/data/subvol-134-disk-1 type zfs (rw,noatime,xattr,posixacl)
rpool/data/subvol-137-disk-1 on /rpool/data/subvol-137-disk-1 type zfs (rw,noatime,xattr,posixacl)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
dev-nas-1:/mnt/Pool-NAS/Backups on /mnt/dev-nas-1 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.104,mountvers=3,mountport=881,mountproto=udp,local_lock=none,addr=192.168.1.104)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
dev-nas-1:/mnt/Pool-NAS/Proxmox on /mnt/pve/dev-nas-1 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.104,mountvers=3,mountport=881,mountproto=udp,local_lock=none,addr=192.168.1.104)
rpool/data/subvol-104-disk-0 on /rpool/data/subvol-104-disk-0 type zfs (rw,noatime,xattr,posixacl)
rpool/data/subvol-101-disk-0 on /rpool/data/subvol-101-disk-0 type zfs (rw,noatime,xattr,posixacl)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=14852984k,mode=700)
rpool/data/subvol-109-disk-0 on /rpool/data/subvol-109-disk-0 type zfs (rw,noatime,xattr,posixacl)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
tmpfs on /run/user/1006 type tmpfs (rw,nosuid,nodev,relatime,size=14852984k,mode=700,uid=1006,gid=1006)
tmpfs on /run/user/1001 type tmpfs (rw,nosuid,nodev,relatime,size=14852984k,mode=700,uid=1001,gid=1001)
[Mon Feb  4 11:08:25 2019] INFO: task kworker/u49:1:30643 blocked for more than 120 seconds.
[Mon Feb  4 11:08:25 2019]       Tainted: P        W IO     4.15.18-8-pve #1
[Mon Feb  4 11:08:25 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Mon Feb  4 11:08:25 2019] kworker/u49:1   D    0 30643      2 0x80000000
[Mon Feb  4 11:08:25 2019] Workqueue: writeback wb_workfn (flush-zfs-9)
[Mon Feb  4 11:08:25 2019] Call Trace:
[Mon Feb  4 11:08:25 2019]  __schedule+0x3e0/0x870
[Mon Feb  4 11:08:25 2019]  schedule+0x36/0x80
[Mon Feb  4 11:08:25 2019]  io_schedule+0x16/0x40
[Mon Feb  4 11:08:25 2019]  __lock_page+0xff/0x140
[Mon Feb  4 11:08:25 2019]  ? page_cache_tree_insert+0xe0/0xe0
[Mon Feb  4 11:08:25 2019]  write_cache_pages+0x2e9/0x4b0
[Mon Feb  4 11:08:25 2019]  ? zpl_write_common_iovec+0xe0/0xe0 [zfs]
[Mon Feb  4 11:08:25 2019]  ? _cond_resched+0x1a/0x50
[Mon Feb  4 11:08:25 2019]  ? mutex_lock+0x12/0x40
[Mon Feb  4 11:08:25 2019]  ? rrw_exit+0x5a/0x150 [zfs]
[Mon Feb  4 11:08:25 2019]  zpl_writepages+0x91/0x170 [zfs]
[Mon Feb  4 11:08:25 2019]  do_writepages+0x1f/0x70
[Mon Feb  4 11:08:25 2019]  __writeback_single_inode+0x45/0x330
[Mon Feb  4 11:08:25 2019]  writeback_sb_inodes+0x266/0x590
[Mon Feb  4 11:08:25 2019]  __writeback_inodes_wb+0x92/0xc0
[Mon Feb  4 11:08:25 2019]  wb_writeback+0x288/0x320
[Mon Feb  4 11:08:25 2019]  wb_workfn+0x1a3/0x440
[Mon Feb  4 11:08:25 2019]  ? wb_workfn+0x1a3/0x440
[Mon Feb  4 11:08:25 2019]  process_one_work+0x1e0/0x400
[Mon Feb  4 11:08:25 2019]  worker_thread+0x4b/0x420
[Mon Feb  4 11:08:25 2019]  kthread+0x105/0x140
[Mon Feb  4 11:08:25 2019]  ? process_one_work+0x400/0x400
[Mon Feb  4 11:08:25 2019]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Feb  4 11:08:25 2019]  ? do_syscall_64+0x73/0x130
[Mon Feb  4 11:08:25 2019]  ? SyS_exit_group+0x14/0x20
[Mon Feb  4 11:08:25 2019]  ret_from_fork+0x35/0x40
NAME              PROPERTY              VALUE                  SOURCE
rpool/ROOT/pve-1  type                  filesystem             -
rpool/ROOT/pve-1  creation              Fri Oct 27 11:46 2017  -
rpool/ROOT/pve-1  used                  7.23G                  -
rpool/ROOT/pve-1  available             1.08T                  -
rpool/ROOT/pve-1  referenced            7.23G                  -
rpool/ROOT/pve-1  compressratio         1.60x                  -
rpool/ROOT/pve-1  mounted               yes                    -
rpool/ROOT/pve-1  quota                 none                   default
rpool/ROOT/pve-1  reservation           none                   default
rpool/ROOT/pve-1  recordsize            128K                   default
rpool/ROOT/pve-1  mountpoint            /                      local
rpool/ROOT/pve-1  sharenfs              off                    default
rpool/ROOT/pve-1  checksum              on                     default
rpool/ROOT/pve-1  compression           on                     inherited from rpool
rpool/ROOT/pve-1  atime                 off                    inherited from rpool
rpool/ROOT/pve-1  devices               on                     default
rpool/ROOT/pve-1  exec                  on                     default
rpool/ROOT/pve-1  setuid                on                     default
rpool/ROOT/pve-1  readonly              off                    default
rpool/ROOT/pve-1  zoned                 off                    default
rpool/ROOT/pve-1  snapdir               hidden                 default
rpool/ROOT/pve-1  aclinherit            restricted             default
rpool/ROOT/pve-1  createtxg             10                     -
rpool/ROOT/pve-1  canmount              on                     default
rpool/ROOT/pve-1  xattr                 on                     default
rpool/ROOT/pve-1  copies                1                      default
rpool/ROOT/pve-1  version               5                      -
rpool/ROOT/pve-1  utf8only              off                    -
rpool/ROOT/pve-1  normalization         none                   -
rpool/ROOT/pve-1  casesensitivity       sensitive              -
rpool/ROOT/pve-1  vscan                 off                    default
rpool/ROOT/pve-1  nbmand                off                    default
rpool/ROOT/pve-1  sharesmb              off                    default
rpool/ROOT/pve-1  refquota              none                   default
rpool/ROOT/pve-1  refreservation        none                   default
rpool/ROOT/pve-1  guid                  15527408840965499781   -
rpool/ROOT/pve-1  primarycache          all                    default
rpool/ROOT/pve-1  secondarycache        all                    default
rpool/ROOT/pve-1  usedbysnapshots       0B                     -
rpool/ROOT/pve-1  usedbydataset         7.23G                  -
rpool/ROOT/pve-1  usedbychildren        0B                     -
rpool/ROOT/pve-1  usedbyrefreservation  0B                     -
rpool/ROOT/pve-1  logbias               latency                default
rpool/ROOT/pve-1  dedup                 off                    default
rpool/ROOT/pve-1  mlslabel              none                   default
rpool/ROOT/pve-1  sync                  standard               inherited from rpool
rpool/ROOT/pve-1  dnodesize             legacy                 default
rpool/ROOT/pve-1  refcompressratio      1.60x                  -
rpool/ROOT/pve-1  written               7.23G                  -
rpool/ROOT/pve-1  logicalused           11.3G                  -
rpool/ROOT/pve-1  logicalreferenced     11.3G                  -
rpool/ROOT/pve-1  volmode               default                default
rpool/ROOT/pve-1  filesystem_limit      none                   default
rpool/ROOT/pve-1  snapshot_limit        none                   default
rpool/ROOT/pve-1  filesystem_count      none                   default
rpool/ROOT/pve-1  snapshot_count        none                   default
rpool/ROOT/pve-1  snapdev               hidden                 default
rpool/ROOT/pve-1  acltype               off                    default
rpool/ROOT/pve-1  context               none                   default
rpool/ROOT/pve-1  fscontext             none                   default
rpool/ROOT/pve-1  defcontext            none                   default
rpool/ROOT/pve-1  rootcontext           none                   default
rpool/ROOT/pve-1  relatime              on                     temporary
rpool/ROOT/pve-1  redundant_metadata    all                    default
rpool/ROOT/pve-1  overlay               off                    default

Let me know if you need anything else.
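For context, the churn that precedes the hang for us is equivalent to a loop like the one below. This is a hypothetical dry-run sketch (the dataset name is invented): by default it only prints the `zfs` commands it would run, so it can be inspected before being pointed at a real pool.

```shell
#!/bin/sh
# Dry-run sketch of the high-frequency snapshot/clone/destroy churn.
# Set RUN=1 (and DATASET to a real dataset) to execute for real;
# otherwise the commands are only echoed.
DATASET=${DATASET:-rpool/data/base}   # hypothetical source dataset
RUN=${RUN:-0}

zfs_cmd() {
    if [ "$RUN" = 1 ]; then zfs "$@"; else echo zfs "$@"; fi
}

i=0
while [ $i -lt 3 ]; do
    snap="$DATASET@stress-$i"
    clone="${DATASET%/*}/stress-clone-$i"
    zfs_cmd snapshot "$snap"
    zfs_cmd clone "$snap" "$clone"
    zfs_cmd destroy "$clone"
    zfs_cmd destroy "$snap"
    i=$((i + 1))
done
```

In production the loop runs far more than three iterations, concurrently on several datasets, while containers on the pool are issuing fsync-heavy I/O.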

CySlider commented 5 years ago

Happens to me too, on my Manjaro 4.19 desktop system with ZoL 0.7.12-11, roughly once every few hours, and it's really annoying. Since my home directories (though not root) are on the pool, a lot of things start hanging. Worst of all, I can't even reboot cleanly: shutdown never manages to unmount everything, so the reboot hangs as well.

[87611.236429] INFO: task txg_sync:1006 blocked for more than 120 seconds.
[87611.236434]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87611.236435] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87611.236437] txg_sync        D    0  1006      2 0x80000080
[87611.236440] Call Trace:
[87611.236448]  ? __schedule+0x29b/0x8b0
[87611.236482]  ? zio_taskq_dispatch+0x6f/0x90 [zfs]
[87611.236484]  schedule+0x32/0x90
[87611.236487]  io_schedule+0x12/0x40
[87611.236493]  cv_wait_common+0xaa/0x130 [spl]
[87611.236497]  ? wait_woken+0x80/0x80
[87611.236524]  zio_wait+0x113/0x1b0 [zfs]
[87611.236548]  dsl_pool_sync+0xb9/0x400 [zfs]
[87611.236576]  spa_sync+0x48b/0xd30 [zfs]
[87611.236604]  txg_sync_thread+0x2c6/0x4a0 [zfs]
[87611.236607]  ? _raw_spin_unlock_irq+0x1d/0x30
[87611.236633]  ? txg_thread_exit.isra.4+0x60/0x60 [zfs]
[87611.236636]  ? __thread_exit+0x20/0x20 [spl]
[87611.236639]  thread_generic_wrapper+0x6f/0x80 [spl]
[87611.236642]  kthread+0x112/0x130
[87611.236644]  ? kthread_park+0x80/0x80
[87611.236646]  ret_from_fork+0x22/0x40
[87611.236779] INFO: task TaskSchedulerFo:9759 blocked for more than 120 seconds.
[87611.236781]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87611.236782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87611.236783] TaskSchedulerFo D    0  9759   9346 0x00000080
[87611.236785] Call Trace:
[87611.236787]  ? __schedule+0x29b/0x8b0
[87611.236789]  schedule+0x32/0x90
[87611.236790]  io_schedule+0x12/0x40
[87611.236794]  cv_wait_common+0xaa/0x130 [spl]
[87611.236796]  ? wait_woken+0x80/0x80
[87611.236823]  zio_wait+0x113/0x1b0 [zfs]
[87611.236849]  zil_commit.part.8+0x85c/0x940 [zfs]
[87611.236875]  zfs_fsync+0x70/0xd0 [zfs]
[87611.236900]  zpl_fsync+0x63/0x90 [zfs]
[87611.236903]  ? __fget+0x6e/0xa0
[87611.236906]  do_fsync+0x38/0x70
[87611.236908]  __x64_sys_fdatasync+0x13/0x20
[87611.236910]  do_syscall_64+0x65/0x180
[87611.236912]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[87611.236914] RIP: 0033:0x7f28924bf73f
[87611.236919] Code: Bad RIP value.
[87611.236920] RSP: 002b:00007f28810473e0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[87611.236922] RAX: ffffffffffffffda RBX: 0000000000000073 RCX: 00007f28924bf73f
[87611.236923] RDX: 0000000000000000 RSI: 000037ea5e9df880 RDI: 0000000000000073
[87611.236924] RBP: 00007f2881047430 R08: 0000000000000000 R09: 00000000f1b86835
[87611.236925] R10: 00000000f1b86835 R11: 0000000000000293 R12: 00007f28924bf700
[87611.236925] R13: 00007f2896b3a3f0 R14: 0000000000000000 R15: 000037ea5e9df8a0
[87734.114347] INFO: task txg_sync:1006 blocked for more than 120 seconds.
[87734.114351]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87734.114352] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87734.114353] txg_sync        D    0  1006      2 0x80000080
[87734.114355] Call Trace:
[87734.114363]  ? __schedule+0x29b/0x8b0
[87734.114392]  ? zio_taskq_dispatch+0x6f/0x90 [zfs]
[87734.114394]  schedule+0x32/0x90
[87734.114396]  io_schedule+0x12/0x40
[87734.114402]  cv_wait_common+0xaa/0x130 [spl]
[87734.114404]  ? wait_woken+0x80/0x80
[87734.114426]  zio_wait+0x113/0x1b0 [zfs]
[87734.114446]  dsl_pool_sync+0xb9/0x400 [zfs]
[87734.114468]  spa_sync+0x48b/0xd30 [zfs]
[87734.114491]  txg_sync_thread+0x2c6/0x4a0 [zfs]
[87734.114493]  ? _raw_spin_unlock_irq+0x1d/0x30
[87734.114514]  ? txg_thread_exit.isra.4+0x60/0x60 [zfs]
[87734.114517]  ? __thread_exit+0x20/0x20 [spl]
[87734.114519]  thread_generic_wrapper+0x6f/0x80 [spl]
[87734.114522]  kthread+0x112/0x130
[87734.114523]  ? kthread_park+0x80/0x80
[87734.114525]  ret_from_fork+0x22/0x40
[87734.114652] INFO: task TaskSchedulerFo:9759 blocked for more than 120 seconds.
[87734.114653]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87734.114654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87734.114655] TaskSchedulerFo D    0  9759   9346 0x00000080
[87734.114656] Call Trace:
[87734.114658]  ? __schedule+0x29b/0x8b0
[87734.114660]  schedule+0x32/0x90
[87734.114661]  io_schedule+0x12/0x40
[87734.114664]  cv_wait_common+0xaa/0x130 [spl]
[87734.114666]  ? wait_woken+0x80/0x80
[87734.114687]  zio_wait+0x113/0x1b0 [zfs]
[87734.114708]  zil_commit.part.8+0x85c/0x940 [zfs]
[87734.114729]  zfs_fsync+0x70/0xd0 [zfs]
[87734.114750]  zpl_fsync+0x63/0x90 [zfs]
[87734.114753]  ? __fget+0x6e/0xa0
[87734.114755]  do_fsync+0x38/0x70
[87734.114756]  __x64_sys_fdatasync+0x13/0x20
[87734.114758]  do_syscall_64+0x65/0x180
[87734.114760]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[87734.114762] RIP: 0033:0x7f28924bf73f
[87734.114766] Code: Bad RIP value.
[87734.114767] RSP: 002b:00007f28810473e0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[87734.114769] RAX: ffffffffffffffda RBX: 0000000000000073 RCX: 00007f28924bf73f
[87734.114769] RDX: 0000000000000000 RSI: 000037ea5e9df880 RDI: 0000000000000073
[87734.114770] RBP: 00007f2881047430 R08: 0000000000000000 R09: 00000000f1b86835
[87734.114771] R10: 00000000f1b86835 R11: 0000000000000293 R12: 00007f28924bf700
[87734.114771] R13: 00007f2896b3a3f0 R14: 0000000000000000 R15: 000037ea5e9df8a0
[87734.114773] INFO: task TaskSchedulerFo:24195 blocked for more than 120 seconds.
[87734.114774]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87734.114775] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87734.114776] TaskSchedulerFo D    0 24195   9346 0x00000080
[87734.114777] Call Trace:
[87734.114779]  ? __schedule+0x29b/0x8b0
[87734.114781]  ? preempt_count_add+0x79/0xb0
[87734.114783]  schedule+0x32/0x90
[87734.114786]  cv_wait_common+0xf6/0x130 [spl]
[87734.114787]  ? wait_woken+0x80/0x80
[87734.114807]  zil_commit.part.8+0x80/0x940 [zfs]
[87734.114809]  ? _raw_spin_lock+0x13/0x40
[87734.114809]  ? _raw_spin_unlock+0x16/0x30
[87734.114812]  ? tsd_set+0x203/0x4f0 [spl]
[87734.114813]  ? preempt_count_add+0x79/0xb0
[87734.114833]  zfs_fsync+0x70/0xd0 [zfs]
[87734.114853]  zpl_fsync+0x63/0x90 [zfs]
[87734.114854]  ? __fget+0x6e/0xa0
[87734.114855]  do_fsync+0x38/0x70
[87734.114857]  __x64_sys_fdatasync+0x13/0x20
[87734.114858]  do_syscall_64+0x65/0x180
[87734.114859]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[87734.114860] RIP: 0033:0x7f28924bf73f
[87734.114861] Code: Bad RIP value.
[87734.114862] RSP: 002b:00007f287f043cf0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[87734.114863] RAX: ffffffffffffffda RBX: 000000000000005d RCX: 00007f28924bf73f
[87734.114864] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000000000005d
[87734.114864] RBP: 00007f287f043d40 R08: 0000000000000000 R09: 0000000000000080
[87734.114865] R10: 0000000000001000 R11: 0000000000000293 R12: 0000000000000000
[87734.114865] R13: d763a120f905d5d9 R14: 0000000000000000 R15: 0000000000001000
[87734.114878] INFO: task CIPCServer::Thr:13852 blocked for more than 120 seconds.
[87734.114879]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87734.114880] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87734.114881] CIPCServer::Thr D    0 13852  13418 0x20020080
[87734.114882] Call Trace:
[87734.114884]  ? __schedule+0x29b/0x8b0
[87734.114885]  ? __switch_to_asm+0x40/0x70
[87734.114886]  ? preempt_count_add+0x79/0xb0
[87734.114887]  schedule+0x32/0x90
[87734.114890]  cv_wait_common+0xf6/0x130 [spl]
[87734.114892]  ? wait_woken+0x80/0x80
[87734.114911]  zil_commit.part.8+0x80/0x940 [zfs]
[87734.114913]  ? _raw_spin_lock+0x13/0x40
[87734.114913]  ? _raw_spin_unlock+0x16/0x30
[87734.114916]  ? tsd_set+0x203/0x4f0 [spl]
[87734.114918]  ? preempt_count_add+0x79/0xb0
[87734.114937]  zfs_fsync+0x70/0xd0 [zfs]
[87734.114957]  zpl_fsync+0x63/0x90 [zfs]
[87734.114958]  ? __fget+0x6e/0xa0
[87734.114959]  do_fsync+0x38/0x70
[87734.114960]  __ia32_sys_fdatasync+0x13/0x20
[87734.114962]  do_fast_syscall_32+0xa7/0x2a0
[87734.114963]  entry_SYSCALL_compat_after_hwframe+0x45/0x4d
[87856.992274] INFO: task txg_sync:1006 blocked for more than 120 seconds.
[87856.992279]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87856.992281] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87856.992284] txg_sync        D    0  1006      2 0x80000080
[87856.992287] Call Trace:
[87856.992297]  ? __schedule+0x29b/0x8b0
[87856.992340]  ? zio_taskq_dispatch+0x6f/0x90 [zfs]
[87856.992343]  schedule+0x32/0x90
[87856.992347]  io_schedule+0x12/0x40
[87856.992354]  cv_wait_common+0xaa/0x130 [spl]
[87856.992358]  ? wait_woken+0x80/0x80
[87856.992396]  zio_wait+0x113/0x1b0 [zfs]
[87856.992431]  dsl_pool_sync+0xb9/0x400 [zfs]
[87856.992473]  spa_sync+0x48b/0xd30 [zfs]
[87856.992513]  txg_sync_thread+0x2c6/0x4a0 [zfs]
[87856.992516]  ? _raw_spin_unlock_irq+0x1d/0x30
[87856.992553]  ? txg_thread_exit.isra.4+0x60/0x60 [zfs]
[87856.992558]  ? __thread_exit+0x20/0x20 [spl]
[87856.992562]  thread_generic_wrapper+0x6f/0x80 [spl]
[87856.992565]  kthread+0x112/0x130
[87856.992567]  ? kthread_park+0x80/0x80
[87856.992570]  ret_from_fork+0x22/0x40
[87856.992714] INFO: task TaskSchedulerFo:9759 blocked for more than 120 seconds.
[87856.992717]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87856.992718] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87856.992720] TaskSchedulerFo D    0  9759   9346 0x00000080
[87856.992722] Call Trace:
[87856.992725]  ? __schedule+0x29b/0x8b0
[87856.992728]  schedule+0x32/0x90
[87856.992730]  io_schedule+0x12/0x40
[87856.992735]  cv_wait_common+0xaa/0x130 [spl]
[87856.992738]  ? wait_woken+0x80/0x80
[87856.992775]  zio_wait+0x113/0x1b0 [zfs]
[87856.992811]  zil_commit.part.8+0x85c/0x940 [zfs]
[87856.992848]  zfs_fsync+0x70/0xd0 [zfs]
[87856.992884]  zpl_fsync+0x63/0x90 [zfs]
[87856.992888]  ? __fget+0x6e/0xa0
[87856.992891]  do_fsync+0x38/0x70
[87856.992893]  __x64_sys_fdatasync+0x13/0x20
[87856.992896]  do_syscall_64+0x65/0x180
[87856.992899]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[87856.992901] RIP: 0033:0x7f28924bf73f
[87856.992907] Code: Bad RIP value.
[87856.992908] RSP: 002b:00007f28810473e0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[87856.992910] RAX: ffffffffffffffda RBX: 0000000000000073 RCX: 00007f28924bf73f
[87856.992911] RDX: 0000000000000000 RSI: 000037ea5e9df880 RDI: 0000000000000073
[87856.992912] RBP: 00007f2881047430 R08: 0000000000000000 R09: 00000000f1b86835
[87856.992913] R10: 00000000f1b86835 R11: 0000000000000293 R12: 00007f28924bf700
[87856.992914] R13: 00007f2896b3a3f0 R14: 0000000000000000 R15: 000037ea5e9df8a0
[87856.992918] INFO: task TaskSchedulerFo:24195 blocked for more than 120 seconds.
[87856.992920]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87856.992921] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87856.992922] TaskSchedulerFo D    0 24195   9346 0x00000080
[87856.992924] Call Trace:
[87856.992927]  ? __schedule+0x29b/0x8b0
[87856.992930]  ? preempt_count_add+0x79/0xb0
[87856.992933]  schedule+0x32/0x90
[87856.992938]  cv_wait_common+0xf6/0x130 [spl]
[87856.992940]  ? wait_woken+0x80/0x80
[87856.992975]  zil_commit.part.8+0x80/0x940 [zfs]
[87856.992978]  ? _raw_spin_lock+0x13/0x40
[87856.992979]  ? _raw_spin_unlock+0x16/0x30
[87856.992984]  ? tsd_set+0x203/0x4f0 [spl]
[87856.992986]  ? preempt_count_add+0x79/0xb0
[87856.993021]  zfs_fsync+0x70/0xd0 [zfs]
[87856.993056]  zpl_fsync+0x63/0x90 [zfs]
[87856.993058]  ? __fget+0x6e/0xa0
[87856.993060]  do_fsync+0x38/0x70
[87856.993062]  __x64_sys_fdatasync+0x13/0x20
[87856.993064]  do_syscall_64+0x65/0x180
[87856.993067]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[87856.993068] RIP: 0033:0x7f28924bf73f
[87856.993071] Code: Bad RIP value.
[87856.993071] RSP: 002b:00007f287f043cf0 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[87856.993073] RAX: ffffffffffffffda RBX: 000000000000005d RCX: 00007f28924bf73f
[87856.993074] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000000000005d
[87856.993075] RBP: 00007f287f043d40 R08: 0000000000000000 R09: 0000000000000080
[87856.993076] R10: 0000000000001000 R11: 0000000000000293 R12: 0000000000000000
[87856.993077] R13: d763a120f905d5d9 R14: 0000000000000000 R15: 0000000000001000
[87856.993092] INFO: task CIPCServer::Thr:13852 blocked for more than 120 seconds.
[87856.993094]       Tainted: P           OE     4.19.23-1-MANJARO #1
[87856.993096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[87856.993097] CIPCServer::Thr D    0 13852  13418 0x20020084
[87856.993099] Call Trace:
[87856.993102]  ? __schedule+0x29b/0x8b0
[87856.993104]  ? __switch_to_asm+0x40/0x70
[87856.993106]  ? preempt_count_add+0x79/0xb0
[87856.993109]  schedule+0x32/0x90
[87856.993113]  cv_wait_common+0xf6/0x130 [spl]
[87856.993116]  ? wait_woken+0x80/0x80
[87856.993150]  zil_commit.part.8+0x80/0x940 [zfs]
[87856.993153]  ? _raw_spin_lock+0x13/0x40
[87856.993154]  ? _raw_spin_unlock+0x16/0x30
[87856.993159]  ? tsd_set+0x203/0x4f0 [spl]
[87856.993161]  ? preempt_count_add+0x79/0xb0
[87856.993196]  zfs_fsync+0x70/0xd0 [zfs]
[87856.993230]  zpl_fsync+0x63/0x90 [zfs]
[87856.993232]  ? __fget+0x6e/0xa0
[87856.993234]  do_fsync+0x38/0x70
[87856.993236]  __ia32_sys_fdatasync+0x13/0x20
[87856.993238]  do_fast_syscall_32+0xa7/0x2a0
[87856.993241]  entry_SYSCALL_compat_after_hwframe+0x45/0x4d 

The process also remains stuck in state D even after waiting far longer:

$ ps aux | grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
...
root      1006  0.0  0.0      0     0 ?        D    Feb23   0:19 [txg_sync] 

My 32 GB of memory is still only 40% used.

The S.M.A.R.T. data for all disks is fine, and the unmount seems to get stuck on different disks each time.

My pool also is fine:


  pool: tank
 state: ONLINE
  scan: resilvered 11,8T in 13h51m with 0 errors on Wed Jan  2 02:00:28 2019
config:

    NAME                                              STATE READ WRITE CKSUM
    tank                                              ONLINE 0     0     0
      raidz2-0                                        ONLINE 0     0     0
        sdx_crypt10                                   ONLINE 0     0     0
        sdx_crypt11                                   ONLINE 0     0     0
        sdx_crypt7                                    ONLINE 0     0     0
        sdx_crypt6                                    ONLINE 0     0     0
    cache
      nvme-Samsung_SSD_960_EVO_500GB_S3EUNX0J215828E  ONLINE 0     0     0

errors: No known data errors

My pool consists of four 10 TB LUKS-encrypted disks, plus one M.2 NVMe drive as cache.

Here is my iostat -mx output:

iostat -mx
Linux 4.19.23-1-MANJARO (I-KNOW-YOU.torgato.de)     25.02.2019 _x86_64_    (16 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           5,86    0,00    1,90    4,02    0,00   88,22

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm  %util
nvme1n1          0,66    3,77      0,02      0,10     0,01 2,92   0,91  43,67    0,33    1,19   0,61    24,97    26,17 136,18  60,30
nvme0n1          0,03    0,99      0,00      0,08     0,00 0,00   0,00   0,00    0,44    2,01   0,60    36,25    81,47 592,53  60,21
sda              0,22    4,78      0,01      0,10     0,00 0,03   0,00   0,71    4,66    2,96   0,00    47,71    21,38 0,43   0,22
sdb              0,22    4,80      0,01      0,10     0,00 0,03   0,00   0,72    4,58    3,01   0,00    47,46    21,32 0,43   0,22
sdc              0,22    4,78      0,01      0,10     0,00 0,03   0,00   0,72    4,63    2,96   0,00    47,53    21,37 0,43   0,22
sdd              0,00    0,00      0,00      0,00     0,00 0,00   0,00   0,00    1,69    0,00   0,00    19,63     0,00 1,65   0,00
sde              0,22    4,80      0,01      0,10     0,00 0,03   0,00   0,71    4,53    2,84   0,00    47,66    21,26 0,43   0,22
dm-0             0,51    4,54      0,01      0,10     0,00 0,00   0,00   0,00   23,68    0,73   0,02    20,92    22,54 2,59   1,31
dm-1             0,51    4,55      0,01      0,10     0,00 0,00   0,00   0,00   22,65    0,72   0,01    20,97    22,43 2,48   1,26
dm-2             0,51    4,55      0,01      0,10     0,00 0,00   0,00   0,00   24,22    0,73   0,02    20,96    22,48 2,65   1,34
dm-3             0,51    4,53      0,01      0,10     0,00 0,00   0,00   0,00   23,64    0,73   0,28    21,04    22,56 20,12  10,14 

Shutdown log:

Feb 23 01:12:39 I-KNOW-YOU.torgato.de systemd[1]: Stopped Cryptography Setup for sdx_crypt10.
Feb 23 01:12:39 I-KNOW-YOU.torgato.de audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-cryptsetup@sdx_crypt10 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Feb 23 01:12:39 I-KNOW-YOU.torgato.de kernel: audit: type=1131 audit(1550880759.194:135): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-cryptsetup@sdx_crypt10 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Feb 23 01:12:39 I-KNOW-YOU.torgato.de systemd-cryptsetup[28103]: device-mapper: remove ioctl on sdx_crypt6  failed: Device or resource busy
Feb 23 01:12:39 I-KNOW-YOU.torgato.de systemd-cryptsetup[28103]: Device sdx_crypt6 is still in use.
Feb 23 01:12:39 I-KNOW-YOU.torgato.de systemd-cryptsetup[28103]: Failed to deactivate: Device or resource busy
Feb 23 01:12:39 I-KNOW-YOU.torgato.de systemd[1]: systemd-cryptsetup@sdx_crypt6.service: Control process exited, code=exited, status=1/FAILURE
Feb 23 01:12:39 I-KNOW-YOU.torgato.de systemd[1]: systemd-cryptsetup@sdx_crypt6.service: Failed with result 'exit-code'.
Feb 23 01:12:39 I-KNOW-YOU.torgato.de systemd[1]: Stopped Cryptography Setup for sdx_crypt6.
Feb 23 01:12:39 I-KNOW-YOU.torgato.de audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-cryptsetup@sdx_crypt6 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Feb 23 01:12:39 I-KNOW-YOU.torgato.de kernel: audit: type=1131 audit(1550880759.217:136): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-cryptsetup@sdx_crypt6 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Condition check resulted in Generate shutdown-ramfs being skipped.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting /tank/Torge...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting Temporary Directory (/tmp)...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28167]: umount: /tank/Torge: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tank-Torge.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28168]: umount: /tmp: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting /tank/Torge.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting Temporary Directory (/tmp).
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting Temporary Directory (/tmp)...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Condition check resulted in Generate shutdown-ramfs being skipped.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting /tank/Torge...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28170]: umount: /tmp: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting Temporary Directory (/tmp).
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28171]: umount: /tank/Torge: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tank-Torge.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting /tank/Torge.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Condition check resulted in Generate shutdown-ramfs being skipped.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting /tank/Torge...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting Temporary Directory (/tmp)...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28174]: umount: /tank/Torge: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tank-Torge.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting /tank/Torge.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28175]: umount: /tmp: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting Temporary Directory (/tmp).
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting /tank/Torge...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Unmounting Temporary Directory (/tmp)...
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Condition check resulted in Generate shutdown-ramfs being skipped.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28177]: umount: /tank/Torge: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tank-Torge.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting /tank/Torge.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de umount[28178]: umount: /tmp: target is busy.
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:06 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting Temporary Directory (/tmp).
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Condition check resulted in Generate shutdown-ramfs being skipped.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Unmounting Temporary Directory (/tmp)...
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Unmounting /tank/Torge...
Feb 23 01:13:07 I-KNOW-YOU.torgato.de umount[28181]: umount: /tmp: target is busy.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting Temporary Directory (/tmp).
Feb 23 01:13:07 I-KNOW-YOU.torgato.de umount[28182]: umount: /tank/Torge: target is busy.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: tank-Torge.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting /tank/Torge.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Condition check resulted in Generate shutdown-ramfs being skipped.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Unmounting /tank/Torge...
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Unmounting Temporary Directory (/tmp)...
Feb 23 01:13:07 I-KNOW-YOU.torgato.de umount[28184]: umount: /tank/Torge: target is busy.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: tank-Torge.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting /tank/Torge.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de umount[28185]: umount: /tmp: target is busy.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting Temporary Directory (/tmp).
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Condition check resulted in Generate shutdown-ramfs being skipped.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Unmounting Temporary Directory (/tmp)...
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Unmounting /tank/Torge...
Feb 23 01:13:07 I-KNOW-YOU.torgato.de umount[28187]: umount: /tmp: target is busy.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting Temporary Directory (/tmp).
Feb 23 01:13:07 I-KNOW-YOU.torgato.de umount[28188]: umount: /tank/Torge: target is busy.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: tank-Torge.mount: Mount process exited, code=exited, status=32/n/a
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Failed unmounting /tank/Torge.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Received SIGINT.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Forcibly rebooting: Ctrl-Alt-Del was pressed more than 7 times within 2s
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd[1]: Shutting down.
Feb 23 01:13:07 I-KNOW-YOU.torgato.de kernel: systemd-shutdow: 49 output lines suppressed due to ratelimiting
Feb 23 01:13:07 I-KNOW-YOU.torgato.de systemd-shutdown[1]: Syncing filesystems and block devices. 
   HERE IT HANGS

Smart data sdb/sdx_crypt6

[I-KNOW-YOU log]# smartctl -a /dev/sdb
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.19.23-1-MANJARO] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate IronWolf
Device Model:     ST10000VN0004-1ZD101
Serial Number:    ZA20SV9S
LU WWN Device Id: 5 000c50 09396796e
Firmware Version: SC60
User Capacity:    10.000.831.348.736 bytes [10,0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Feb 25 00:35:55 2019 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (  584) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 881) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:            (0x50bd)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   083   064   044    Pre-fail Always       -       190984434
  3 Spin_Up_Time            0x0003   088   086   000    Pre-fail Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age Always       -       1689
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail Always       -       0
  7 Seek_Error_Rate         0x000f   091   060   045    Pre-fail Always       -       1383480407
  9 Power_On_Hours          0x0032   088   088   000    Old_age Always       -       10784 (194 247 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age Always       -       1028
184 End-to-End_Error        0x0032   100   100   099    Old_age Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age Always       -       0
188 Command_Timeout         0x0032   100   096   000    Old_age Always       -       2851901801115
189 High_Fly_Writes         0x003a   085   085   000    Old_age Always       -       15
190 Airflow_Temperature_Cel 0x0022   067   045   040    Old_age Always       -       33 (Min/Max 18/39)
191 G-Sense_Error_Rate      0x0032   099   099   000    Old_age Always       -       3678
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age Always       -       76
193 Load_Cycle_Count        0x0032   082   082   000    Old_age Always       -       37441
194 Temperature_Celsius     0x0022   033   055   000    Old_age Always       -       33 (0 18 0 0 0)
195 Hardware_ECC_Recovered  0x001a   011   001   000    Old_age Always       -       190984434
197 Current_Pending_Sector  0x0012   100   100   000    Old_age Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age Always       -       0
200 Multi_Zone_Error_Rate   0x0023   100   100   001    Pre-fail Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age Offline      -       5122 (129 181 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age Offline      -       56499736146
242 Total_LBAs_Read         0x0000   100   253   000    Old_age Offline      -       254319596354

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay. 

Please, this is so damn annoying. What data can I collect to help get to the bottom of this? And why is the kernel not even able to reboot?

CySlider commented 5 years ago

Jonathon from the Manjaro team had me check the I/O scheduler.

I did, and it turned out that while ZFS had set the I/O scheduler of the dm cryptsetup devices to none (I'm using LUKS-encrypted disks), it did not manage to change the scheduler of the underlying physical sdX disks to none as well; they were set to mq-deadline instead.

I have now changed all of them to none too, and so far the problem has not reappeared. It is far from long enough to be sure, but a notable amount of time has already passed and my hopes are up.

So if you have this issue and you use LUKS devices, check that the I/O scheduler is set to none on both layers, and change it if it is not.

Update: it can also be called "noop" on your system.
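The two-layer check described above can be scripted. Below is a minimal sketch (assumptions: the dm-crypt mappings appear as /sys/block/dm-*, their backing disks are listed under slaves/, and the active scheduler is the bracketed entry in queue/scheduler):

```shell
#!/bin/sh
# Print the active I/O scheduler for each dm device and for the
# physical disks underneath it. Writing to queue/scheduler needs root.
active_sched() {
    # "[none] mq-deadline kyber" -> "none"
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

for dm in /sys/block/dm-*; do
    [ -e "$dm/queue/scheduler" ] || continue
    echo "$(basename "$dm"): $(active_sched "$(cat "$dm/queue/scheduler")")"
    # slaves/ lists the underlying physical device(s) of a dm device
    for slave in "$dm"/slaves/*; do
        [ -e "$slave" ] || continue
        s=$(basename "$slave")
        echo "  $s: $(active_sched "$(cat "/sys/block/$s/queue/scheduler")")"
        # To switch the backing disk as well (as root):
        # echo none > "/sys/block/$s/queue/scheduler"
    done
done
```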

denizzzka commented 5 years ago

(I'm using luks encrypted disks)

Hmmm, me too! Is there anybody here who is NOT using LUKS but is still facing this issue?

@CySlider you use ZFS pool on LUKS or LUKS vdev on ZFS?

denizzzka commented 5 years ago

check that the io scheduler is set to none

The scheduler for HDDs here is called "noop", not "none". Maybe this is important.

(So I changed it to "noop" and am keeping my fingers crossed.)

CySlider commented 5 years ago

On my system it is "none", but you are right, it might also be called "noop". You can check the available modes with: cat /sys/block/THE_DEVICE/queue/scheduler

bglod commented 5 years ago

@denizzzka I am NOT using LUKS, but I am using ZFS native encryption.

The scheduler for all (12) devices is set to mq-deadline. I will set these to none and see what happens!

CySlider commented 5 years ago

If you followed my guide to persist the changes after reboot: I had an error in it and have corrected the post, so make sure you did not copy the mistake.
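One common way to persist such a scheduler setting across reboots is a udev rule (a generic sketch of the usual approach; this is not necessarily the guide referenced above, and the file name and device match are illustrative):

```
# /etc/udev/rules.d/60-ioscheduler.rules -- hypothetical example:
# force "none" on whole SATA disks as they appear
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="none"
```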

CySlider commented 5 years ago

(I'm using luks encrypted disks)

Hmmm, me too! Is anybody here who NOT used LUKS but faced with this issue?

@CySlider you use ZFS pool on LUKS or LUKS vdev on ZFS?

I nearly missed your edit.

I am not sure which of your options it is. I use LUKS transparently beneath ZFS: I set up each whole drive individually with LUKS, open them decrypted, and only after that did I add them as devices to ZFS. I hope this answers your question.

denizzzka commented 5 years ago

I started adding a new partition into the rpool mirror.

While it was resilvering, I added a new pool and started transferring gigabyte-sized files to it from rpool. Then this happened: https://gist.github.com/denizzzka/9fb23dbeb6db3ba6074eed04d6da47bb

Probably everything would hang if I were using rpool too, but rpool is used only for reads and is not affected.

CySlider commented 5 years ago

Be aware that the message itself is not an indicator of something going wrong. It could also just mean your disks were really busy for two minutes. It is only a problem if things get totally stuck, as they did in my case.

denizzzka commented 5 years ago

Be aware that the message itself is not an indicator of something going wrong.

After that, the copying process just stalls...

denizzzka commented 5 years ago

Still hangs with noop queue scheduler for /dev/sd[a-z]

bglod commented 5 years ago

Still hangs with noop queue scheduler for /dev/sd[a-z]

Same result here. I switched it over to bfq to see what happens.

behlendorf commented 5 years ago

The hangs reported here are very similar to ones we've seen reported in the past. They're caused by an I/O which is submitted to the kernel but, for some as-yet-unknown reason, neither completes, fails, nor times out. This results in the observed hang. Since the issue can't be reproduced easily, we have not been able to determine which component in the stack is responsible (ZFS, LUKS, the SCSI layer/driver, or hardware).

In order to better understand how this is possible, we've added diagnostic and optional recovery code. These changes are not in the 0.7.13 release, but they are in the current 0.8.0-rc releases.

@bglod since you're running ZFS's native encryption you should also have this debugging. By default it's largely disabled, but if a hung I/O is detected a new deadman event will be posted. If you encounter this issue again, it would help our analysis if you could check zpool events -v for such an event.

If you'd like to enable the recovery logic, which attempts to retry the hung I/O, you'll need to set the zfs_deadman_failmode module parameter to continue. This should allow the system to continue operating once the zfs_deadman_ziotime_ms threshold (300s by default) is reached.

       zfs_deadman_failmode (charp)
                   Controls the failure  behavior  when  the  deadman  detects  a
                   "hung" I/O.  Valid values are wait, continue, and panic.

                   wait - Wait for a "hung" I/O to complete.  For each "hung" I/O
                   a "deadman" event will be posted describing that I/O.

                   continue - Attempt to recover from a  "hung"  I/O  by  re-dis‐
                   patching it to the I/O pipeline if possible.

                   panic  -  Panic the system.  This can be used to facilitate an
                   automatic fail-over to a properly configured  fail-over  partner.

                   Default value: wait.
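To make this setting persist, complementing a runtime write to /sys/module/zfs/parameters/zfs_deadman_failmode, the usual mechanism is a modprobe option file. A sketch (parameter name taken from the man-page excerpt above; the file path is the common convention, not something stated in this thread):

```
# /etc/modprobe.d/zfs.conf -- sets the deadman failure mode at module load
options zfs zfs_deadman_failmode=continue
```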

Furthermore, we want to identify the root cause for this issue so if anyone is running an 0.8.0-rc and is willing to provide additional debug logs please let me know. There are a few more debugging options we can try enabling to get more detailed information about the outstanding I/Os.

bglod commented 5 years ago

I did not see a hang when the scheduler was set to bfq for the last 10 days or so. This doesn't mean bfq is the way to go, of course. zpool events did not show any deadman events, but it seems the event log only goes back as far as the last reboot. Because I hadn't experienced another hang, it's understandable there were no deadman entries.

I upgraded my zfs from becdcec7b9cd6b8beaa2273495b97c06de24e9e5 to 060f0226e6396a3c7104fedc8d2af7063a27c1f9 (latest for Arch's zfs-linux-git) and set the zfs_deadman_failmode=continue module option. I also changed the scheduler back to the default of mq-deadline.

I set up a script to periodically check for deadman events and e-mail me if any are found. I will report back with any findings.
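Such a watcher might look like the sketch below; the event class name ereport.fs.zfs.deadman, the zpool events -H scripted mode, and the mail command are assumptions on my part, not details bglod posted:

```shell
#!/bin/sh
# Count deadman ereports in `zpool events` output read from stdin.
count_deadman() {
    grep -c 'ereport\.fs\.zfs\.deadman'
}

# Intended to run periodically from cron; requires zpool and a
# working local mailer (both assumptions).
check_pool() {
    n=$(zpool events -H 2>/dev/null | count_deadman)
    if [ "${n:-0}" -gt 0 ]; then
        echo "deadman events seen: $n" | mail -s "zfs deadman alert" root
    fi
}
```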

Update: still locks up. Running zpool events -v just hangs forever.

gerardba commented 5 years ago

Hi, after upgrading to 0.8 I'm seeing a similar ZFS hang that I'm able to reproduce with mdtest over NFS, as detailed in https://github.com/zfsonlinux/zfs/issues/7473

I provided further logs in the issue above but let me know if there's anything else I can provide.

Once the system got stuck I ran:

# echo continue > /sys/module/zfs/parameters/zfs_deadman_failmode

But it remains stuck and no events have been recorded. In this case txg_sync is in D state, but there is no process consuming CPU on the node. Perhaps I had to set zfs_deadman_failmode before ZFS got stuck? Should I set zio_deadman_log_all to 1?

My current module parameters:
# arc_summary ------------------------------------------------------------------------ ZFS Subsystem Report Fri Mar 29 12:09:35 2019 ARC Summary: (HEALTHY) Memory Throttle Count: 0 ARC Misc: Deleted: 2.77M Mutex Misses: 1.95k Evict Skips: 583 ARC Size: 52.46% 237.05 GiB Target Size: (Adaptive) 52.59% 237.63 GiB Min Size (Hard Limit): 0.22% 1.00 GiB Max Size (High Water): 451:1 451.88 GiB ARC Size Breakdown: Recently Used Cache Size: 16.31% 33.56 GiB Frequently Used Cache Size: 83.69% 172.25 GiB ARC Hash Breakdown: Elements Max: 3.62M Elements Current: 91.42% 3.31M Collisions: 425.56k Chain Max: 3 Chains: 79.41k ARC Total accesses: 1.74G Cache Hit Ratio: 99.93% 1.74G Cache Miss Ratio: 0.07% 1.26M Actual Hit Ratio: 99.92% 1.74G Data Demand Efficiency: 99.46% 49.98M CACHE HITS BY CACHE LIST: Most Recently Used: 5.55% 96.45M Most Frequently Used: 94.44% 1.64G Most Recently Used Ghost: 0.06% 1.04M Most Frequently Used Ghost: 0.00% 35.40k CACHE HITS BY DATA TYPE: Demand Data: 2.86% 49.71M Prefetch Data: 0.00% 0 Demand Metadata: 97.11% 1.69G Prefetch Metadata: 0.02% 425.38k CACHE MISSES BY DATA TYPE: Demand Data: 21.32% 268.55k Prefetch Data: 0.00% 0 Demand Metadata: 77.29% 973.76k Prefetch Metadata: 1.39% 17.57k ZFS Tunables: dbuf_cache_hiwater_pct 10 dbuf_cache_lowater_pct 10 dbuf_cache_max_bytes 8423723520 dbuf_cache_shift 5 dbuf_metadata_cache_max_bytes 4211861760 dbuf_metadata_cache_shift 6 dmu_object_alloc_chunk_shift 7 ignore_hole_birth 1 l2arc_feed_again 1 l2arc_feed_min_ms 200 l2arc_feed_secs 1 l2arc_headroom 2 l2arc_headroom_boost 200 l2arc_noprefetch 1 l2arc_norw 0 l2arc_write_boost 8388608 l2arc_write_max 8388608 metaslab_aliquot 524288 metaslab_bias_enabled 1 metaslab_debug_load 0 metaslab_debug_unload 0 metaslab_force_ganging 16777217 metaslab_fragmentation_factor_enabled 1 metaslab_lba_weighting_enabled 1 metaslab_preload_enabled 1 send_holes_without_birth_time 1 spa_asize_inflation 24 spa_config_path /etc/zfs/zpool.cache spa_load_print_vdev_tree 0 
spa_load_verify_data 1 spa_load_verify_maxinflight 10000 spa_load_verify_metadata 1 spa_slop_shift 5 vdev_removal_max_span 32768 vdev_validate_skip 0 zfetch_array_rd_sz 1048576 zfetch_max_distance 8388608 zfetch_max_streams 8 zfetch_min_sec_reap 2 zfs_abd_scatter_enabled 1 zfs_abd_scatter_max_order 10 zfs_abd_scatter_min_size 1536 zfs_admin_snapshot 0 zfs_arc_average_blocksize 8192 zfs_arc_dnode_limit 0 zfs_arc_dnode_limit_percent 10 zfs_arc_dnode_reduce_percent 10 zfs_arc_grow_retry 0 zfs_arc_lotsfree_percent 10 zfs_arc_max 485206474752 zfs_arc_meta_adjust_restarts 4096 zfs_arc_meta_limit 0 zfs_arc_meta_limit_percent 75 zfs_arc_meta_min 0 zfs_arc_meta_prune 10000 zfs_arc_meta_strategy 1 zfs_arc_min 1073741824 zfs_arc_min_prefetch_ms 0 zfs_arc_min_prescient_prefetch_ms 0 zfs_arc_p_dampener_disable 1 zfs_arc_p_min_shift 0 zfs_arc_pc_percent 0 zfs_arc_shrink_shift 0 zfs_arc_sys_free 0 zfs_async_block_max_blocks 100000 zfs_autoimport_disable 1 zfs_checksum_events_per_second 20 zfs_commit_timeout_pct 5 zfs_compressed_arc_enabled 1 zfs_condense_indirect_commit_entry_delay_ms 0 zfs_condense_indirect_vdevs_enable 1 zfs_condense_max_obsolete_bytes 1073741824 zfs_condense_min_mapping_bytes 131072 zfs_dbgmsg_enable 1 zfs_dbgmsg_maxsize 4194304 zfs_dbuf_state_index 0 zfs_ddt_data_is_special 1 zfs_deadman_checktime_ms 60000 zfs_deadman_enabled 1 zfs_deadman_failmode continue zfs_deadman_synctime_ms 600000 zfs_deadman_ziotime_ms 300000 zfs_dedup_prefetch 0 zfs_delay_min_dirty_percent 60 zfs_delay_scale 500000 zfs_delete_blocks 20480 zfs_dirty_data_max 4294967296 zfs_dirty_data_max_max 4294967296 zfs_dirty_data_max_max_percent 25 zfs_dirty_data_max_percent 10 zfs_dirty_data_sync_percent 20 zfs_disable_ivset_guid_check 0 zfs_dmu_offset_next_sync 0 zfs_expire_snapshot 300 zfs_flags 0 zfs_free_bpobj_enabled 1 zfs_free_leak_on_eio 0 zfs_free_min_time_ms 1000 zfs_immediate_write_sz 32768 zfs_initialize_value 16045690984833335022 zfs_key_max_salt_uses 400000000 zfs_lua_max_instrlimit 
100000000 zfs_lua_max_memlimit 104857600 zfs_max_missing_tvds 0 zfs_max_recordsize 1048576 zfs_metaslab_fragmentation_threshold 70 zfs_metaslab_segment_weight_enabled 1 zfs_metaslab_switch_threshold 2 zfs_mg_fragmentation_threshold 85 zfs_mg_noalloc_threshold 0 zfs_multihost_fail_intervals 10 zfs_multihost_history 0 zfs_multihost_import_intervals 20 zfs_multihost_interval 2000 zfs_multilist_num_sublists 0 zfs_no_scrub_io 0 zfs_no_scrub_prefetch 0 zfs_nocacheflush 0 zfs_nopwrite_enabled 1 zfs_object_mutex_size 64 zfs_obsolete_min_time_ms 500 zfs_override_estimate_recordsize 0 zfs_pd_bytes_max 52428800 zfs_per_txg_dirty_frees_percent 5 zfs_prefetch_disable 1 zfs_read_chunk_size 1048576 zfs_read_history 0 zfs_read_history_hits 0 zfs_reconstruct_indirect_combinations_max 4096 zfs_recover 0 zfs_recv_queue_length 16777216 zfs_removal_ignore_errors 0 zfs_removal_suspend_progress 0 zfs_remove_max_segment 16777216 zfs_resilver_disable_defer 0 zfs_resilver_min_time_ms 3000 zfs_scan_checkpoint_intval 7200 zfs_scan_fill_weight 3 zfs_scan_ignore_errors 0 zfs_scan_issue_strategy 0 zfs_scan_legacy 0 zfs_scan_max_ext_gap 2097152 zfs_scan_mem_lim_fact 20 zfs_scan_mem_lim_soft_fact 20 zfs_scan_strict_mem_lim 0 zfs_scan_suspend_progress 0 zfs_scan_vdev_limit 4194304 zfs_scrub_min_time_ms 1000 zfs_send_corrupt_data 0 zfs_send_queue_length 16777216 zfs_slow_io_events_per_second 20 zfs_spa_discard_memory_limit 16777216 zfs_sync_pass_deferred_free 2 zfs_sync_pass_dont_compress 5 zfs_sync_pass_rewrite 2 zfs_sync_taskq_batch_pct 75 zfs_txg_history 100 zfs_txg_timeout 5 zfs_unlink_suspend_progress 0 zfs_user_indirect_is_special 1 zfs_vdev_aggregation_limit 16777216 zfs_vdev_aggregation_limit_non_rotating 131072 zfs_vdev_async_read_max_active 3 zfs_vdev_async_read_min_active 1 zfs_vdev_async_write_active_max_dirty_percent 60 zfs_vdev_async_write_active_min_dirty_percent 30 zfs_vdev_async_write_max_active 10 zfs_vdev_async_write_min_active 2 zfs_vdev_cache_bshift 16 zfs_vdev_cache_max 16384 
zfs_vdev_cache_size 0 zfs_vdev_default_ms_count 200 zfs_vdev_initializing_max_active 1 zfs_vdev_initializing_min_active 1 zfs_vdev_max_active 1000 zfs_vdev_min_ms_count 16 zfs_vdev_mirror_non_rotating_inc 0 zfs_vdev_mirror_non_rotating_seek_inc 1 zfs_vdev_mirror_rotating_inc 0 zfs_vdev_mirror_rotating_seek_inc 5 zfs_vdev_mirror_rotating_seek_offset 1048576 zfs_vdev_ms_count_limit 131072 zfs_vdev_queue_depth_pct 1000 zfs_vdev_raidz_impl [fastest] original scalar sse2 ssse3 avx2 avx512f avx512bw zfs_vdev_read_gap_limit 32768 zfs_vdev_removal_max_active 2 zfs_vdev_removal_min_active 1 zfs_vdev_scheduler deadline zfs_vdev_scrub_max_active 2 zfs_vdev_scrub_min_active 1 zfs_vdev_sync_read_max_active 10 zfs_vdev_sync_read_min_active 10 zfs_vdev_sync_write_max_active 10 zfs_vdev_sync_write_min_active 10 zfs_vdev_write_gap_limit 4096 zfs_zevent_cols 80 zfs_zevent_console 0 zfs_zevent_len_max 1152 zfs_zil_clean_taskq_maxalloc 1048576 zfs_zil_clean_taskq_minalloc 1024 zfs_zil_clean_taskq_nthr_pct 100 zil_nocacheflush 0 zil_replay_disable 0 zil_slog_bulk 786432 zio_deadman_log_all 0 zio_dva_throttle_enabled 1 zio_requeue_io_start_cut_in_line 1 zio_slow_io_ms 30000 zio_taskq_batch_pct 75 zvol_inhibit_dev 0 zvol_major 230 zvol_max_discard_blocks 16384 zvol_prefetch_bytes 131072 zvol_request_sync 0 zvol_threads 32 zvol_volmode 1
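For reference, a "name value" dump like the one above can be produced by reading the module parameter files under /sys/module/zfs/parameters. A minimal POSIX-shell sketch (the dump_params helper name is mine, not part of ZFS):

```shell
#!/bin/sh
# Print "name value" pairs for every file in a parameters directory.
# On a live system with the zfs module loaded, call it as:
#   dump_params /sys/module/zfs/parameters
dump_params() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "${f##*/}" "$(cat "$f")"
    done
}
```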
denizzzka commented 5 years ago

Can this issue be related with latest kernel issue "Incorrect disk IO caused by blk-mq direct issue can lead to file system corruption"? https://bugzilla.kernel.org/show_bug.cgi?id=201685

gerardba commented 5 years ago

Can this issue be related with latest kernel issue "Incorrect disk IO caused by blk-mq direct issue can lead to file system corruption"? https://bugzilla.kernel.org/show_bug.cgi?id=201685

I don't think so, I'm using 'none' for all the (nvme) devices in this zpool:

# cat /sys/block/nvme*/queue/scheduler
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
[none] mq-deadline kyber
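In these listings the bracketed entry is the scheduler currently active for the device. A tiny sketch that extracts it from one such line (the active_sched function name is mine):

```shell
#!/bin/sh
# Given one line of /sys/block/<dev>/queue/scheduler output,
# print the active scheduler (the bracketed entry).
active_sched() {
    printf '%s\n' "$1" | grep -o '\[[^]]*\]' | tr -d '[]'
}

# e.g. on a live system:
#   active_sched "$(cat /sys/block/nvme0n1/queue/scheduler)"
```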
gerardba commented 5 years ago

I managed to reproduce the stuck on 0.8.0-rc3_126_g5dbf8b4ed, now with zpool events. Details in https://github.com/zfsonlinux/zfs/issues/7473

denizzzka commented 5 years ago

If a reproduction is still needed, here is how I trigger it:

  1. Clone/download and build this nice compressor program https://github.com/byronknoll/cmix
  2. cd to a directory inside the pool and run cmix on some 10-100 MB file: ~/Dev/cmix/cmix -c 100Mb_file.bin result.cmix
  3. After a few minutes the system hangs

Cmix consumes a lot of RAM; maybe that is important.


CySlider commented 5 years ago

Any updates? My issue suddenly came back after being gone for six weeks. Really annoying.

Orfheo commented 5 years ago

New "ZFS sync hang" event, #2-2019 in five months:i

kernel 4.14.105-gentoo-r1 #1 SMP (elevator=deadline)

SPL: Loaded module v0.7.12-r0-gentoo ZFS: Loaded module v0.7.12-r0-gentoo, ZFS pool version 5000, ZFS filesystem version 5

3631 root 20 0 0 0 0 R 100.0 0.0 793:51.76 kworker/u32:4

It hung AGAIN, hogging 100% CPU.

Again while bacula was updating its postgres database, a 12Gb db on a zfs filesystem.

Any sync command hung, and cat /proc/3631/stack showed only: [] 0xffffffffffffffff
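Since a hung sync apparently never returns, a small watchdog sketch can flag this state before the whole system wedges (the 60-second limit is an arbitrary choice of mine, and timeout is GNU coreutils):

```shell
#!/bin/sh
# If sync takes longer than LIMIT seconds, assume the sync path is stuck.
LIMIT=60
if timeout "$LIMIT" sync; then
    echo "sync ok"
else
    echo "sync exceeded ${LIMIT}s -- possible hang"
    # On a hung system one would then inspect the busy kworker, e.g.:
    #   cat /proc/<kworker-pid>/stack
fi
```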

Same incident happened on this system:

24/11/2017 #1-2017 06/05/2018 #1-2018 06/05/2018 #2-2018 12/01/2019 #1-2019 17/05/2019 #2-2019

under different kernels and different ZFS/SPL versions, from v0.7.7-r0-gentoo to v0.7.12-r0-gentoo.

Just hoping this helps to debug this really annoying problem.

Only way to recover was a "reboot -f -n".

denizzzka commented 5 years ago

Only way to recover was a "reboot -f -n".

All of my hang events also froze the GUI and all consoles

Orfheo commented 5 years ago

Reading other related "ZFS sync hang" issues, a sync hang may be related to a hardware problem, for example a flaky cable.

Not my case! I had the same events on two different systems, so I am pretty sure it is not a hardware problem.

Apparently my systems had no other problem besides the sync hang.

The only common factor I can see? They have, on the same disks, a mix of ext4 and ZFS pools.

Maybe that is the smoking gun?

denizzzka commented 5 years ago

Issue reproduced on my system after upgrading to Linux kernel 4.19.0-5-amd64 and ZFS 0.8.0 built from source

Orfheo commented 5 years ago

From my statistics, the issue shows up (when it shows up at all, which is really rare) ONLY on systems with a mix, on the same disks, of ZFS pools and ext3/4 filesystems.

I'm beginning to think the problem is not purely a ZFS issue.

Maybe it is a kernel bug that shows up only when ZFS and ext3/4 I/O requests mix on the same system for the same disks?

In other words, when the VERY SAME I/O queue carries, for the same kernel worker, a mix of I/O requests directed to the same disk, coming both from an ext3/4 filesystem and from ZFS?

denizzzka commented 5 years ago

ONLY on systems where there is a mix, on the same disks, of ZFS pools and ext3/4 filesystems.

My case matches this too: I use a ZFS rpool, plus a zvol that contains a LUKS-encrypted ext4 /home partition

Orfheo commented 5 years ago

I understand your layout, denizzzka. Mine is a little different: I have MBR-partitioned disks which contain native ZFS pools and native ext3/ext4 filesystems, no zvols at all.

But I guess the effect is the same: a mix of ext3/4 and ZFS I/O requests may be handled by the same kworker, which, as we have seen, in some cases hangs for unknown reasons.

May I ask the others posting to this issue to report whether they have the same disk/filesystem layout?
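A quick way to report that layout (lsblk is util-linux; -o selects the columns; the fallback message is mine):

```shell
#!/bin/sh
# List each block device with its filesystem type and mountpoint,
# to spot disks that carry both ZFS members and ext3/4 partitions.
lsblk -o NAME,FSTYPE,SIZE,MOUNTPOINT 2>/dev/null \
    || echo "lsblk not available on this system"
```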

One more data point: I have never seen this event on Linux kernels older than 4.x. I still have a couple of machines on Linux kernel 3.16.5 with ZFS v0.6.3-r0-gentoo, with the very same layout, and I have never seen this hang happen.

Anteru commented 5 years ago

Same for me: I have an ext4 root, and everything else is ZFS.

bglod commented 5 years ago

No ext3/4 partitions here. Just vfat on /boot (because EFI) and ZFS for everything else.