openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.68k stars 1.75k forks source link

zfs umount and spl_delay_taskq stuck for days #16647

Open arcenik opened 1 month ago

arcenik commented 1 month ago

System information

Type Version/Name
Distribution Name Debian
Distribution Version Trixie
Kernel Version 6.10.11-amd64
Architecture amd64 / AMD Opteron(tm) X3421 APU
OpenZFS Version zfs-dkms 2.2.6-1

Describe the problem you're observing

Zfs umount on an old snapshot (triggered by zfs destroy) command get stuck forever, until hard reboot.

Describe how to reproduce the problem

It just happens.

Include any warning/errors/backtraces from the system logs

First kernel thread [spl_delay_taskq] to get stuck:

Oct 14 09:34:19 iota kernel: task:spl_delay_taskq state:D stack:0     pid:463   tgid:463   ppid:2      flags:0x00004000
Oct 14 09:34:19 iota kernel: Call Trace:
Oct 14 09:34:19 iota kernel:  <TASK>
Oct 14 09:34:19 iota kernel:  __schedule+0x3eb/0xb40
Oct 14 09:34:19 iota kernel:  schedule+0x27/0xf0
Oct 14 09:34:19 iota kernel:  schedule_timeout+0x15d/0x170
Oct 14 09:34:19 iota kernel:  __wait_for_common+0x93/0x1c0
Oct 14 09:34:19 iota kernel:  ? __pfx_schedule_timeout+0x10/0x10
Oct 14 09:34:19 iota kernel:  wait_for_completion_state+0x21/0x40
Oct 14 09:34:19 iota kernel:  call_usermodehelper_exec+0x16e/0x1a0
Oct 14 09:34:19 iota kernel:  zfsctl_snapshot_unmount+0x110/0x1f0 [zfs]
Oct 14 09:34:19 iota kernel:  snapentry_expire+0x59/0xf0 [zfs]
Oct 14 09:34:19 iota kernel:  taskq_thread+0x2ba/0x500 [spl]
Oct 14 09:34:19 iota kernel:  ? __pfx_default_wake_function+0x10/0x10
Oct 14 09:34:19 iota kernel:  ? __pfx_taskq_thread+0x10/0x10 [spl]
Oct 14 09:34:19 iota kernel:  kthread+0xd2/0x100
Oct 14 09:34:19 iota kernel:  ? __pfx_kthread+0x10/0x10
Oct 14 09:34:19 iota kernel:  ret_from_fork+0x34/0x50
Oct 14 09:34:19 iota kernel:  ? __pfx_kthread+0x10/0x10
Oct 14 09:34:19 iota kernel:  ret_from_fork_asm+0x1a/0x30
Oct 14 09:34:19 iota kernel:  </TASK>

Another [spl_delay_taskq] kernel thread:

Oct 14 09:34:19 iota kernel: task:spl_delay_taskq state:D stack:0     pid:11699 tgid:11699 ppid:2      flags:0x00004000
Oct 14 09:34:19 iota kernel: Call Trace:
Oct 14 09:34:19 iota kernel:  <TASK>
Oct 14 09:34:19 iota kernel:  __schedule+0x3eb/0xb40
Oct 14 09:34:19 iota kernel:  schedule+0x27/0xf0
Oct 14 09:34:19 iota kernel:  schedule_timeout+0x15d/0x170
Oct 14 09:34:19 iota kernel:  __wait_for_common+0x93/0x1c0
Oct 14 09:34:19 iota kernel:  ? __pfx_schedule_timeout+0x10/0x10
Oct 14 09:34:19 iota kernel:  wait_for_completion_state+0x21/0x40
Oct 14 09:34:19 iota kernel:  call_usermodehelper_exec+0x16e/0x1a0
Oct 14 09:34:19 iota kernel:  zfsctl_snapshot_unmount+0x110/0x1f0 [zfs]
Oct 14 09:34:19 iota kernel:  snapentry_expire+0x59/0xf0 [zfs]
Oct 14 09:34:19 iota kernel:  taskq_thread+0x2ba/0x500 [spl]
Oct 14 09:34:19 iota kernel:  ? __pfx_default_wake_function+0x10/0x10
Oct 14 09:34:19 iota kernel:  ? __pfx_taskq_thread+0x10/0x10 [spl]
Oct 14 09:34:19 iota kernel:  kthread+0xd2/0x100
Oct 14 09:34:19 iota kernel:  ? __pfx_kthread+0x10/0x10
Oct 14 09:34:19 iota kernel:  ret_from_fork+0x34/0x50
Oct 14 09:34:19 iota kernel:  ? __pfx_kthread+0x10/0x10
Oct 14 09:34:19 iota kernel:  ret_from_fork_asm+0x1a/0x30
Oct 14 09:34:19 iota kernel:  </TASK>

An umount command umount -t zfs -n /data/archives/omega-2013/.zfs/snapshot/2013-11-27:

Oct 14 09:34:19 iota kernel: task:umount          state:D stack:0     pid:3114253 tgid:3114253 ppid:3108421 flags:0x00004002
Oct 14 09:34:19 iota kernel: Call Trace:
Oct 14 09:34:19 iota kernel:  <TASK>
Oct 14 09:34:19 iota kernel:  __schedule+0x3eb/0xb40
Oct 14 09:34:19 iota kernel:  schedule+0x27/0xf0
Oct 14 09:34:19 iota kernel:  taskq_wait_outstanding+0xbe/0x100 [spl]
Oct 14 09:34:19 iota kernel:  ? __pfx_autoremove_wake_function+0x10/0x10
Oct 14 09:34:19 iota kernel:  arc_remove_prune_callback+0xb7/0xf0 [zfs]
Oct 14 09:34:19 iota kernel:  zfs_umount+0x2c/0x100 [zfs]
Oct 14 09:34:19 iota kernel:  zpl_put_super+0x2b/0x40 [zfs]
Oct 14 09:34:19 iota kernel:  generic_shutdown_super+0x78/0x110
Oct 14 09:34:19 iota kernel:  kill_anon_super+0x16/0x40
Oct 14 09:34:19 iota kernel:  deactivate_locked_super+0x33/0xb0
Oct 14 09:34:19 iota kernel:  cleanup_mnt+0xba/0x150
Oct 14 09:34:19 iota kernel:  task_work_run+0x5c/0x90
Oct 14 09:34:19 iota kernel:  syscall_exit_to_user_mode+0x1ff/0x210
Oct 14 09:34:19 iota kernel:  do_syscall_64+0x8e/0x190
Oct 14 09:34:19 iota kernel:  ? __lruvec_stat_mod_folio+0x68/0xa0
Oct 14 09:34:19 iota kernel:  ? set_ptes.isra.0+0x28/0x90
Oct 14 09:34:19 iota kernel:  ? do_anonymous_page+0xf7/0x7d0
Oct 14 09:34:19 iota kernel:  ? __pte_offset_map+0x1b/0x180
Oct 14 09:34:19 iota kernel:  ? __handle_mm_fault+0xc20/0x1060
Oct 14 09:34:19 iota kernel:  ? syscall_exit_to_user_mode_prepare+0x15d/0x190
Oct 14 09:34:19 iota kernel:  ? __count_memcg_events+0x58/0xf0
Oct 14 09:34:19 iota kernel:  ? count_memcg_events.constprop.0+0x1a/0x30
Oct 14 09:34:19 iota kernel:  ? handle_mm_fault+0xae/0x320
Oct 14 09:34:19 iota kernel:  ? do_user_addr_fault+0x33a/0x6a0
Oct 14 09:34:19 iota kernel:  ? exc_page_fault+0x7e/0x180
Oct 14 09:34:19 iota kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Oct 14 09:34:19 iota kernel: RIP: 0033:0x7f7018e87637
Oct 14 09:34:19 iota kernel: RSP: 002b:00007ffd50d37508 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
Oct 14 09:34:19 iota kernel: RAX: 0000000000000000 RBX: 000055814488da68 RCX: 00007f7018e87637
Oct 14 09:34:19 iota kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005581448d0e80
Oct 14 09:34:19 iota kernel: RBP: 0000000000000000 R08: 00007ffd50d362c0 R09: 00000000ffffffff
Oct 14 09:34:19 iota kernel: R10: 0000000000000103 R11: 0000000000000246 R12: 00007f7018fe2244
Oct 14 09:34:19 iota kernel: R13: 00005581448d0e80 R14: 000055814488dd70 R15: 000055814488d960
Oct 14 09:34:19 iota kernel:  </TASK>

Another one umount -t zfs -n /data/archives/omega-2013/.zfs/snapshot/2013-11-28:

Oct 14 09:34:19 iota kernel: task:umount          state:D stack:0     pid:3114254 tgid:3114254 ppid:3087024 flags:0x00004002
Oct 14 09:34:19 iota kernel: Call Trace:
Oct 14 09:34:19 iota kernel:  <TASK>
Oct 14 09:34:19 iota kernel:  __schedule+0x3eb/0xb40
Oct 14 09:34:19 iota kernel:  schedule+0x27/0xf0
Oct 14 09:34:19 iota kernel:  taskq_wait_outstanding+0xbe/0x100 [spl]
Oct 14 09:34:19 iota kernel:  ? __pfx_autoremove_wake_function+0x10/0x10
Oct 14 09:34:19 iota kernel:  arc_remove_prune_callback+0xb7/0xf0 [zfs]
Oct 14 09:34:19 iota kernel:  zfs_umount+0x2c/0x100 [zfs]
Oct 14 09:34:19 iota kernel:  zpl_put_super+0x2b/0x40 [zfs]
Oct 14 09:34:19 iota kernel:  generic_shutdown_super+0x78/0x110
Oct 14 09:34:19 iota kernel:  kill_anon_super+0x16/0x40
Oct 14 09:34:19 iota kernel:  deactivate_locked_super+0x33/0xb0
Oct 14 09:34:19 iota kernel:  cleanup_mnt+0xba/0x150
Oct 14 09:34:19 iota kernel:  task_work_run+0x5c/0x90
Oct 14 09:34:19 iota kernel:  syscall_exit_to_user_mode+0x1ff/0x210
Oct 14 09:34:19 iota kernel:  do_syscall_64+0x8e/0x190
Oct 14 09:34:19 iota kernel:  ? syscall_exit_to_user_mode_prepare+0x15d/0x190
Oct 14 09:34:19 iota kernel:  ? syscall_exit_to_user_mode+0x77/0x210
Oct 14 09:34:19 iota kernel:  ? do_syscall_64+0x8e/0x190
Oct 14 09:34:19 iota kernel:  ? syscall_exit_to_user_mode+0x77/0x210
Oct 14 09:34:19 iota kernel:  ? do_syscall_64+0x8e/0x190
Oct 14 09:34:19 iota kernel:  ? syscall_exit_to_user_mode+0x77/0x210
Oct 14 09:34:19 iota kernel:  ? do_syscall_64+0x8e/0x190
Oct 14 09:34:19 iota kernel:  ? handle_mm_fault+0xae/0x320
Oct 14 09:34:19 iota kernel:  ? do_user_addr_fault+0x33a/0x6a0
Oct 14 09:34:19 iota kernel:  ? exc_page_fault+0x7e/0x180
Oct 14 09:34:19 iota kernel:  entry_SYSCALL_64_after_hwframe+0x76/0x7e
Oct 14 09:34:19 iota kernel: RIP: 0033:0x7f011b9d7637
Oct 14 09:34:19 iota kernel: RSP: 002b:00007fff3920e818 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
Oct 14 09:34:19 iota kernel: RAX: 0000000000000000 RBX: 00005597f4d5fa68 RCX: 00007f011b9d7637
Oct 14 09:34:19 iota kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005597f4da2c20
Oct 14 09:34:19 iota kernel: RBP: 0000000000000000 R08: 00007fff3920d5d0 R09: 00000000ffffffff
Oct 14 09:34:19 iota kernel: R10: 0000000000000103 R11: 0000000000000246 R12: 00007f011bb32244
Oct 14 09:34:19 iota kernel: R13: 00005597f4da2c20 R14: 00005597f4d5fd70 R15: 00005597f4d5f960
Oct 14 09:34:19 iota kernel:  </TASK>
arcenik commented 1 month ago

It looks like the cause is that kernel thread arc_prune exited #16324

snajpa commented 1 month ago

Can you try this patch pls? https://github.com/vpsfreecz/zfs/commit/74964ac17e0e5f461b74a0a847d87007d21c8d75

dennixxNL commented 2 weeks ago

We have had issues like this recently on some of our systems as well, the call traces are similar. In our case the systems run RHEL8 with the 4.18.0 kernel, and zfs version 2.1.15 The zfs filesystems are accessed via NFSv3, and in all cases one of the users had copied files from a snapshot not long before the issue occurred.

After that, any zfs or zpool commands hang indefinitely, until the system gets a hard reboot.

[Thu Nov  7 14:01:07 2024] task:spl_delay_taskq state:D stack:0     pid:2010  ppid:2      flags:0x80004000
[Thu Nov  7 14:01:07 2024] Call Trace:
[Thu Nov  7 14:01:07 2024]  __schedule+0x2d1/0x870
[Thu Nov  7 14:01:07 2024]  schedule+0x55/0xf0
[Thu Nov  7 14:01:07 2024]  schedule_timeout+0x281/0x320
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? try_to_wake_up+0x1b4/0x4b0
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  wait_for_completion+0x96/0x100
[Thu Nov  7 14:01:07 2024]  call_usermodehelper_exec+0x132/0x160
[Thu Nov  7 14:01:07 2024]  zfsctl_snapshot_unmount+0x109/0x1f0 [zfs]
[Thu Nov  7 14:01:07 2024]  snapentry_expire+0x55/0xe0 [zfs]
[Thu Nov  7 14:01:07 2024]  taskq_thread+0x2e1/0x510 [spl]
[Thu Nov  7 14:01:07 2024]  ? wake_up_q+0x60/0x60
[Thu Nov  7 14:01:07 2024]  ? taskq_thread_spawn+0x50/0x50 [spl]
[Thu Nov  7 14:01:07 2024]  kthread+0x134/0x150
[Thu Nov  7 14:01:07 2024]  ? set_kthread_struct+0x50/0x50
[Thu Nov  7 14:01:07 2024]  ret_from_fork+0x35/0x40
[Thu Nov  7 14:01:07 2024] task:rpc.mountd      state:D stack:0     pid:7269  ppid:1      flags:0x00004080
[Thu Nov  7 14:01:07 2024] Call Trace:
[Thu Nov  7 14:01:07 2024]  __schedule+0x2d1/0x870
[Thu Nov  7 14:01:07 2024]  schedule+0x55/0xf0
[Thu Nov  7 14:01:07 2024]  taskq_wait_id+0x8e/0xe0 [spl]
[Thu Nov  7 14:01:07 2024]  ? finish_wait+0x80/0x80
[Thu Nov  7 14:01:07 2024]  taskq_cancel_id+0xce/0x120 [spl]
[Thu Nov  7 14:01:07 2024]  zfsctl_snapshot_unmount_cancel+0x37/0x80 [zfs]
[Thu Nov  7 14:01:07 2024]  zfsctl_snapshot_unmount_delay+0x3e/0xb0 [zfs]
[Thu Nov  7 14:01:07 2024]  zfs_lookup+0x133/0x400 [zfs]
[Thu Nov  7 14:01:07 2024]  zpl_xattr_get_dir+0x5b/0x1a0 [zfs]
[Thu Nov  7 14:01:07 2024]  ? nvlist_lookup_common+0x32/0x80 [znvpair]
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? zpl_xattr_get_sa+0xcc/0x120 [zfs]
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  zpl_xattr_get+0xd3/0x1d0 [zfs]
[Thu Nov  7 14:01:07 2024]  zpl_xattr_security_get+0x3e/0x60 [zfs]
[Thu Nov  7 14:01:07 2024]  __vfs_getxattr+0x54/0x70
[Thu Nov  7 14:01:07 2024]  get_vfs_caps_from_disk+0x68/0x190
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? terminate_walk+0x7e/0xf0
[Thu Nov  7 14:01:07 2024]  audit_copy_inode+0x94/0xd0
[Thu Nov  7 14:01:07 2024]  filename_lookup.part.58+0x114/0x170
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? path_get+0x11/0x30
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? audit_alloc_name+0x132/0x150
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? __audit_getname+0x2d/0x50
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  vfs_statx+0x74/0xe0
[Thu Nov  7 14:01:07 2024]  __do_sys_newstat+0x39/0x70
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? syscall_trace_enter+0x1ff/0x2d0
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  ? audit_reset_context.part.16+0x26a/0x2d0
[Thu Nov  7 14:01:07 2024]  do_syscall_64+0x5b/0x1a0
[Thu Nov  7 14:01:07 2024]  entry_SYSCALL_64_after_hwframe+0x66/0xcb
[Thu Nov  7 14:01:07 2024] RIP: 0033:0x7f67b861fb09
[Thu Nov  7 14:01:07 2024] Code: Unable to access opcode bytes at RIP 0x7f67b861fadf.
[Thu Nov  7 14:01:07 2024] RSP: 002b:00007ffe0789cd28 EFLAGS: 00000246 ORIG_RAX: 0000000000000004
[Thu Nov  7 14:01:07 2024] RAX: ffffffffffffffda RBX: 0000555eea3cf290 RCX: 00007f67b861fb09
[Thu Nov  7 14:01:07 2024] RDX: 00007ffe0789cdd0 RSI: 00007ffe0789cdd0 RDI: 0000555eea3dfcd8
[Thu Nov  7 14:01:07 2024] RBP: 0000555eea3dfcd8 R08: 00007ffe0789c241 R09: 0000000000000000
[Thu Nov  7 14:01:07 2024] R10: 00007f67b8672e80 R11: 0000000000000246 R12: 0000555eea3cf2a8
[Thu Nov  7 14:01:07 2024] R13: 0000555eea3cf290 R14: 0000555eea3e7030 R15: 0000555eea3e11a0
[Thu Nov  7 14:01:07 2024] task:umount          state:D stack:0     pid:444721 ppid:191258 flags:0x00004080
[Thu Nov  7 14:01:07 2024] Call Trace:
[Thu Nov  7 14:01:07 2024]  __schedule+0x2d1/0x870
[Thu Nov  7 14:01:07 2024]  ? srso_return_thunk+0x5/0x5f
[Thu Nov  7 14:01:07 2024]  schedule+0x55/0xf0
[Thu Nov  7 14:01:07 2024]  schedule_preempt_disabled+0xa/0x10
[Thu Nov  7 14:01:07 2024]  rwsem_down_write_slowpath+0x370/0x570
[Thu Nov  7 14:01:07 2024]  ? mnt_get_count+0x39/0x50
[Thu Nov  7 14:01:07 2024]  ? pin_kill+0x70/0x190
[Thu Nov  7 14:01:07 2024]  down_write+0x29/0x50
[Thu Nov  7 14:01:07 2024]  zfsctl_destroy+0x53/0xd0 [zfs]
[Thu Nov  7 14:01:07 2024]  zfs_preumount+0x2a/0x70 [zfs]
[Thu Nov  7 14:01:07 2024]  zpl_kill_sb+0xe/0x20 [zfs]
[Thu Nov  7 14:01:07 2024]  deactivate_locked_super+0x34/0x70
[Thu Nov  7 14:01:07 2024]  cleanup_mnt+0x3b/0x70
[Thu Nov  7 14:01:07 2024]  task_work_run+0x8a/0xb0
[Thu Nov  7 14:01:07 2024]  exit_to_usermode_loop+0xef/0x100
[Thu Nov  7 14:01:07 2024]  do_syscall_64+0x195/0x1a0
[Thu Nov  7 14:01:07 2024]  entry_SYSCALL_64_after_hwframe+0x66/0xcb
[Thu Nov  7 14:01:07 2024] RIP: 0033:0x7fcaec2558fb
[Thu Nov  7 14:01:07 2024] Code: Unable to access opcode bytes at RIP 0x7fcaec2558d1.
[Thu Nov  7 14:01:07 2024] RSP: 002b:00007fff09fd8cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[Thu Nov  7 14:01:07 2024] RAX: 0000000000000000 RBX: 00005632b76f0310 RCX: 00007fcaec2558fb
[Thu Nov  7 14:01:07 2024] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00005632b76fbac0
[Thu Nov  7 14:01:07 2024] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fcaec3ac220
[Thu Nov  7 14:01:07 2024] R10: 00005632b76f48b0 R11: 0000000000000246 R12: 00005632b76fbac0
[Thu Nov  7 14:01:07 2024] R13: 00007fcaed0d8184 R14: 00005632b76f45e0 R15: 00000000ffffffff