Zygo / bees

Best-Effort Extent-Same, a btrfs dedupe agent
GNU General Public License v3.0
647 stars 55 forks source link

Running bees soft locks the file system where bees is running deduplication on #162

Open ghost opened 3 years ago

ghost commented 3 years ago

Reading the filesystem is still ok until there is an attempt to make changes (create new directory, create new file, edit file, rename), at which it soft locks that directory, no read or write request can make to it, any attempt will freeze the client trying to do that.

Powershell creating new directory within directory "Software": image

after that, trying to access folder "Software" within Webmin causes infinite load: image

bees won't generate new log, rendering the program stuck: image

BtrFS is used within Ubuntu VM, running on top of Hyper-V, with disk passthrough. Disk usage can be viewed from Task Manager on Hyper-V host, and there's basically 0% usage and 0kb/s all the time, confirming that deduplication isn't working: image image

despite not deduplicating anything, bees keeps using CPU all the time, image image

Filesystem is mounted with option noatime,nodiratime,compress=zstd. Storage setup is RAID-1 from two 2TB hard disk. I'm using Ubuntu 20.04 LTS. Kernel version is 5.4.0.

This issue occurs after a 20GB file was moved to the BtrFS, as you can see in the log screenshot (Ubuntu Server.old.vhdx), and bees runs AFTER a snapshot was created. snapshot is readonly. Perhaps bees is trying to make changes to that readonly snapshot? what should I do next?

kakra commented 3 years ago

Just wait. Chances are that CPU time is spent in the kernel, not bees itself (top should show a high SYS% in that case). The kernel will eventually finish the operation after minutes or hours. Bees will mark the affected hash/extent as toxic and not use it again. Using a newer kernel will probably fix it. This is not a problem with read-only snapshots.

You may want to run dmesg from time to time (or run demsg -w to watch for new logs in a terminal): You should see the kernel complaining about seemingly being hung in some tasks for more than 120s (if the soft-lock detector is enabled in your kernel). This isn't critical, it's a symptom of the problem and will resolve cleanly if you just wait long enough.

kakra commented 3 years ago

I'm currently seeing it, too. It looks like this:

[415947.956802] INFO: task btrfs:1921055 blocked for more than 614 seconds.
[415947.956804]       Tainted: P           O      5.4.85-gentoo #1
[415947.956805] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[415947.956806] btrfs           D    0 1921055 1921053 0x00004000
[415947.956808] Call Trace:
[415947.956814]  __schedule+0x58e/0xb00
[415947.956816]  schedule+0x3b/0xb0
[415947.956818]  io_schedule+0xd/0x40
[415947.956821]  __lock_page+0xf8/0x1b0
[415947.956823]  ? __page_cache_alloc+0x10/0x10
[415947.956825]  pagecache_get_page+0x1aa/0x240
[415947.956827]  btrfs_defrag_file+0x548/0xe40
[415947.956829]  btrfs_ioctl_defrag+0xe3/0x160
[415947.956831]  btrfs_ioctl+0x1a12/0x2d90
[415947.956834]  ? tty_insert_flip_string_fixed_flag+0x7b/0xd0
[415947.956835]  ? pty_write+0x87/0x90
[415947.956837]  ? __wake_up_common_lock+0x75/0x90
[415947.956840]  ? do_vfs_ioctl+0x412/0x600
[415947.956841]  do_vfs_ioctl+0x412/0x600
[415947.956842]  ksys_ioctl+0x35/0x70
[415947.956844]  __x64_sys_ioctl+0x11/0x20
[415947.956846]  do_syscall_64+0x3d/0x100
[415947.956848]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[415947.956850] RIP: 0033:0x7f3ea91fbb87
[415947.956854] Code: Bad RIP value.
[415947.956854] RSP: 002b:00007ffd78491ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[415947.956856] RAX: ffffffffffffffda RBX: 00007ffd78492428 RCX: 00007f3ea91fbb87
[415947.956856] RDX: 000055d7f7e3e700 RSI: 0000000040309410 RDI: 0000000000000003
[415947.956857] RBP: 0000000000000003 R08: 0000000000002000 R09: 0000000000000000
[415947.956858] R10: 00007f3ea92850c0 R11: 0000000000000246 R12: 0000000000000000
[415947.956858] R13: 00007ffd78491ed8 R14: 00007ffd78494910 R15: 00007ffd78491ee0

In my case, it's not bees but the btrfs command I'm running to defrag some files.

Zygo commented 3 years ago

This sounds like the same issue as #161. Use kernel 5.7 or later, or wait until bees maps out all the toxic extents by brute force.

If you want to look at tasks, look at /proc/(pid of bees)/task/(tid of bees thread running LOGICAL_INO)/stack during periods of high CPU usage. If it contains logical_to_ino, find_parent_nodes, or iterate_inodes_from_logical, and is stuck in the system call long enough that you can cat its kernel stack, you are hitting the known performance bug.

If you want bees to leave read-only snapshots alone, you can use --workaround-btrfs-send. bees will then only dedupe read-write subvols. When the read-only snapshot is deleted, any space saved by dedupe will become available as the deleted snapshot releases the last references to the duplicate data. If you intend to delete the snapshot anyway (i.e. it is only used for btrfs send or other backup system, and the snapshot will be deleted when no longer required), you can save some IO time by ignoring the snapshot while it exists.

(edit: reference the right issue)

ghost commented 3 years ago
~# cat /proc/1986/task/*/stack

[<0>] do_sigtimedwait.isra.0+0x17f/0x200
[<0>] __x64_sys_rt_sigtimedwait+0x71/0xd0
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] futex_wait_queue_me+0xbb/0x120
[<0>] futex_wait+0x105/0x260
[<0>] do_futex+0x10f/0x1e0
[<0>] __x64_sys_futex+0x13f/0x170
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] futex_wait_queue_me+0xbb/0x120
[<0>] futex_wait+0x105/0x260
[<0>] do_futex+0x10f/0x1e0
[<0>] __x64_sys_futex+0x13f/0x170
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] wait_current_trans+0xbd/0x100 [btrfs]
[<0>] start_transaction+0x476/0x510 [btrfs]
[<0>] btrfs_start_transaction+0x1e/0x20 [btrfs]
[<0>] btrfs_punch_hole_range+0x10e/0x720 [btrfs]
[<0>] btrfs_clone+0xb04/0xd40 [btrfs]
[<0>] btrfs_extent_same_range+0x75/0xa0 [btrfs]
[<0>] btrfs_remap_file_range+0x355/0x3c0 [btrfs]
[<0>] vfs_dedupe_file_range_one+0x12d/0x150
[<0>] vfs_dedupe_file_range+0x156/0x1e0
[<0>] do_vfs_ioctl+0x340/0x670
[<0>] ksys_ioctl+0x67/0x90
[<0>] __x64_sys_ioctl+0x1a/0x20
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] wait_current_trans+0xbd/0x100 [btrfs]
[<0>] start_transaction+0x329/0x510 [btrfs]
[<0>] btrfs_attach_transaction+0x1d/0x20 [btrfs]
[<0>] iterate_extent_inodes+0x86/0x2c0 [btrfs]
[<0>] iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[<0>] btrfs_ioctl_logical_to_ino+0x13b/0x1b0 [btrfs]
[<0>] btrfs_ioctl+0xf46/0x20d0 [btrfs]
[<0>] do_vfs_ioctl+0x407/0x670
[<0>] ksys_ioctl+0x67/0x90
[<0>] __x64_sys_ioctl+0x1a/0x20
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] wait_current_trans+0xbd/0x100 [btrfs]
[<0>] start_transaction+0x329/0x510 [btrfs]
[<0>] btrfs_attach_transaction+0x1d/0x20 [btrfs]
[<0>] iterate_extent_inodes+0x86/0x2c0 [btrfs]
[<0>] iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[<0>] btrfs_ioctl_logical_to_ino+0x13b/0x1b0 [btrfs]
[<0>] btrfs_ioctl+0xf46/0x20d0 [btrfs]
[<0>] do_vfs_ioctl+0x407/0x670
[<0>] ksys_ioctl+0x67/0x90
[<0>] __x64_sys_ioctl+0x1a/0x20
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] futex_wait_queue_me+0xbb/0x120
[<0>] futex_wait+0x105/0x260
[<0>] do_futex+0x10f/0x1e0
[<0>] __x64_sys_futex+0x13f/0x170
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] futex_wait_queue_me+0xbb/0x120
[<0>] futex_wait+0x105/0x260
[<0>] do_futex+0x10f/0x1e0
[<0>] __x64_sys_futex+0x13f/0x170
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] futex_wait_queue_me+0xbb/0x120
[<0>] futex_wait+0x105/0x260
[<0>] do_futex+0x10f/0x1e0
[<0>] __x64_sys_futex+0x13f/0x170
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] futex_wait_queue_me+0xbb/0x120
[<0>] futex_wait+0x105/0x260
[<0>] do_futex+0x10f/0x1e0
[<0>] __x64_sys_futex+0x13f/0x170
[<0>] do_syscall_64+0x57/0x190
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Ok, now I see it, it has logical_to_ino and iterate_inodes_from_logical

and

~# dmesg -w
[  484.370216] INFO: task crawl_5:1992 blocked for more than 120 seconds.
[  484.370666]       Not tainted 5.4.0-58-generic #64-Ubuntu
[  484.370951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  484.371362] crawl_5         D    0  1992   1951 0x00004000
[  484.371363] Call Trace:
[  484.371365]  __schedule+0x2e3/0x740
[  484.371367]  schedule+0x42/0xb0
[  484.371378]  wait_current_trans+0xbd/0x100 [btrfs]
[  484.371379]  ? wait_woken+0x80/0x80
[  484.371388]  start_transaction+0x329/0x510 [btrfs]
[  484.371401]  ? btrfs_inode_flags_to_xflags+0x50/0x50 [btrfs]
[  484.371412]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
[  484.371425]  iterate_extent_inodes+0x86/0x2c0 [btrfs]
[  484.371439]  ? extent_from_logical+0x173/0x1c0 [btrfs]
[  484.371453]  iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[  484.371467]  ? iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[  484.371480]  ? btrfs_inode_flags_to_xflags+0x50/0x50 [btrfs]
[  484.371504]  btrfs_ioctl_logical_to_ino+0x13b/0x1b0 [btrfs]
[  484.371518]  btrfs_ioctl+0xf46/0x20d0 [btrfs]
[  484.371520]  ? cputime_adjust+0xe3/0x110
[  484.371521]  ? _cond_resched+0x19/0x30
[  484.371522]  ? mmput+0x13/0x130
[  484.371524]  ? getrusage+0x417/0x430
[  484.371525]  do_vfs_ioctl+0x407/0x670
[  484.371526]  ? do_vfs_ioctl+0x407/0x670
[  484.371527]  ? __do_sys_getrusage+0x5b/0x90
[  484.371528]  ksys_ioctl+0x67/0x90
[  484.371529]  __x64_sys_ioctl+0x1a/0x20
[  484.371531]  do_syscall_64+0x57/0x190
[  484.371532]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  484.371533] RIP: 0033:0x7fa1c0ee550b
[  484.371536] Code: Bad RIP value.
[  484.371537] RSP: 002b:00007fa1be5c0aa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  484.371538] RAX: ffffffffffffffda RBX: 000055daa83585f0 RCX: 00007fa1c0ee550b
[  484.371538] RDX: 00007fa1be5c0e88 RSI: 00000000c038943b RDI: 0000000000000004
[  484.371539] RBP: 00007fa1be5c0e80 R08: 00007fa18006f900 R09: 00007fa1c131cbb0
[  484.371539] R10: 00007ffcd336d080 R11: 0000000000000246 R12: 0000000000000004
[  484.371540] R13: 00007fa1be5c0e88 R14: 00007fa1be5c0da0 R15: 00007fa1be5c0cd8
[  605.199819] INFO: task crawl_5:1989 blocked for more than 241 seconds.
[  605.200169]       Not tainted 5.4.0-58-generic #64-Ubuntu
[  605.200471] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  605.200952] crawl_5         D    0  1989   1951 0x00004000
[  605.200954] Call Trace:
[  605.200959]  __schedule+0x2e3/0x740
[  605.200961]  schedule+0x42/0xb0
[  605.200990]  wait_current_trans+0xbd/0x100 [btrfs]
[  605.200993]  ? wait_woken+0x80/0x80
[  605.201005]  start_transaction+0x476/0x510 [btrfs]
[  605.201016]  btrfs_start_transaction+0x1e/0x20 [btrfs]
[  605.201027]  btrfs_punch_hole_range+0x10e/0x720 [btrfs]
[  605.201040]  btrfs_clone+0xb04/0xd40 [btrfs]
[  605.201054]  btrfs_extent_same_range+0x75/0xa0 [btrfs]
[  605.201068]  btrfs_remap_file_range+0x355/0x3c0 [btrfs]
[  605.201070]  vfs_dedupe_file_range_one+0x12d/0x150
[  605.201071]  vfs_dedupe_file_range+0x156/0x1e0
[  605.201073]  do_vfs_ioctl+0x340/0x670
[  605.201074]  ksys_ioctl+0x67/0x90
[  605.201075]  __x64_sys_ioctl+0x1a/0x20
[  605.201077]  do_syscall_64+0x57/0x190
[  605.201079]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  605.201080] RIP: 0033:0x7fa1c0ee550b
[  605.201085] Code: Bad RIP value.
[  605.201085] RSP: 002b:00007fa1bfdc3368 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  605.201086] RAX: ffffffffffffffda RBX: 00007fa1bfdc3600 RCX: 00007fa1c0ee550b
[  605.201087] RDX: 00007fa174027230 RSI: 00000000c0189436 RDI: 0000000000000017
[  605.201087] RBP: 00007fa174027230 R08: 0000000000000005 R09: 00007fa1bfdc37c8
[  605.201088] R10: 00007fa1bfdc37c0 R11: 0000000000000246 R12: 0000000000000018
[  605.201088] R13: 00007fa17400b538 R14: 00007fa1bfdc3608 R15: 00007fa1bfdc35c0
[  605.201090] INFO: task crawl_5:1990 blocked for more than 241 seconds.
[  605.201458]       Not tainted 5.4.0-58-generic #64-Ubuntu
[  605.201798] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  605.202191] crawl_5         D    0  1990   1951 0x00000000
[  605.202192] Call Trace:
[  605.202195]  __schedule+0x2e3/0x740
[  605.202196]  schedule+0x42/0xb0
[  605.202208]  wait_current_trans+0xbd/0x100 [btrfs]
[  605.202210]  ? wait_woken+0x80/0x80
[  605.202219]  start_transaction+0x329/0x510 [btrfs]
[  605.202230]  ? btrfs_inode_flags_to_xflags+0x50/0x50 [btrfs]
[  605.202239]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
[  605.202251]  iterate_extent_inodes+0x86/0x2c0 [btrfs]
[  605.202262]  ? extent_from_logical+0x173/0x1c0 [btrfs]
[  605.202275]  iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[  605.202288]  ? iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[  605.202301]  ? btrfs_inode_flags_to_xflags+0x50/0x50 [btrfs]
[  605.202313]  btrfs_ioctl_logical_to_ino+0x13b/0x1b0 [btrfs]
[  605.202326]  btrfs_ioctl+0xf46/0x20d0 [btrfs]
[  605.202327]  ? cputime_adjust+0xe3/0x110
[  605.202329]  ? _cond_resched+0x19/0x30
[  605.202330]  ? mmput+0x13/0x130
[  605.202331]  ? getrusage+0x417/0x430
[  605.202333]  do_vfs_ioctl+0x407/0x670
[  605.202334]  ? do_vfs_ioctl+0x407/0x670
[  605.202334]  ? __do_sys_getrusage+0x5b/0x90
[  605.202336]  ksys_ioctl+0x67/0x90
[  605.202337]  __x64_sys_ioctl+0x1a/0x20
[  605.202338]  do_syscall_64+0x57/0x190
[  605.202340]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  605.202341] RIP: 0033:0x7fa1c0ee550b
[  605.202343] Code: Bad RIP value.
[  605.202344] RSP: 002b:00007fa1bf5c2aa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  605.202345] RAX: ffffffffffffffda RBX: 000055daa83585f0 RCX: 00007fa1c0ee550b
[  605.202345] RDX: 00007fa1bf5c2e88 RSI: 00000000c038943b RDI: 0000000000000004
[  605.202346] RBP: 00007fa1bf5c2e80 R08: 00007fa16c05cab0 R09: 00007fa1c131cbb0
[  605.202346] R10: 00007ffcd336d080 R11: 0000000000000246 R12: 0000000000000004
[  605.202347] R13: 00007fa1bf5c2e88 R14: 00007fa1bf5c2da0 R15: 00007fa1bf5c2cd8
[  605.202348] INFO: task crawl_5:1992 blocked for more than 241 seconds.
[  605.202686]       Not tainted 5.4.0-58-generic #64-Ubuntu
[  605.202956] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  605.203410] crawl_5         D    0  1992   1951 0x00004000
[  605.203412] Call Trace:
[  605.203414]  __schedule+0x2e3/0x740
[  605.203415]  schedule+0x42/0xb0
[  605.203426]  wait_current_trans+0xbd/0x100 [btrfs]
[  605.203428]  ? wait_woken+0x80/0x80
[  605.203437]  start_transaction+0x329/0x510 [btrfs]
[  605.203449]  ? btrfs_inode_flags_to_xflags+0x50/0x50 [btrfs]
[  605.203467]  btrfs_attach_transaction+0x1d/0x20 [btrfs]
[  605.203479]  iterate_extent_inodes+0x86/0x2c0 [btrfs]
[  605.203491]  ? extent_from_logical+0x173/0x1c0 [btrfs]
[  605.203503]  iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[  605.203516]  ? iterate_inodes_from_logical+0x9f/0xe0 [btrfs]
[  605.203529]  ? btrfs_inode_flags_to_xflags+0x50/0x50 [btrfs]
[  605.203541]  btrfs_ioctl_logical_to_ino+0x13b/0x1b0 [btrfs]
[  605.203554]  btrfs_ioctl+0xf46/0x20d0 [btrfs]
[  605.203555]  ? cputime_adjust+0xe3/0x110
[  605.203556]  ? _cond_resched+0x19/0x30
[  605.203557]  ? mmput+0x13/0x130
[  605.203558]  ? getrusage+0x417/0x430
[  605.203560]  do_vfs_ioctl+0x407/0x670
[  605.203561]  ? do_vfs_ioctl+0x407/0x670
[  605.203562]  ? __do_sys_getrusage+0x5b/0x90
[  605.203563]  ksys_ioctl+0x67/0x90
[  605.203564]  __x64_sys_ioctl+0x1a/0x20
[  605.203565]  do_syscall_64+0x57/0x190
[  605.203567]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  605.203567] RIP: 0033:0x7fa1c0ee550b
[  605.203570] Code: Bad RIP value.
[  605.203570] RSP: 002b:00007fa1be5c0aa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  605.203571] RAX: ffffffffffffffda RBX: 000055daa83585f0 RCX: 00007fa1c0ee550b
[  605.203572] RDX: 00007fa1be5c0e88 RSI: 00000000c038943b RDI: 0000000000000004
[  605.203572] RBP: 00007fa1be5c0e80 R08: 00007fa18006f900 R09: 00007fa1c131cbb0
[  605.203572] R10: 00007ffcd336d080 R11: 0000000000000246 R12: 0000000000000004
[  605.203573] R13: 00007fa1be5c0e88 R14: 00007fa1be5c0da0 R15: 00007fa1be5c0cd8

I'll try 5.8.10 to see it's resolved.

ghost commented 3 years ago

Ok so I switched to 5.8.0 (5.8.10 not avaiable in ubuntu repository lol) The issue is gone, no more soft lock. I'm considering switching to a distro that follows up latest kernel since my ubuntu only acts as file server.