System bg not reclaimed

kdave commented 3 years ago

With 2MB zone size and a git workload, I'm observing that the system block group does not get reclaimed:

# btrfs fi df /mnt
Data, single: total=2.53GiB, used=181.91MiB, zone_unusable=2.32GiB
System, single: total=60.00MiB, used=240.00KiB, zone_unusable=58.81MiB
Metadata, single: total=864.00MiB, used=4.27MiB, zone_unusable=855.19MiB
GlobalReserve, single: total=3.25MiB, used=0.00B

The log is full of the reclaiming messages but it's just metadata or data, so system will fill the whole system sooner or later (total size is 10G).

Same on a fs with 4MB zones:

# btrfs fi df /mnt
Data, single: total=7.84GiB, used=2.06GiB, zone_unusable=5.75GiB
System, single: total=16.00MiB, used=272.00KiB, zone_unusable=12.00MiB
Metadata, single: total=508.00MiB, used=12.31MiB, zone_unusable=494.12MiB
GlobalReserve, single: total=8.80MiB, used=48.00KiB

The initial offset of the system bg is 8388608 so the relocation would not pick it soon enough and it may need to be prioritized. After the test ran a few more minutes I found first occurence of the system chunk relocation, but just for one of the filesystems.

kdave commented 3 years ago

And because the background reclaiming runs all the time I can't start the balance manually.

kdave commented 3 years ago

So after some time the system chunks get reclaimed but it's still too slow.

On one filesytem I removed all the data and let it finish the autoreclaim, numbers were decreasing. I managed to sneak in one balance command that sped it up and now all the numbers are almost zero (expected values).

kdave commented 3 years ago

There's a final lockup after all block groups are filled:

# btrfs fi df /mnt
Data, single: total=7.35GiB, used=194.12MiB, zone_unusable=7.16GiB
System, single: total=178.00MiB, used=688.00KiB, zone_unusable=176.08MiB
Metadata, single: total=2.47GiB, used=3.27MiB, zone_unusable=2.44GiB
GlobalReserve, single: total=3.25MiB, used=0.00B

[ 7379.192958] INFO: task kworker/u16:8:2479 blocked for more than 491 seconds.
[ 7379.200225]       Tainted: G        W         5.13.0-git+ #779
[ 7379.206198] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7379.214240] task:kworker/u16:8   state:D stack:10216 pid: 2479 ppid:     2 flags:0x00004000
[ 7379.214264] Workqueue: events_unbound btrfs_reclaim_bgs_work [btrfs]
[ 7379.222870] Call Trace:
[ 7379.222879]  __schedule+0x27d/0x730
[ 7379.222897]  ? lock_acquire+0xa0/0x150
[ 7379.222910]  ? btrfs_wait_ordered_roots+0x66/0x240 [btrfs]
[ 7379.223075]  schedule+0x59/0xc0
[ 7379.223085]  schedule_preempt_disabled+0x11/0x20
[ 7379.223091]  __mutex_lock+0x2d0/0x910
[ 7379.223105]  btrfs_wait_ordered_roots+0x66/0x240 [btrfs]
[ 7379.223269]  ? btrfs_wait_nocow_writers+0x33/0xc0 [btrfs]
[ 7379.223461]  btrfs_relocate_block_group+0x165/0x340 [btrfs]
[ 7379.223655]  btrfs_relocate_chunk+0x38/0x120 [btrfs]
[ 7379.223849]  btrfs_reclaim_bgs_work.cold+0x73/0x167 [btrfs]
[ 7379.224073]  process_one_work+0x262/0x600
[ 7379.224097]  worker_thread+0x4c/0x320
[ 7379.224107]  ? process_one_work+0x600/0x600
[ 7379.224116]  kthread+0x135/0x160
[ 7379.224127]  ? set_kthread_struct+0x40/0x40
[ 7379.224135]  ret_from_fork+0x1f/0x30
[ 7379.224173] INFO: task kworker/u16:7:16155 blocked for more than 491 seconds.
[ 7379.224178]       Tainted: G        W         5.13.0-git+ #779
[ 7379.224182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7379.224186] task:kworker/u16:7   state:D stack: 9608 pid:16155 ppid:     2 flags:0x00004000
[ 7379.224203] Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
[ 7379.224412] Call Trace:
[ 7379.224418]  __schedule+0x27d/0x730
[ 7379.224431]  ? usleep_range+0x90/0x90
[ 7379.224438]  schedule+0x59/0xc0
[ 7379.224450]  schedule_timeout+0xed/0x130
[ 7379.224460]  ? lock_acquire+0xa0/0x150
[ 7379.224466]  ? __wait_for_common+0x4f/0x170
[ 7379.224477]  ? lock_release+0x9c/0x140
[ 7379.224486]  ? trace_hardirqs_on+0x1b/0xf0
[ 7379.224495]  __wait_for_common+0xc2/0x170
[ 7379.224509]  btrfs_wait_ordered_extents+0x3d6/0x470 [btrfs]
[ 7379.224707]  btrfs_wait_ordered_roots+0x175/0x240 [btrfs]
[ 7379.394482]  shrink_delalloc+0x12b/0x1f0 [btrfs]
[ 7379.394598]  flush_space+0x1bf/0x2f0 [btrfs]
[ 7379.403802]  btrfs_async_reclaim_data_space+0xb2/0x150 [btrfs]
[ 7379.409866]  process_one_work+0x262/0x600
[ 7379.409875]  worker_thread+0x4c/0x320
[ 7379.409878]  ? process_one_work+0x600/0x600
[ 7379.409883]  kthread+0x135/0x160
[ 7379.409887]  ? set_kthread_struct+0x40/0x40
[ 7379.409892]  ret_from_fork+0x1f/0x30
[ 7379.409935] INFO: task kworker/u16:5:2964 blocked for more than 491 seconds.
[ 7379.440534]       Tainted: G        W         5.13.0-git+ #779
[ 7379.440537] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7379.440539] task:kworker/u16:5   state:D stack:10472 pid: 2964 ppid:     2 flags:0x00004000
[ 7379.462995] Workqueue: btrfs-flush_delalloc btrfs_work_helper [btrfs]
[ 7379.463113] Call Trace:
[ 7379.463116]  __schedule+0x27d/0x730
[ 7379.475870]  schedule+0x59/0xc0
[ 7379.475875]  io_schedule+0x42/0x70
[ 7379.482723]  wait_on_page_bit_common+0x105/0x380
[ 7379.482733]  ? page_cache_delete_batch+0x160/0x160
[ 7379.482739]  extent_write_cache_pages+0x339/0x450 [btrfs]
[ 7379.497966]  extent_writepages+0x54/0xc0 [btrfs]
[ 7379.502800]  do_writepages+0x35/0xd0
[ 7379.502808]  ? lock_release+0x9c/0x140
[ 7379.510440]  __filemap_fdatawrite_range+0xce/0x110
[ 7379.510447]  ? __smp_call_single_queue+0x39/0x70
[ 7379.520158]  btrfs_start_ordered_extent+0x98/0x1d0 [btrfs]
[ 7379.525850]  ? lock_acquire+0xa0/0x150
[ 7379.529701]  ? btrfs_work_helper+0x5e/0x1d0 [btrfs]
[ 7379.534767]  btrfs_run_ordered_extent_work+0x1a/0x30 [btrfs]
[ 7379.540600]  btrfs_work_helper+0xd5/0x1d0 [btrfs]
[ 7379.545477]  process_one_work+0x262/0x600
[ 7379.549581]  worker_thread+0x4c/0x320
[ 7379.553334]  ? process_one_work+0x600/0x600
[ 7379.557613]  kthread+0x135/0x160
[ 7379.560946]  ? set_kthread_struct+0x40/0x40
[ 7379.565230]  ret_from_fork+0x1f/0x30
[ 7379.568950] INFO: task kworker/u16:12:25822 blocked for more than 491 seconds.
[ 7379.576304]       Tainted: G        W         5.13.0-git+ #779
[ 7379.576306] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7379.576308] task:kworker/u16:12  state:D stack:10032 pid:25822 ppid:     2 flags:0x00004000
[ 7379.576320] Workqueue: writeback wb_workfn (flush-btrfs-6)
[ 7379.576328] Call Trace:
[ 7379.576331]  __schedule+0x27d/0x730
[ 7379.576338]  schedule+0x59/0xc0
[ 7379.576342]  io_schedule+0x42/0x70
[ 7379.576346]  wait_on_page_bit_common+0x105/0x380
[ 7379.576353]  ? page_cache_delete_batch+0x160/0x160
[ 7379.626731]  extent_write_cache_pages+0x339/0x450 [btrfs]
[ 7379.632332]  extent_writepages+0x54/0xc0 [btrfs]
[ 7379.632418]  do_writepages+0x35/0xd0
[ 7379.640932]  ? lock_acquired+0xa6/0x150
[ 7379.640936]  ? lock_release+0x9c/0x140
[ 7379.640939]  ? lock_acquire+0xa0/0x150
[ 7379.652643]  ? writeback_sb_inodes+0x152/0x550
[ 7379.657194]  __writeback_single_inode+0x61/0x400
[ 7379.661918]  writeback_sb_inodes+0x20b/0x550
[ 7379.666297]  __writeback_inodes_wb+0x4c/0xe0
[ 7379.670669]  wb_writeback+0x2ee/0x4c0
[ 7379.674438]  wb_do_writeback+0x21b/0x230
[ 7379.678470]  wb_workfn+0x7a/0x2c0
[ 7379.681889]  ? put_prev_task_fair+0x21/0x30
[ 7379.686173]  ? pick_next_task+0xbb6/0xdd0
[ 7379.690288]  ? lock_acquire+0xa0/0x150
[ 7379.694138]  ? finish_task_switch.isra.0+0x76/0x380
[ 7379.699124]  ? lock_acquire+0xa0/0x150
[ 7379.702975]  ? process_one_work+0x1df/0x600
[ 7379.707259]  process_one_work+0x262/0x600
[ 7379.711377]  worker_thread+0x4c/0x320
[ 7379.715142]  ? process_one_work+0x600/0x600
[ 7379.719425]  kthread+0x135/0x160
[ 7379.722750]  ? set_kthread_struct+0x40/0x40
[ 7379.727045]  ret_from_fork+0x1f/0x30
[ 7379.730748] INFO: lockdep is turned off.
...

naota / linux

System bg not reclaimed #52