hhoffstaette / kernel-patches

Custom Linux kernel patches
39 stars 7 forks source link

btrfs: fails with latest master #3

Closed disaster123 closed 8 years ago

disaster123 commented 8 years ago

While using latest master btrfs hangs after some seconds with: INFO: task kworker/u66:0:70 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u66:0 D ffff8808fcf0bad8 0 70 2 0x00080000 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] ffff8808fcf0bad8 ffff8808fce80000 ffff8808fcf00000 ffff8808fcf0bb08 ffff8808fcf0c000 ffff880e78e19e28 7fffffffffffffff ffff8808fcf00000 0000000000000001 ffff8808fcf0baf8 ffffffff936b1799 ffff88105bab5c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc026ff70>] btrfs_finish_ordered_io+0x150/0x630 [btrfs] [<ffffffffc0270465>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc02982f2>] btrfs_endio_write_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task kworker/u65:8:809 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u65:8 D ffff88085987fad8 0 809 2 0x00080000 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] ffff88085987fad8 ffff8808fce4ca00 ffff88085d4e8000 ffff88085987fb08 ffff880859880000 ffff8806e97241a8 7fffffffffffffff ffff88085d4e8000 0000000000000001 ffff88085987faf8 ffffffff936b1799 ffff880072bd3c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc026ff70>] btrfs_finish_ordered_io+0x150/0x630 [btrfs] [<ffffffffc0270465>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc02982f2>] btrfs_endio_write_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task kworker/u66:1:986 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u66:1 D ffff88105ae5fad8 0 986 2 0x00080000 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] ffff88105ae5fad8 ffff8808fce40000 ffff88105e2b2500 ffff88105ae5fb08 ffff88105ae60000 ffff8810423863e8 7fffffffffffffff ffff88105e2b2500 0000000000000001 ffff88105ae5faf8 ffffffff936b1799 ffff88105bab5c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc026ff70>] btrfs_finish_ordered_io+0x150/0x630 [btrfs] [<ffffffffc0270465>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc02982f2>] btrfs_endio_write_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task btrfs-transacti:10173 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. btrfs-transacti D ffff880853effd78 0 10173 2 0x00080000 ffff880853effd78 ffff8808fce40000 ffff88017ca74a00 0000000000000246 ffff880853f00000 ffff881051694098 ffff88105db001c8 0000000000000000 ffff88017c3cb800 ffff880853effd98 ffffffff936b1799 ffff881051694098 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffffc026415a>] btrfs_wait_pending_ordered+0x3a/0x70 [btrfs] [<ffffffff930c4f80>] ? add_wait_queue+0x60/0x60 [<ffffffffc0266861>] btrfs_commit_transaction+0x3b1/0xa50 [btrfs] [<ffffffff930e2fb0>] ? call_timer_fn+0x140/0x140 [<ffffffffc02640f6>] transaction_kthread+0x1e6/0x210 [btrfs] [<ffffffffc0263f10>] ? open_ctree+0x2600/0x2600 [btrfs] [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task kworker/u65:0:15560 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u65:0 D ffff880071dcbad8 0 15560 2 0x00080000 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] ffff880071dcbad8 ffff8808fce30000 ffff88085eb80000 ffff880071dcbb08 ffff880071dcc000 ffff88085bb9cf28 7fffffffffffffff ffff88085eb80000 0000000000000001 ffff880071dcbaf8 ffffffff936b1799 ffff880072bd3c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc026ff70>] btrfs_finish_ordered_io+0x150/0x630 [btrfs] [<ffffffffc0270465>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc02982f2>] btrfs_endio_write_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task kworker/u65:1:15567 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u65:1 D ffff8808577c7ad8 0 15567 2 0x00080000 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] ffff8808577c7ad8 ffffffff93c0f4c0 ffff88085fbfa500 ffff8808577c7b08 ffff8808577c8000 ffff8807a378d3a8 7fffffffffffffff ffff88085fbfa500 0000000000000001 ffff8808577c7af8 ffffffff936b1799 ffff880072bd3c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc026ff70>] btrfs_finish_ordered_io+0x150/0x630 [btrfs] [<ffffffffc0270465>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc02982f2>] btrfs_endio_write_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task kworker/u66:2:15568 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u66:2 D ffff881016307ad8 0 15568 2 0x00080000 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] ffff881016307ad8 ffff8808fce72500 ffff881015cba500 ffff881016307b08 ffff881016308000 ffff88101153a928 7fffffffffffffff ffff881015cba500 0000000000000001 ffff881016307af8 ffffffff936b1799 ffff88105bab5c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc026ff70>] btrfs_finish_ordered_io+0x150/0x630 [btrfs] [<ffffffffc0270465>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc02982f2>] btrfs_endio_write_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task kworker/u65:3:15581 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u65:3 D ffff88084535fc68 0 15581 2 0x00080000 Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs] ffff88084535fc68 ffff8808fce34a00 ffff8808475d0000 0000000000000246 ffff880845360000 ffff88017c3cb9f0 ffff88017c3cb9f0 0000000000000000 ffff88074e2a17c0 ffff88084535fc88 ffffffff936b1799 ffff88017c3cb9f0 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffffc0265bc5>] wait_current_trans.isra.37+0xa5/0x100 [btrfs] [<ffffffff930c4f80>] ? add_wait_queue+0x60/0x60 [<ffffffffc0267524>] start_transaction+0x244/0x480 [btrfs] [<ffffffffc0254762>] ? btrfs_run_delayed_refs+0x202/0x2c0 [btrfs] [<ffffffffc02677c7>] btrfs_attach_transaction+0x17/0x20 [btrfs] [<ffffffffc0254bd8>] delayed_ref_async_start+0x18/0xa0 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc0298392>] btrfs_extent_refs_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 INFO: task cp:15841 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. cp D ffff881005fd3918 0 15841 15605 0x00080000 ffff881005fd3918 ffff8808fce44a00 ffff880037398000 ffff881005fd3948 ffff881005fd4000 ffff88105eabe568 7fffffffffffffff ffff880037398000 0000000000000001 ffff881005fd3938 ffffffff936b1799 ffff88105bab5c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc029b126>] clone_finish_inode_update+0xa6/0x140 [btrfs] [<ffffffffc029c322>] btrfs_clone+0x902/0x11f0 [btrfs] [<ffffffffc029cefb>] btrfs_ioctl_clone+0x2eb/0x4b0 [btrfs] [<ffffffff931e9a44>] ? mntput+0x24/0x40 [<ffffffffc02a0ca4>] btrfs_ioctl+0x1464/0x2bc0 [btrfs] [<ffffffff93114c72>] ? from_kgid+0x12/0x20 [<ffffffff93114c8e>] ? from_kgid_munged+0xe/0x20 [<ffffffff931cf826>] ? cp_new_stat+0x146/0x160 [<ffffffff9311bf9c>] ? acct_account_cputime+0x1c/0x20 [<ffffffff930b28b3>] ? account_user_time+0x73/0x80 [<ffffffff931dc723>] do_vfs_ioctl+0x83/0x4e0 [<ffffffff9300261f>] ? enter_from_user_mode+0x1f/0x50 [<ffffffff93002711>] ? syscall_trace_enter_phase1+0xc1/0x110 [<ffffffff931dcbcc>] SyS_ioctl+0x4c/0x80 [<ffffffff936b536e>] entry_SYSCALL_64_fastpath+0x12/0x71 INFO: task kworker/u66:3:15845 blocked for more than 120 seconds. Tainted: G O 4.4.8+18-ph #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u66:3 D ffff881007b4bad8 0 15845 2 0x00080000 Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs] ffff881007b4bad8 ffff8808fce82500 ffff88105ed54a00 ffff881007b4bb08 ffff881007b4c000 ffff88005d62ce68 7fffffffffffffff ffff88105ed54a00 0000000000000001 ffff881007b4baf8 ffffffff936b1799 ffff88105bab5c00 Call Trace: [<ffffffff936b1799>] schedule+0x39/0x80 [<ffffffff936b4167>] schedule_timeout+0x237/0x2d0 [<ffffffff93097764>] ? wake_up_worker+0x24/0x30 [<ffffffff93098024>] ? insert_work+0x74/0xc0 [<ffffffff936b233f>] wait_for_completion+0x9f/0x110 [<ffffffff930ac140>] ? try_to_wake_up+0x3e0/0x3e0 [<ffffffffc024a680>] btrfs_async_run_delayed_refs+0x150/0x170 [btrfs] [<ffffffffc0267167>] __btrfs_end_transaction+0x217/0x350 [btrfs] [<ffffffffc02672d0>] btrfs_end_transaction+0x10/0x20 [btrfs] [<ffffffffc026ff70>] btrfs_finish_ordered_io+0x150/0x630 [btrfs] [<ffffffffc0270465>] finish_ordered_fn+0x15/0x20 [btrfs] [<ffffffffc0297f30>] normal_work_helper+0xc0/0x2f0 [btrfs] [<ffffffff930988cd>] ? pwq_activate_delayed_work+0x3d/0xa0 [<ffffffffc02982f2>] btrfs_endio_write_helper+0x12/0x20 [btrfs] [<ffffffff9309a900>] process_one_work+0x150/0x3e0 [<ffffffff9309acea>] worker_thread+0x11a/0x440 [<ffffffff936b1031>] ? __schedule+0x341/0x9c0 [<ffffffff9309abd0>] ? process_scheduled_works+0x40/0x40 [<ffffffff930a0d99>] kthread+0xc9/0xe0 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100 [<ffffffff936b56cf>] ret_from_fork+0x3f/0x70 [<ffffffff930a0cd0>] ? kthread_stop+0x100/0x100

after deleting all patches from 20160411 (rm btrfs-20160411-*) it works fine again.

hhoffstaette commented 8 years ago

This is likely caused by commit b84bcf32d ("track transid for delayed ref flushing") which apparently can cause deadlocks...which I've never encountered myself since adding it. It should now be fixed in c3e29537a2a which addresses exactly that problem. Thanks for testing. :-)

hhoffstaette commented 8 years ago

@disaster123 Just out of curiosity: have your free-space problems become better with recent releases?

disaster123 commented 8 years ago

thx for pointing this out. The free space problem has gone since integrating btrfs patches from 4.5 and switching to space_cache=v2.

Thanks for your work.

hhoffstaette commented 8 years ago

That's great to hear! I think everybody involved (esp. Josef and Omar) would be delighted to hear that on the list.

hhoffstaette commented 8 years ago

Did you get a chance to try the fix? if it's fixed I'd like to close this.

disaster123 commented 8 years ago

Thanks tested it right now. Yes works fine and fixes the deadlock.