naota / linux


Lockdep splat, possible deadlock sb_internal#2 --> &fs_info->reloc_mutex --> &fs_info->zoned_meta_io_lock #66

Open morbidrsa opened 3 months ago

morbidrsa commented 3 months ago
  [ 9654.979258] run fstests generic/696 at 2024-05-23 22:45:00
  [ 9655.620971] BTRFS: device fsid b35d3f2a-a51f-48a1-ae62-e2ab6084cab5 devid 1 transid 6 /dev/nvme1n2 (259:2) scanned by mount (1701699)
  [ 9655.625188] BTRFS info (device nvme1n2): first mount of filesystem b35d3f2a-a51f-48a1-ae62-e2ab6084cab5
  [ 9655.625971] BTRFS info (device nvme1n2): using crc32c (crc32c-intel) checksum algorithm
  [ 9655.626687] BTRFS info (device nvme1n2): using free-space-tree
  [ 9655.640184] BTRFS info (device nvme1n2): host-managed zoned block device /dev/nvme1n2, 904 zones of 2147483648 bytes
  [ 9655.641145] BTRFS info (device nvme1n2): zoned: async discard ignored and disabled for zoned mode
  [ 9655.641879] BTRFS info (device nvme1n2): zoned mode enabled with zone size 2147483648
  [ 9655.645112] BTRFS info (device nvme1n2): checking UUID tree

  [ 9655.737988] ======================================================
  [ 9655.738471] WARNING: possible circular locking dependency detected
  [ 9655.738958] 6.9.0-rc7+ #1 Not tainted
  [ 9655.739250] ------------------------------------------------------
  [ 9655.739729] kworker/u10:6/1684755 is trying to acquire lock:
  [ 9655.740174] ffff8d720d320610 (sb_internal#2){++++}-{0:0}, at: btrfs_inc_block_group_ro+0x59/0x240
  [ 9655.740880] 
                 but task is already holding lock:
  [ 9655.741333] ffff8d726cb92218 (&fs_info->zoned_meta_io_lock){+.+.}-{3:3}, at: btree_write_cache_pages+0xc4/0x6e0
  [ 9655.742121] 
                 which lock already depends on the new lock.

  [ 9655.742740]
                 the existing dependency chain (in reverse order) is:
  [ 9655.743312] 
                 -> #3 (&fs_info->zoned_meta_io_lock){+.+.}-{3:3}:
  [ 9655.743862]        __mutex_lock+0xbe/0xc00
  [ 9655.744198]        btree_write_cache_pages+0xc4/0x6e0
  [ 9655.744586]        do_writepages+0x79/0x270
  [ 9655.744930]        filemap_fdatawrite_wbc+0x63/0x90
  [ 9655.745321]        __filemap_fdatawrite_range+0x60/0x90
  [ 9655.745727]        btrfs_write_marked_extents+0xab/0x150
  [ 9655.746147]        btrfs_write_and_wait_transaction+0x56/0xe0
  [ 9655.746587]        create_pending_snapshot+0x11fa/0x1260
  [ 9655.747004]        create_pending_snapshots+0xaa/0xd0
  [ 9655.747394]        btrfs_commit_transaction+0x795/0x13a0
  [ 9655.747811]        btrfs_mksubvol+0x380/0x610
  [ 9655.748159]        btrfs_mksnapshot+0x79/0xb0
  [ 9655.748504]        __btrfs_ioctl_snap_create+0x1c2/0x1d0
  [ 9655.748925]        btrfs_ioctl_snap_create_v2+0x108/0x130
  [ 9655.749351]        btrfs_ioctl+0x1718/0x26f0
  [ 9655.749689]        __x64_sys_ioctl+0x97/0xd0
  [ 9655.750041]        do_syscall_64+0x95/0x180
  [ 9655.750376]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
  [ 9655.750817]
                 -> #2 (&fs_info->reloc_mutex){+.+.}-{3:3}:
  [ 9655.751323]        __mutex_lock+0xbe/0xc00
  [ 9655.751641]        btrfs_commit_transaction+0x78d/0x13a0
  [ 9655.752033]        sync_filesystem+0x7e/0xa0
  [ 9655.752323]        generic_shutdown_super+0x26/0x110
  [ 9655.752650]        kill_anon_super+0x16/0x40
  [ 9655.752938]        btrfs_kill_super+0x16/0x20
  [ 9655.753231]        deactivate_locked_super+0x33/0xa0
  [ 9655.753556]        cleanup_mnt+0xba/0x150
  [ 9655.753826]        task_work_run+0x5c/0xa0
  [ 9655.754105]        syscall_exit_to_user_mode+0x292/0x2a0
  [ 9655.754454]        do_syscall_64+0xa2/0x180
  [ 9655.754730]        entry_SYSCALL_64_after_hwframe+0x76/0x7e
  [ 9655.755103]
                 -> #1 (btrfs_trans_unblocked){++++}-{0:0}:
  [ 9655.755536]        wait_current_trans+0xc5/0x1b0
  [ 9655.755848]        start_transaction+0x40a/0xa70
  [ 9655.756153]        btrfs_finish_one_ordered+0x3b7/0xad0
  [ 9655.756506]        btrfs_work_helper+0x10a/0x4c0
  [ 9655.756816]        process_one_work+0x228/0x740
  [ 9655.757121]        worker_thread+0x1dc/0x3c0
  [ 9655.757406]        kthread+0xe3/0x110
  [ 9655.757654]        ret_from_fork+0x34/0x50
  [ 9655.757937]        ret_from_fork_asm+0x1a/0x30
  [ 9655.758235] 
                 -> #0 (sb_internal#2){++++}-{0:0}:
  [ 9655.758629]        __lock_acquire+0x13e7/0x2180
  [ 9655.758941]        lock_acquire+0xcb/0x2e0
  [ 9655.759213]        start_transaction+0x4bb/0xa70
  [ 9655.759520]        btrfs_inc_block_group_ro+0x59/0x240
  [ 9655.759869]        do_zone_finish+0x90/0x3f0
  [ 9655.760158]        btrfs_zone_finish_one_bg+0x113/0x130
  [ 9655.760502]        btrfs_check_meta_write_pointer+0x22a/0x2c0
  [ 9655.760883]        btree_write_cache_pages+0x1e6/0x6e0
  [ 9655.761222]        do_writepages+0x79/0x270
  [ 9655.761505]        __writeback_single_inode+0x5b/0x4d0
  [ 9655.761850]        writeback_sb_inodes+0x1fc/0x540
  [ 9655.762167]        wb_writeback+0xcb/0x3b0
  [ 9655.762440]        wb_workfn+0xda/0x530
  [ 9655.762703]        process_one_work+0x228/0x740
  [ 9655.763009]        worker_thread+0x1dc/0x3c0
  [ 9655.763299]        kthread+0xe3/0x110
  [ 9655.763547]        ret_from_fork+0x34/0x50
  [ 9655.763824]        ret_from_fork_asm+0x1a/0x30
  [ 9655.764121]
                 other info that might help us debug this:

  [ 9655.764647] Chain exists of:
                   sb_internal#2 --> &fs_info->reloc_mutex --> &fs_info->zoned_meta_io_lock

  [ 9655.765453]  Possible unsafe locking scenario:

  [ 9655.765846]        CPU0                    CPU1
  [ 9655.766147]        ----                    ----
  [ 9655.766453]   lock(&fs_info->zoned_meta_io_lock);
  [ 9655.766770]                                lock(&fs_info->reloc_mutex);
  [ 9655.767210]                                lock(&fs_info->zoned_meta_io_lock);
  [ 9655.767684]   rlock(sb_internal#2);
  [ 9655.767928] 
                  *** DEADLOCK ***

  [ 9655.768321] 3 locks held by kworker/u10:6/1684755:
  [ 9655.768638]  #0: ffff8d7200cb0948 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x43a/0x740
  [ 9655.769274]  #1: ffffb9228729fe60 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x1e2/0x740
  [ 9655.769980]  #2: ffff8d726cb92218 (&fs_info->zoned_meta_io_lock){+.+.}-{3:3}, at: btree_write_cache_pages+0xc4/0x6e0
  [ 9655.770671] 
                 stack backtrace:
  [ 9655.770981] CPU: 1 PID: 1684755 Comm: kworker/u10:6 Kdump: loaded Not tainted 6.9.0-rc7+ #1
  [ 9655.771533] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20231122-3.fc39 11/22/2023
  [ 9655.772131] Workqueue: writeback wb_workfn (flush-btrfs-1953)
  [ 9655.772515] Call Trace:
  [ 9655.772686]  <TASK>
  [ 9655.772842]  dump_stack_lvl+0x77/0xb0
  [ 9655.773094]  check_noncircular+0x148/0x160
  [ 9655.773378]  __lock_acquire+0x13e7/0x2180
  [ 9655.773650]  ? fs_reclaim_acquire+0x4c/0xf0
  [ 9655.773936]  ? srso_return_thunk+0x5/0x5f
  [ 9655.774217]  lock_acquire+0xcb/0x2e0
  [ 9655.774460]  ? btrfs_inc_block_group_ro+0x59/0x240
  [ 9655.774788]  ? lock_is_held_type+0x9a/0x110
  [ 9655.775073]  start_transaction+0x4bb/0xa70
  [ 9655.775349]  ? btrfs_inc_block_group_ro+0x59/0x240
  [ 9655.775671]  btrfs_inc_block_group_ro+0x59/0x240
  [ 9655.775984]  do_zone_finish+0x90/0x3f0
  [ 9655.776248]  btrfs_zone_finish_one_bg+0x113/0x130
  [ 9655.776564]  btrfs_check_meta_write_pointer+0x22a/0x2c0
  [ 9655.776915]  btree_write_cache_pages+0x1e6/0x6e0
  [ 9655.777243]  do_writepages+0x79/0x270
  [ 9655.777491]  ? srso_return_thunk+0x5/0x5f
  [ 9655.777762]  ? lock_acquire+0xcb/0x2e0
  [ 9655.778015]  ? srso_return_thunk+0x5/0x5f
  [ 9655.778284]  ? find_held_lock+0x2b/0x80
  [ 9655.778546]  __writeback_single_inode+0x5b/0x4d0
  [ 9655.778858]  ? do_raw_spin_unlock+0x4d/0xb0
  [ 9655.779138]  ? srso_return_thunk+0x5/0x5f
  [ 9655.779418]  writeback_sb_inodes+0x1fc/0x540
  [ 9655.779727]  wb_writeback+0xcb/0x3b0
  [ 9655.779977]  wb_workfn+0xda/0x530
  [ 9655.780205]  ? srso_return_thunk+0x5/0x5f
  [ 9655.780473]  ? lock_release+0xca/0x2a0
  [ 9655.780726]  ? lock_is_held_type+0x9a/0x110
  [ 9655.781012]  process_one_work+0x228/0x740
  [ 9655.781283]  ? srso_return_thunk+0x5/0x5f
  [ 9655.781556]  worker_thread+0x1dc/0x3c0
  [ 9655.781814]  ? __pfx_worker_thread+0x10/0x10
  [ 9655.782102]  kthread+0xe3/0x110
  [ 9655.782329]  ? __pfx_kthread+0x10/0x10
  [ 9655.782615]  ret_from_fork+0x34/0x50
  [ 9655.782877]  ? __pfx_kthread+0x10/0x10
  [ 9655.783132]  ret_from_fork_asm+0x1a/0x30
  [ 9655.783403]  </TASK>
  [ 9655.787793] BTRFS info (device nvme0n2): last unmount of filesystem 16416b7b-8293-44d6-821e-da28d304e2fd
  [ 9655.839914] BTRFS: device fsid 16416b7b-8293-44d6-821e-da28d304e2fd devid 1 transid 41462 /dev/nvme0n2 (259:3) scanned by mount (1701744)
  [ 9655.862189] BTRFS info (device nvme0n2): first mount of filesystem 16416b7b-8293-44d6-821e-da28d304e2fd
  [ 9655.862860] BTRFS info (device nvme0n2): using crc32c (crc32c-intel) checksum algorithm
  [ 9655.863441] BTRFS info (device nvme0n2): using free-space-tree
  [ 9655.868835] BTRFS info (device nvme0n2): host-managed zoned block device /dev/nvme0n2, 904 zones of 2147483648 bytes
  [ 9655.869607] BTRFS info (device nvme0n2): zoned: async discard ignored and disabled for zoned mode
  [ 9655.870236] BTRFS info (device nvme0n2): zoned mode enabled with zone size 2147483648
  [ 9655.922465] BTRFS info (device nvme1n2): last unmount of filesystem b35d3f2a-a51f-48a1-ae62-e2ab6084cab5
  [ 9655.944342] BTRFS: device fsid b35d3f2a-a51f-48a1-ae62-e2ab6084cab5 devid 1 transid 7 /dev/nvme1n2 (259:2) scanned by mount (1701774)
  [ 9655.947155] BTRFS info (device nvme1n2): first mount of filesystem b35d3f2a-a51f-48a1-ae62-e2ab6084cab5
  [ 9655.947810] BTRFS info (device nvme1n2): using crc32c (crc32c-intel) checksum algorithm
  [ 9655.948357] BTRFS info (device nvme1n2): using free-space-tree
  [ 9655.953701] BTRFS info (device nvme1n2): host-managed zoned block device /dev/nvme1n2, 904 zones of 2147483648 bytes
  [ 9655.954454] BTRFS info (device nvme1n2): zoned: async discard ignored and disabled for zoned mode
  [ 9655.955053] BTRFS info (device nvme1n2): zoned mode enabled with zone size 2147483648
  [ 9655.963649] BTRFS info (device nvme1n2): last unmount of filesystem b35d3f2a-a51f-48a1-ae62-e2ab6084cab5
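
For readers who want to follow the chain in the splat, here is a minimal sketch of the cycle it describes. It is not how lockdep is implemented; it only models the report as a directed graph of "lock A was held while lock B was acquired" edges and checks whether the new acquisition closes a loop. The edge list is read off entries #1 to #3 of the dependency chain (the "Chain exists of" summary omits `btrfs_trans_unblocked`, which appears as entry #1), and which lock was held at each step is inferred from the chain order, not shown directly in the traces. The names `EXISTING_EDGES`, `NEW_EDGE`, `build_graph`, `find_path`, and `check_new_edge` are hypothetical and exist only for this illustration.

```python
# Illustrative model of the lockdep report; not kernel code.

# Existing "held -> acquired" dependencies, inferred from entries
# #1..#3 of "the existing dependency chain (in reverse order)".
EXISTING_EDGES = [
    ("sb_internal#2", "btrfs_trans_unblocked"),
    ("btrfs_trans_unblocked", "&fs_info->reloc_mutex"),
    ("&fs_info->reloc_mutex", "&fs_info->zoned_meta_io_lock"),
]

# Entry #0: the writeback kworker holds zoned_meta_io_lock and calls
# start_transaction(), which tries to take sb_internal#2.
NEW_EDGE = ("&fs_info->zoned_meta_io_lock", "sb_internal#2")


def build_graph(edges):
    """Build an adjacency list from (held, acquired) pairs."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)
        graph.setdefault(acquired, [])
    return graph


def find_path(graph, start, goal):
    """Iterative depth-first search; return a path start..goal or None."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def check_new_edge(edges, new_edge):
    """Return the cycle that new_edge would close, or None if it is safe."""
    held, acquired = new_edge
    # A cycle exists if the lock being acquired already leads back
    # to the lock that is currently held.
    path = find_path(build_graph(edges), acquired, held)
    if path is None:
        return None
    return [held] + path


cycle = check_new_edge(EXISTING_EDGES, NEW_EDGE)
print(" --> ".join(cycle) if cycle else "no cycle")
```

Under these assumptions the new edge closes the loop `zoned_meta_io_lock --> sb_internal#2 --> btrfs_trans_unblocked --> reloc_mutex --> zoned_meta_io_lock`, which matches the scenario in the report: the transaction start in `btrfs_inc_block_group_ro()` happens while `zoned_meta_io_lock` is held in `btree_write_cache_pages()`.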