koverstreet / bcachefs

Other
632 stars 69 forks source link

INFO: task bch-reclaim/c74:3184 blocked for more than 491 seconds. #701

Open combor opened 1 week ago

combor commented 1 week ago

Hi, I've got a very repeatable issue with bcachefs. It happens immediately after any write attempt to the filesystem. dmesg reports:

[  860.219325] INFO: task bch-reclaim/c74:3184 blocked for more than 491 seconds.
[  860.219330]       Not tainted 6.9.7-arch1-1 #1
[  860.219332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  860.219334] task:bch-reclaim/c74 state:D stack:0     pid:3184  tgid:3184  ppid:2      flags:0x00004000
[  860.219339] Call Trace:
[  860.219341]  <TASK>
[  860.219344]  __schedule+0x3c7/0x1510
[  860.219354]  schedule+0x27/0xf0
[  860.219357]  __closure_sync+0x7e/0x140
[  860.219363]  bch2_btree_update_start+0x9c0/0x9d0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219414]  ? __pfx_closure_sync_fn+0x10/0x10
[  860.219419]  ? bch2_btree_write_buffer_flush_locked+0x9ac/0xaa0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219458]  ? bch2_btree_split_leaf+0x56/0x1a0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219493]  bch2_btree_split_leaf+0x56/0x1a0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219525]  ? bch2_btree_write_buffer_flush_locked+0x9ac/0xaa0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219557]  bch2_trans_commit_error+0x1fb/0x4e0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219592]  ? srso_alias_return_thunk+0x5/0xfbef5
[  860.219595]  ? six_trylock_ip+0x1f/0x50 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219641]  ? srso_alias_return_thunk+0x5/0xfbef5
[  860.219646]  __bch2_trans_commit+0x15f4/0x1730 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219681]  ? srso_alias_return_thunk+0x5/0xfbef5
[  860.219684]  ? btree_write_buffered_insert+0xb8/0x100 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219722]  bch2_btree_write_buffer_flush_locked+0x9ac/0xaa0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219755]  ? srso_alias_return_thunk+0x5/0xfbef5
[  860.219758]  ? update_load_avg+0x7e/0x7b0
[  860.219765]  ? __pfx_bch2_btree_write_buffer_journal_flush+0x10/0x10 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219797]  btree_write_buffer_flush_seq+0x28c/0x2e0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219829]  ? srso_alias_return_thunk+0x5/0xfbef5
[  860.219832]  ? local_clock_noinstr+0xd/0xd0
[  860.219836]  ? __pfx_bch2_btree_write_buffer_journal_flush+0x10/0x10 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219867]  bch2_btree_write_buffer_journal_flush+0x35/0x60 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219899]  journal_flush_pins.constprop.0+0x192/0x2c0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219947]  __bch2_journal_reclaim+0x1d5/0x380 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.219986]  bch2_journal_reclaim_thread+0x6e/0x160 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.220021]  ? __pfx_bch2_journal_reclaim_thread+0x10/0x10 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]
[  860.220054]  kthread+0xd2/0x100
[  860.220058]  ? __pfx_kthread+0x10/0x10
[  860.220062]  ret_from_fork+0x34/0x50
[  860.220066]  ? __pfx_kthread+0x10/0x10
[  860.220069]  ret_from_fork_asm+0x1a/0x30
[  860.220076]  </TASK>

or

[  737.343590] INFO: task umount:3381 blocked for more than 368 seconds.                                                                                              
[  737.343592]       Not tainted 6.9.7-arch1-1 #1                                                                                                                     
[  737.343594] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.                                                                              
[  737.343595] task:umount          state:D stack:0     pid:3381  tgid:3381  ppid:1697   flags:0x00004002                                                             
[  737.343598] Call Trace:                                                                                                                                            
[  737.343600]  <TASK>                                                                                                                                                
[  737.343602]  __schedule+0x3c7/0x1510                                                                                                                               
[  737.343606]  ? schedule+0x27/0xf0                                                                                                                                  
[  737.343611]  schedule+0x27/0xf0                                                                                                                                    
[  737.343614]  schedule_timeout+0x12f/0x160                                                                                                                          
[  737.343618]  wait_for_completion+0x86/0x170                                                                                                                        
[  737.343623]  kthread_stop+0x6a/0x180                                                                                                                               
[  737.343627]  bch2_copygc_stop+0x1e/0x80 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]                                                                        
[  737.343659]  __bch2_fs_read_only+0x3b/0x210 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]                                                                    
[  737.343703]  bch2_fs_read_only+0x140/0x3f0 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]                                                                     
[  737.343739]  ? __pfx_autoremove_wake_function+0x10/0x10                                                                                                            
[  737.343744]  __bch2_fs_stop+0x5a/0x380 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]                                                                         
[  737.343778]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343782]  generic_shutdown_super+0x7a/0x170                                                                                                                     
[  737.343787]  bch2_kill_sb+0x16/0x20 [bcachefs aa0637810d467c8e6cf072acf6a70476543ba202]                                                                            
[  737.343836]  deactivate_locked_super+0x33/0xb0                                                                                                                     
[  737.343839]  cleanup_mnt+0xba/0x150                                                                                                                                
[  737.343843]  task_work_run+0x5c/0x90                                                                                                                               
[  737.343848]  syscall_exit_to_user_mode+0x1fe/0x210                                                                                                                 
[  737.343852]  do_syscall_64+0x8f/0x190                                                                                                                              
[  737.343859]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343862]  ? vfs_statx+0x93/0x1c0                                                                                                                                
[  737.343867]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343870]  ? vfs_fstatat+0x94/0xb0                                                                                                                               
[  737.343873]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343876]  ? __do_sys_newfstatat+0x3c/0x80                                                                                                                       
[  737.343882]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343885]  ? syscall_exit_to_user_mode+0x75/0x210                                                                                                                
[  737.343888]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343891]  ? do_syscall_64+0x8f/0x190                                                                                                                            
[  737.343896]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343899]  ? generic_permission+0x39/0x220                                                                                                                       
[  737.343903]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343905]  ? mntput_no_expire+0x4a/0x260                                                                                                                         
[  737.343909]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343912]  ? do_faccessat+0x1e4/0x2e0                                                                                                                            
[  737.343917]  ? srso_alias_return_thunk+0x5/0xfbef5                                                                                                                 
[  737.343919]  ? syscall_exit_to_user_mode+0x75/0x210
[  737.343921]  ? srso_alias_return_thunk+0x5/0xfbef5
[  737.343923]  ? do_syscall_64+0x8f/0x190
[  737.343925]  ? srso_alias_return_thunk+0x5/0xfbef5
[  737.343927]  ? do_user_addr_fault+0x34e/0x620
[  737.343930]  ? srso_alias_return_thunk+0x5/0xfbef5
[  737.343932]  ? srso_alias_return_thunk+0x5/0xfbef5
[  737.343934]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  737.343936] RIP: 0033:0x75f95869ee3b
[  737.343951] RSP: 002b:00007fff0d093518 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  737.343953] RAX: 0000000000000000 RBX: 00005f6d73c09658 RCX: 000075f95869ee3b
[  737.343955] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005f6d73c10180
[  737.343956] RBP: 00007fff0d0935f0 R08: 00005f6d73c08010 R09: 0000000000000007
[  737.343957] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  737.343958] R13: 00005f6d73c10180 R14: 00005f6d73c09960 R15: 00005f6d73c09550
[  737.343962]  </TASK>

steps to reproduce:

bcachefs format \
    --label=hdd.wd_red /dev/sda \
    --label=ssd.wd_blue /dev/sdc \
    --foreground_target=ssd \
    --promote_target=ssd \
    --background_target=hdd

mount -U c74d1dc2-b9ea-4d3c-a687-f1e81647ec22 /tmp/data/
cd /tmp/data/
dd if=/dev/urandom of=file bs=512 count=2
umount /tmp/data/

some debug info:

# uname -a 
Linux server 6.9.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 28 Jun 2024 04:32:50 +0000 x86_64 GNU/Linux
# bcachefs version 
1.9.1

# bcachefs show-super /dev/sdc                                                                                                                         
Device:                                     WD Blue SA510 2.                                                                                                          
External UUID:                             c74d1dc2-b9ea-4d3c-a687-f1e81647ec22                                                                                       
Internal UUID:                             c70d838e-8ca2-419b-9916-e4a61a57f969                                                                                       
Magic number:                              c68573f6-66ce-90a9-d96a-60cf803df7ef                                                                                       
Device index:                              1                                                                                                                          
Label:                                                                                                                                                                
Version:                                   1.7: mi_btree_bitmap                                                                                                       
Version upgrade complete:                  1.7: mi_btree_bitmap                                                                                                       
Oldest version on disk:                    1.7: mi_btree_bitmap                                                                                                       
Created:                                   Sun Jun 30 13:54:09 2024                                                                                                   
Sequence number:                           24                                                                                                                         
Time of last write:                        Sun Jun 30 21:56:41 2024                                                                                                   
Superblock size:                           4.72 KiB/1.00 MiB                                                                                                          
Clean:                                     0                                                                                                                          
Devices:                                   2                                                                                                                          
Sections:                                  members_v1,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade         
Features:                                  journal_seq_blacklist_v3,new_siphash,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled
,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes                                                                                                      
Compat features:                           alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done   

Options:                                                                                                                                                              
  block_size:                              4.00 KiB                                                                                                                   
  btree_node_size:                         256 KiB                                                                                                                    
  errors:                                  continue [fix_safe] panic ro                                                                                               
  metadata_replicas:                       1                                                                                                                          
  data_replicas:                           1                                                                                                                          
  metadata_replicas_required:              1                                                                                                                          
  data_replicas_required:                  1                                                                                                                          
  encoded_extent_max:                      64.0 KiB                                                                                                                   
  metadata_checksum:                       none [crc32c] crc64 xxhash                                                                                                 
  data_checksum:                           none [crc32c] crc64 xxhash                                                                                                 
  compression:                             none                                                                                                                       
  background_compression:                  none                                                                                                                       
  str_hash:                                crc32c crc64 [siphash]                                                                                                     
  metadata_target:                         none                                                                                                                       
  foreground_target:                       ssd                                                                                                                        
  background_target:                       hdd                                                                                                                        
  promote_target:                          ssd
  erasure_code:                            0
  inodes_32bit:                            1
  shard_inode_numbers:                     1
  inodes_use_key_cache:                    1
  gc_reserve_percent:                      8
  gc_reserve_bytes:                        0 B
  root_reserve_percent:                    0
  wide_macs:                               0
  acl:                                     1
  usrquota:                                0
  grpquota:                                0
  prjquota:                                0
  journal_flush_delay:                     1000
  journal_flush_disabled:                  0
  journal_reclaim_delay:                   100
  journal_transaction_names:               1
  version_upgrade:                         [compatible] incompatible none 
  nocow:                                   0

members_v2 (size 304):
Device:                                    0
  Label:                                   wd_red (1)
  UUID:                                    5ff29c65-db36-4b54-953b-8fae5779e44d
  Size:                                    5.46 TiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             512 KiB
  First bucket:                            0
  Buckets:                                 11446333
  Last mount:                              Sun Jun 30 21:56:41 2024
  Last superblock write:                   24
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                (none)
  Btree allocated bitmap blocksize:        1.00 B
  Btree allocated bitmap:                  0000000000000000000000000000000000000000000000000000000000000000
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1
Device:                                    1
  Label:                                   wd_blue (3)
  UUID:                                    021e7cc7-3c52-4aad-9c84-00af0a31e3d8
  Size:                                    932 GiB
  read errors:                             0
  write errors:                            0
  checksum errors:                         0
  seqread iops:                            0
  seqwrite iops:                           0
  randread iops:                           0
  randwrite iops:                          0
  Bucket size:                             512 KiB
  First bucket:                            0
  Buckets:                                 1907739
  Last mount:                              Sun Jun 30 21:56:41 2024
  Last superblock write:                   24
  State:                                   rw
  Data allowed:                            journal,btree,user
  Has data:                                journal,btree
  Btree allocated bitmap blocksize:        256 KiB
  Btree allocated bitmap:                  0000000000000000000000000000000100000000000000000000000000000000
  Durability:                              1
  Discard:                                 0
  Freespace initialized:                   1

Any other logs I could provide to help debug? Thanks

ssfdust commented 1 week ago

The same problem occurs here, but after I upgraded kernel to the latest mainline version, the issue seems to be resolved.

combor commented 1 week ago

I'm on latest stable 6.9.7-arch1-1 and it didnt go unfortunatelly.

SorrelArticulata commented 1 week ago

I am seeing the same messages every few minutes this morning. I did updates yesterday, but I didn't upgrade my bootloader config so I'm still on an older kernel (6.8.12).

EDIT: Shutdown has hung at "unmounting filesystems", something is quite unhappy.

EDIT2: Kernel 6.9.7 is just as broken. Random stuff is just hanging, like using 'su' or starting certain applications. Not a good day :| Hung at unmounting filesystems again. I'll try a memtest next, then an older older kernel.

SorrelArticulata commented 1 week ago

Problem still manifests in an older kernel (6.7.12) and I passed a memtest, so I think the filesystem itself or my NVME might be hosed.

My /home filesystem is bcachefs whilst my root filesystem is ext4. That makes me wonder how 'su' failed earlier, perhaps some shared resource?

The most important things I did yesterday were run backups (yay) and then update my distro (Voidlinux). My timing might have been good.

dmesg_short.log

SorrelArticulata commented 1 week ago

bcachefs fsck /dev/nvme0n1p2: Passes fine. bcachefs fsck -k /dev/nvme0n1p2: Lots of errors that it wants me to fix, see bcachefs fsck kernel.log

Could this be an issue caused by the kernel and bcachefs-tools being mismatched? Or the latter having been upgraded? I'll hop on IRC and ask for help before blindly saying yes to anything.

@ssfdust @combor Could you please report your kernel version (uname -a), tools version (run "bcachefs version") and if you are seeing similar results to me from "bcachefs fsck -k /dev/yourdisk" ?

bcachefs-tools verison 1.9.1 kernel 6.9.7_1

ssfdust commented 1 week ago

@SorrelArticulata Kernel Version: 6.10.0-rc6-1-mainline Bcachefs Tool Version: 1.9.1 In-kernel Filesystem Check Result: Error occurs, the fsck complains unreachable inode.

Although I have reservations about the safety of running fsck on a mounted bcachefs device, the command appears to execute successfully.

SorrelArticulata commented 1 week ago

Problem seems to be solved now that I've pulled and compiled the latest kernel sources from this repo (d6d793ae55897afe5871b8bf7eb3599ba25890fa). All my programs work, dmesg doesn't report anything weird, "bcachefs fsck -k" no longer claims that my filesystem is damaged and it looks like my kernel now supports version 1.9 of the disk format.

It looks like the issue might be related to having mismatched versions of bcachefs-tools and kernel (and bcachefs-tools upgrading the filesystem from v1.7 format to v1.9 format). I am glad that I didn't tell "bcachefs fsck -k" to make any fixes.

Although I have reservations about the safety of running fsck on a mounted bcachefs device, the command appears to execute successfully.

@ssfdust IIRC it used to throw that same error when I ran it on a mounted volume, even mounted ro. I think it's a limitation of the tool.