inodentry opened this issue 5 months ago (status: Open)
I am now experiencing the exact same issue after a power outage.
Can you grab /sys/fs/bcachefs/
Same issue here, same backtrace, including fsck failing with error journal_reclaim_would_deadlock.
~ % cat /sys/fs/bcachefs/8e4c4cd5-bfeb-41ff-9560-42c68f6461de/dev-0/alloc_debug
buckets sectors fragmented
free 1029752 0 0
sb 7 6152 1016
journal 8192 8388608 0
btree 15243 15603200 5632
user 33279446 34078288973 0
cached 0 0 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 0 0 0
ec 0
reserves:
stripe 1072922
normal 536475
copygc 28
btree 14
btree_copygc 0
reclaim 0
freelist_wait empty
open buckets allocated 1024
open buckets this dev 27
open buckets total 1024
open_buckets_wait waiting
open_buckets_btree 1023
open_buckets_user 0
buckets_to_invalidate 0
btree reserve cache 0
% sudo bcachefs show-super /dev/dm-8
External UUID: 8e4c4cd5-bfeb-41ff-9560-42c68f6461de
Internal UUID: 15a62530-11c4-4ab8-900d-f257889e4a43
Magic number: c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index: 6
Label:
Version: 1.6: btree_subvolume_children
Version upgrade complete: 1.3: rebalance_work
Oldest version on disk: 1.3: rebalance_work
Created: Sun Jan 28 17:59:06 2024
Sequence number: 288
Time of last write: Sat Mar 2 21:30:55 2024
Superblock size: 9.73 KiB/1.00 MiB
Clean: 0
Devices: 7
Sections: members_v1,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features: zstd,journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features: alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done
Options:
block_size: 4.00 KiB
btree_node_size: 256 KiB
errors: continue [ro] panic
metadata_replicas: 3
data_replicas: 3
metadata_replicas_required: 2
data_replicas_required: 2
encoded_extent_max: 64.0 KiB
metadata_checksum: none [crc32c] crc64 xxhash
data_checksum: none [crc32c] crc64 xxhash
compression: zstd:1
background_compression: zstd:15
str_hash: crc32c crc64 [siphash]
metadata_target: none
foreground_target: ssd
background_target: hdd
promote_target: ssd
erasure_code: 0
inodes_32bit: 1
shard_inode_numbers: 1
inodes_use_key_cache: 1
gc_reserve_percent: 8
gc_reserve_bytes: 0 B
root_reserve_percent: 0
wide_macs: 0
acl: 1
usrquota: 0
grpquota: 0
prjquota: 0
journal_flush_delay: 1000
journal_flush_disabled: 0
journal_reclaim_delay: 100
journal_transaction_names: 1
version_upgrade: [compatible] incompatible none
nocow: 0
members_v2 (size 912):
Device: 0
Label: hdd0 (1)
UUID: e969b88a-d108-495f-8b09-72d0c47c23b7
Size: 16.4 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 512 KiB
First bucket: 0
Buckets: 34332640
Last mount: Sat Mar 2 21:35:26 2024
Last superblock write: 285
State: rw
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Durability: 1
Discard: 0
Freespace initialized: 1
Device: 1
Label: ssd0 (3)
UUID: 1018e3e7-971f-483e-9b51-add348accb77
Size: 3.64 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 512 KiB
First bucket: 0
Buckets: 7630863
Last mount: Sat Mar 2 21:35:26 2024
Last superblock write: 285
State: rw
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Durability: 1
Discard: 1
Freespace initialized: 1
Device: 2
Label: ssd1 (4)
UUID: 24806e9d-3848-4a13-aedc-1f2f39f3f7d6
Size: 3.64 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 512 KiB
First bucket: 0
Buckets: 7630863
Last mount: Sat Mar 2 21:35:26 2024
Last superblock write: 285
State: rw
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Durability: 1
Discard: 1
Freespace initialized: 1
Device: 3
Label: hdd1 (5)
UUID: 9dcccfd4-c8f5-4d0c-9025-790b577c1fed
Size: 12.7 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 1.00 MiB
First bucket: 0
Buckets: 13351920
Last mount: Sat Mar 2 21:35:26 2024
Last superblock write: 285
State: rw
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Durability: 1
Discard: 0
Freespace initialized: 1
Device: 4
Label: hdd2 (6)
UUID: d8dd0ea2-196b-4266-8813-69ccc1abd52a
Size: 12.7 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 1.00 MiB
First bucket: 0
Buckets: 13351920
Last mount: Sat Mar 2 21:35:26 2024
Last superblock write: 285
State: rw
Data allowed: journal,btree,user
Has data: journal,user,cached
Durability: 1
Discard: 0
Freespace initialized: 1
Device: 5
Label: hdd3 (8)
UUID: fff75b25-9672-4635-b77d-fceaeaa4d2c2
Size: 12.7 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 1.00 MiB
First bucket: 0
Buckets: 13351920
Last mount: Sat Mar 2 21:35:26 2024
Last superblock write: 285
State: rw
Data allowed: journal,btree,user
Has data: journal,user,cached
Durability: 1
Discard: 0
Freespace initialized: 1
Device: 6
Label: hdd4 (7)
UUID: c4de381b-6805-49a9-bca2-0ce33e5aecc4
Size: 12.7 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 1.00 MiB
First bucket: 0
Buckets: 13351920
Last mount: Sat Mar 2 21:35:26 2024
Last superblock write: 285
State: rw
Data allowed: journal,btree,user
Has data: journal,user,cached
Durability: 1
Discard: 0
Freespace initialized: 1
errors (size 8):
I'd really like to help debug this issue, but the system this happens on is used for storing backups. I can leave it offline for a few days, but not for week(s). @koverstreet Do you think you'll have the time to dig into this in the next few days? Otherwise I'd leave the issue for later and format the disks :) Thank you!
While I don't have any important data on my system either (everything is stuff I could recreate by reinstalling the OS and replicating files from other machines), I'd really like to avoid that if possible, because it would be a hassle. Is there some way to get the filesystem working again, perhaps with some data loss (such as clearing/dropping the latest activity from the journal)?
I can +1 this issue. Identical symptoms. Upstream kernel 6.7.6, encryption and erasure coding with 2 replicas, LZ4 compression. Interestingly, it only happened after a clean shutdown, yet the few unclean shutdowns the system has had didn't cause anything a fsck couldn't fix.
Same issues after an unclean shutdown (my computer got powered off accidentally). I've been using bcachefs for a week.
bcachefs-tools 1.6.3
I use LZ4, 2 replicas, 2 NVMe drives, and 2 HDDs. My mkfs command is copied here:
sudo bcachefs format --compression=lz4 --replicas=2 --label=ssd.ssd1 /dev/nvme1n1 --label=ssd.ssd2 /dev/nvme2n1 --label=hdd.hdd1 /dev/sda --label=hdd.hdd2 /dev/sdb --foreground_target=ssd --promote_target=ssd --background_target=hdd
I've actually managed to make progress by using kernel fsck as opposed to userspace fsck.
bcachefs mount -o fsck,fix_errors,verbose UUID=2353ad4f-f54a-4a6d-b838-596270f9eebc /mnt
It took a while, following along with the progress in dmesg in another tty, but in the end it actually mounted and just... works fine.
root@archiso ~ # cat /sys/fs/bcachefs/0cda7e0c-e826-4f4b-a499-d93bb20ed966/dev-0/alloc_debug
buckets sectors fragmented
free 700035 0 0
sb 13 6152 504
journal 8192 4194304 0
btree 14430 7388160 0
user 64296 31376200 1543352
cached 1044478 535068584 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 0 0 0
ec 0
reserves:
stripe 57288
normal 28672
copygc 56
btree 28
btree_copygc 0
reclaim 0
freelist_wait empty
open buckets allocated 1024
open buckets this dev 511
open buckets total 1024
open_buckets_wait waiting
open_buckets_btree 1023
open_buckets_user 0
buckets_to_invalidate 0
btree reserve cache 0
root@archiso ~ # cat /sys/fs/bcachefs/0cda7e0c-e826-4f4b-a499-d93bb20ed966/dev-1/alloc_debug
buckets sectors fragmented
free 699984 0 0
sb 13 6152 504
journal 8192 4194304 0
btree 14430 7388160 0
user 64231 31376200 1510072
cached 1044594 535067128 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 0 0 0
ec 0
reserves:
stripe 57288
normal 28672
copygc 56
btree 28
btree_copygc 0
reclaim 0
freelist_wait empty
open buckets allocated 1024
open buckets this dev 512
open buckets total 1024
open_buckets_wait waiting
open_buckets_btree 1023
open_buckets_user 0
buckets_to_invalidate 0
btree reserve cache 0
root@archiso ~ # cat /sys/fs/bcachefs/0cda7e0c-e826-4f4b-a499-d93bb20ed966/dev-2/alloc_debug
buckets sectors fragmented
free 2761683 0 0
sb 13 6152 504
journal 8192 4194304 0
btree 0 0 0
user 1045588 535068904 272152
cached 2 336 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 0 0 0
ec 0
reserves:
stripe 119288
normal 59672
copygc 56
btree 28
btree_copygc 0
reclaim 0
freelist_wait empty
open buckets allocated 1024
open buckets this dev 0
open buckets total 1024
open_buckets_wait waiting
open_buckets_btree 1023
open_buckets_user 0
buckets_to_invalidate 0
btree reserve cache 0
root@archiso ~ # cat /sys/fs/bcachefs/0cda7e0c-e826-4f4b-a499-d93bb20ed966/dev-3/alloc_debug
buckets sectors fragmented
free 2761659 0 0
sb 13 6152 504
journal 8192 4194304 0
btree 0 0 0
user 1045612 535066808 286536
cached 2 336 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 0 0 0
ec 0
reserves:
stripe 119288
normal 59672
copygc 56
btree 28
btree_copygc 0
reclaim 0
freelist_wait empty
open buckets allocated 1024
open buckets this dev 0
open buckets total 1024
open_buckets_wait waiting
open_buckets_btree 1023
open_buckets_user 0
buckets_to_invalidate 0
btree reserve cache 0
Oh, and now I also see a suspicious-looking message in dmesg that I didn't see before:
[ 487.089684] bcachefs (0cda7e0c-e826-4f4b-a499-d93bb20ed966): Version downgrade required:
[ 487.089693] bcachefs (0cda7e0c-e826-4f4b-a499-d93bb20ed966): Version upgrade from 1.3: rebalance_work to 1.4: (unknown version) incomplete
Doing compatible version upgrade from 1.3: rebalance_work to 1.4: (unknown version)
Sounds like there was some sort of upgrade that was interrupted/unfinished? Interesting, because I have never tried to mount this FS on a kernel newer than 6.7.6. Could the userspace fsck have done this?
I tried to run kernel fsck, but it did not help. The kernel task hangs just like in OP.
Ok, I tried to boot into a Fedora Rawhide ISO with kernel 6.8-rc7 to see if it would help. Looks like now I can't even try to do anything. Both dmesg and bcachefsck give this error:
bch2_dev_in_fs() Split brain detected between /dev/sdd1 and /dev/sda1:
/dev/sda1 believes seq of /dev/sdd1 to be 49, but /dev/sdd1 has 58
Not using /dev/sdd1
bch2_dev_in_fs() Split brain detected between /dev/sdc and /dev/sda1:
/dev/sda1 believes seq of /dev/sdc to be 49, but /dev/sdc has 58
Not using /dev/sdc
bch2_dev_in_fs() Split brain detected between /dev/sdb and /dev/sda1:
/dev/sda1 believes seq of /dev/sdb to be 49, but /dev/sdb has 58
Not using /dev/sdb
bch2_fs_open() bch_fs_open err opening /dev/sda1: insufficient_devices_to_start
I guess my previous attempt to run kernel fsck messed up the FS. I guess this FS is actually gone now ... oh well.
So much for "the filesystem that won't eat your data" ;)
I actually had this issue myself when trying a userspace fsck before I got it mounting. I'm not sure what caused it, but in my case the filesystem was still intact, and having checked everything since then I've suffered zero data loss or corruption. I'm not sure what happened or why kernel fsck fixed it for me in the end, but I wouldn't throw all hope away yet. These are still early days for bcachefs and the tools are still immature, but the on-disk format in my experience is rock solid. It just may take some effort to exploit that fully.
Yeah, I was half-joking. :) I know it's early days and the FS is very experimental and buggy. Don't get me wrong, I love bcachefs, I love its feature set and flexibility, and I agree the on-disk format is great (in theory at least). I genuinely think bcachefs is the best COW filesystem design in existence. But, on the other hand ... it did eat my data! I am making fun of the marketing slogan because it is cheeky and kinda toxic.
I am not giving up on bcachefs. I am very hopeful for the future and can't wait for it to improve and mature. That's why I am here filing bug reports with all the data I can collect.
But no, kernel fsck did not fix it for me. The same errors appear with both userspace and kernel fsck. As a last resort, I tried it with very_degraded and devices missing, and it just deadlocked on journal replay again.
Have you tried using -o nochanges? If the problem is journal replay, then what happens when it's not allowed to replay the journal?
Just tried -o ro,nochanges,very_degraded with one device at a time (to avoid the "split brain" error). Same "journal would deadlock" error. Another attempt completely froze the kernel ...
Yea, I don't think there is hope. I doubt there is any use in keeping this FS around either; there is probably no more useful debug info to be gathered, as it is now quite butchered by all my attempts to fsck and mount it. I will just reformat ...
Back online. Could any of you jump on IRC? irc.oftc.net #bcache - that will make it easier to walk through different debugging steps.
I joined the IRC. Should I reach out to you there? I'm using the username PrefersAwkward.
Just saw you and missed you
From alloc_debug: open_buckets_wait waiting
That's the line that jumps out at me; there are 1024 open buckets, and they're not meant to be held for longer than it takes to do an index update after a write, so that indicates some sort of leak or a deadlock.
Can someone who's seeing that check sysfs internal/open_buckets?
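(If it helps, a minimal sketch of grabbing that file, assuming the sysfs layout shown in the outputs earlier in this thread and a single bcachefs filesystem on the machine:)
cat /sys/fs/bcachefs/*/internal/open_buckets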
I can no longer even attempt to mount the fs unless I do -o very_degraded and try with only one device, because of the "split brain" errors. I think my attempt at running kernel fsck butchered the FS and now it refuses to use multiple devices.
Would it still be useful to try to debug it in that state?
I have not reformatted yet; I still have the broken FS. I could try to get on IRC tonight (I'm in the EU; by your posting times above I guess you are in a NA timezone) to help debug if it would be of any use. Otherwise, I will reformat tonight and reinstall my OS.
Were you hitting the null ptr deref? I want to get that debugged, but I may already have enough to do that.
I wonder if there's anything we can do to recover from split brain scenarios, that's a nasty one.
Hi Kent,
I wonder if there's anything we can do to recover from split brain scenarios, that's a nasty one.
Check https://github.com/koverstreet/bcachefs/issues/648#issuecomment-1962395561 - I seem to be facing a similar problem.
If I do an fsck, it always says the other drives have a different seq: it expects 65 but gets higher values from the other drives.
Ok, I'm trying to figure out how to reproduce this; I've tested kernel 6.7.9 and tools 1.6.4, going back and forth, and nothing. Anyone have the steps to repro?
Don't know if this is a reliable reproducer, but this is what I think many of us have in common (a rough shell sketch of these steps follows the list):
Create multi disk array
Put data on it (at least for me we're talking about 20-30 TiB)
Change something in the config (add foreground/background target and/or disk)
Put more data on it (at least for me we're talking about 20-30 TiB)
Fail to mount fs. For me it happened randomly once and the other time I pulled the plug (accidentally)
Fsck fails but upgrades disks to 1.6
WARNING at libbcachefs/btree_iter.c:2827: btree trans held srcu lock (delaying memory reclaim) for 51 seconds
Split brain
Hope this helps :)
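A very rough shell sketch of those steps, purely illustrative - the device names, labels and data paths are made up, and the unclean shutdown / power pull obviously can't be scripted:
sudo bcachefs format --replicas=2 --label=hdd.hdd0 /dev/sdb --label=hdd.hdd1 /dev/sdc --label=ssd.ssd0 /dev/nvme0n1   # create the multi-disk array
sudo mount -t bcachefs /dev/sdb:/dev/sdc:/dev/nvme0n1 /mnt
rsync -a /some/large/dataset/ /mnt/                    # put data on it
sudo bcachefs setattr --background_target=hdd /mnt     # change something in the config
rsync -a /another/large/dataset/ /mnt/                 # put more data on it
# ... then pull the plug (or otherwise shut down uncleanly) and try to mount / fsck again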
Figured it out: it's from writing a superblock without downgrading it.
I'll figure out the fix to backport.
Backports are now being tested, will be sending them off to Greg for 6.7 shortly
I think I might be hitting this on 6.8.2 with bcachefs-tools version 1.6.4? (At least, when trying to run a userspace fsck after an unclean shutdown, I get a stream of constant journal_reclaim_would_deadlock messages and see bcachefs consuming 90% of the memory on a 128GB system...)
It's possible I just haven't waited long enough for things to make progress, and I'll report back if that's wrong.
I'm going to try again now with a kernel fsck so I can at least report back with alloc_debug output if I get stuck at this point again. (And out of hope that my problem is resolved by the same steps @Wingar took above.)
@koverstreet - what's the fix you referenced in the thread above, so I can make sure I have it applied to my kernel / bcachefs-tools?
Tried again on 6.9-rc1, and that seems to work somewhat better? (No errors or panics during kernel fsck, at any rate.)
However, the kernel fsck seems to hang indefinitely after some period of time, with no CPU or disk usage, and the filesystem didn't end up being mounted.
(Apologies if this is unrelated; it just seems to match everyone else's reports here. Figured information from running bcachefs mount -o fsck,fix_errors on 6.9-rc1 might be useful.)
I rebuilt the kernel with a bunch more debugging enabled (the bcachefs-specific stuff and lockdep). Here's the new kernel output from trying to mount:
I'm also attaching the contents of /proc/lock_stat, because lockdep was disabled due to too deep of a chain of locks.
I just pushed a fix for this to the bcachefs-testing branch - a new watermark for btree interior updates. Can someone test it and see if it fixes the issue?
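For anyone who wants to test but hasn't built a kernel from that tree before, here's a minimal sketch; the GitHub URL is the mirror already linked in this thread, and the config steps are generic assumptions, so adjust for your distro:
git clone --depth=1 --branch bcachefs-testing https://github.com/koverstreet/bcachefs.git
cd bcachefs
cp /boot/config-"$(uname -r)" .config   # start from the running kernel's config
scripts/config --enable BCACHEFS_FS     # make sure bcachefs is enabled
make olddefconfig
make -j"$(nproc)"
sudo make modules_install install       # then reboot into the new kernel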
Hey, I built my kernel from bcachefs-testing, but fsck.bcachefs still does not seem to work:
[root@kexec:~]# fsck.bcachefs /dev/nvme1n1p1:/dev/nvme0n1p2
mounting version 1.6: (unknown version) opts=ro,metadata_replicas=2,data_replicas=2,degraded,fsck,fix_errors=ask,read_only
recovering from unclean shutdown
superblock requires following recovery passes to be run:
check_subvols,check_dirents
Version downgrade required:
journal read done, replaying entries 28078178-28080680
alloc_read... done
stripes_read... done
snapshots_read... done
check_allocations... done
going read-write
journal_replay...bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start(): error journal_reclaim_would_deadlock
bch2_btree_update_start: 5593 callbacks suppressed
bch2_btree_update_start(): error journal_reclaim_would_deadlock
[...]
(these messages continue, without disk or CPU activity, but nothing about blocked tasks in dmesg)
Edit: trying to bcachefs mount -o fsck,fix_errors /dev/nvme1n1p1:/dev/nvme0n1p2 /mnt/, I am getting something in dmesg:
[ 1493.824285] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): mounting version 1.4: member_seq opts=metadata_replicas=2,data_replicas=2,fsck,fix_errors=yes
[ 1493.824292] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): recovering from unclean shutdown
[ 1493.824296] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): superblock requires following recovery passes to be run:
check_subvols,check_dirents
[ 1493.824301] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): Doing compatible version upgrade from 1.4: member_seq to 1.6: btree_subvolume_children
[ 1494.613438] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): journal read done, replaying entries 28078178-28081265
[ 1496.012522] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): alloc_read... done
[ 1496.027695] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): stripes_read... done
[ 1496.027701] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): snapshots_read... done
[ 1496.027714] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): check_allocations...
[ 1503.528973] Process accounting resumed
[ 1578.779521] done
[ 1578.810930] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): going read-write
[ 1578.818998] bcachefs (f60df395-9d99-4067-a59b-0ac1f3cc38a9): journal_replay...
[ 1681.831269] Process accounting resumed
[ 1722.063047] INFO: task kworker/u50:5:252 blocked for more than 122 seconds.
[ 1722.063335] Not tainted 6.9.0-rc2 #1-NixOS
[ 1722.063528] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1722.063847] task:kworker/u50:5 state:D stack:0 pid:252 tgid:252 ppid:2 flags:0x00004000
[ 1722.063855] Workqueue: btree_update btree_interior_update_work [bcachefs]
[ 1722.063949] Call Trace:
[ 1722.063952] <TASK>
[ 1722.063957] __schedule+0x3ec/0x1540
[ 1722.063973] schedule+0x27/0xf0
[ 1722.063978] __closure_sync+0x7e/0x150
[ 1722.063985] bch2_btree_update_start+0x7dc/0x7f0 [bcachefs]
[ 1722.064061] ? __pfx_closure_sync_fn+0x10/0x10
[ 1722.064068] ? srso_return_thunk+0x5/0x5f
[ 1722.064074] __bch2_foreground_maybe_merge+0x568/0xd40 [bcachefs]
[ 1722.064151] ? journal_validate_key+0x2b7/0x610 [bcachefs]
[ 1722.064251] __bch2_trans_commit+0x3ab/0x1830 [bcachefs]
[ 1722.064335] btree_interior_update_work+0x764/0x9e0 [bcachefs]
[ 1722.064413] ? process_one_work+0x18e/0x3b0
[ 1722.064420] process_one_work+0x18e/0x3b0
[ 1722.064427] worker_thread+0x245/0x350
[ 1722.064433] ? __pfx_worker_thread+0x10/0x10
[ 1722.064438] kthread+0xd0/0x100
[ 1722.064444] ? __pfx_kthread+0x10/0x10
[ 1722.064449] ret_from_fork+0x34/0x50
[ 1722.064454] ? __pfx_kthread+0x10/0x10
[ 1722.064459] ret_from_fork_asm+0x1a/0x30
[ 1722.064471] </TASK>
[ 1722.064479] INFO: task bcachefs:1638 blocked for more than 122 seconds.
[ 1722.064750] Not tainted 6.9.0-rc2 #1-NixOS
[ 1722.064951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1722.065269] task:bcachefs state:D stack:0 pid:1638 tgid:1638 ppid:1204 flags:0x00004006
[ 1722.065275] Call Trace:
[ 1722.065277] <TASK>
[ 1722.065281] __schedule+0x3ec/0x1540
[ 1722.065293] schedule+0x27/0xf0
[ 1722.065298] __closure_sync+0x7e/0x150
[ 1722.065305] bch2_btree_update_start+0x7dc/0x7f0 [bcachefs]
[ 1722.065375] ? srso_return_thunk+0x5/0x5f
[ 1722.065382] ? __pfx_closure_sync_fn+0x10/0x10
[ 1722.065390] bch2_btree_split_leaf+0x64/0x2b0 [bcachefs]
[ 1722.065460] ? srso_return_thunk+0x5/0x5f
[ 1722.065467] ? bch2_journal_replay+0x172/0x650 [bcachefs]
[ 1722.065554] bch2_trans_commit_error+0x21c/0x4f0 [bcachefs]
[ 1722.065629] ? srso_return_thunk+0x5/0x5f
[ 1722.065633] ? six_trylock_ip+0x1f/0x50 [bcachefs]
[ 1722.065713] ? srso_return_thunk+0x5/0x5f
[ 1722.065719] __bch2_trans_commit+0xd87/0x1830 [bcachefs]
[ 1722.065794] ? bch2_trans_iter_exit+0x71/0x90 [bcachefs]
[ 1722.065871] bch2_journal_replay+0x172/0x650 [bcachefs]
[ 1722.065959] ? __bch2_print+0x87/0xe0 [bcachefs]
[ 1722.066040] bch2_run_recovery_pass+0x38/0xa0 [bcachefs]
[ 1722.066120] bch2_run_recovery_passes+0xb6/0x180 [bcachefs]
[ 1722.066198] bch2_fs_recovery+0x6d9/0x1390 [bcachefs]
[ 1722.066276] ? __bch2_print+0x87/0xe0 [bcachefs]
[ 1722.066355] ? bch2_printbuf_exit+0x20/0x30 [bcachefs]
[ 1722.066433] ? srso_return_thunk+0x5/0x5f
[ 1722.066437] ? print_mount_opts+0x131/0x180 [bcachefs]
[ 1722.066517] bch2_fs_start+0x2f5/0x470 [bcachefs]
[ 1722.066592] bch2_fs_open+0x6c4/0x6e0 [bcachefs]
[ 1722.066672] bch2_mount+0x5bd/0x790 [bcachefs]
[ 1722.066762] ? srso_return_thunk+0x5/0x5f
[ 1722.066778] legacy_get_tree+0x2b/0x50
[ 1722.066784] vfs_get_tree+0x29/0xe0
[ 1722.066790] ? srso_return_thunk+0x5/0x5f
[ 1722.066794] path_mount+0x4ca/0xb10
[ 1722.066803] __x64_sys_mount+0x11a/0x150
[ 1722.066810] do_syscall_64+0xbd/0x210
[ 1722.066817] entry_SYSCALL_64_after_hwframe+0x72/0x7a
[ 1722.066822] RIP: 0033:0x7fc030d8a0fe
[ 1722.066839] RSP: 002b:00007ffcbf98a588 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[ 1722.066844] RAX: ffffffffffffffda RBX: 000000002b7386f0 RCX: 00007fc030d8a0fe
[ 1722.066847] RDX: 000000002b738c50 RSI: 000000002b7362b0 RDI: 000000002b7386f0
[ 1722.066850] RBP: 0000000000000005 R08: 000000002b736030 R09: 0000000000000000
[ 1722.066853] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffcbf98a5f0
[ 1722.066856] R13: 000000000000001e R14: 0000000000000004 R15: 00000000008b6558
[ 1722.066866] </TASK>
Check sysfs dev-0/alloc_debug, internal/btree_updates and post them here
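(For reference, a sketch of how those can be read once the fs is at least partially up, assuming the same sysfs layout as the outputs earlier in this thread:)
cat /sys/fs/bcachefs/*/dev-0/alloc_debug
cat /sys/fs/bcachefs/*/internal/btree_updates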
Sorry, I was in a bit of a rush so I reformatted and restored backups. The rebuilt environment is exactly the same though, so there's a chance that the problem will reappear if we're lucky.
I seem to be seeing this as well. I haven't been able to build a new kernel with the testing branch though, so this is with kernel 6.8.2. If I have more free time I'll attempt building it. Here are the logs I do have in case they help:
dmesg
[ 15.359897] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): mounting version 1.4: member_seq opts=metadata_replicas=3,data_replicas=3,metadata_replicas_required=3,background_compression=zstd,foreground_target=hdd,background_target=hdd,promote_target=ssd,erasure_code,verbose
[ 15.359906] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): recovering from unclean shutdown
[ 15.359910] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): starting journal read
[ 23.117277] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done on device sdc, ret 0
[ 23.325003] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done on device sdf, ret 0
[ 23.408892] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done on device sda, ret 0
[ 23.428550] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done on device sdd, ret 0
[ 23.429761] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done on device sdg, ret 0
[ 23.855037] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done on device sde, ret 0
[ 26.429228] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done on device sdb, ret 0
[ 26.429259] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal read done, replaying entries 4494202-4495743
[ 31.438759] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): Journal keys: 2700653 read, 1411002 after sorting and compacting
[ 31.864633] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): alloc_read... done
[ 33.471531] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): stripes_read... done
[ 33.471538] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): snapshots_read... done
[ 34.304783] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): going read-write
[ 34.312443] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): journal_replay...
[ 41.048106] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.048446] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.048777] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.049105] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.049434] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.049769] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.050099] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.050420] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.050741] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 41.051065] bcachefs (64eecf01-40f7-4220-b811-8bd6f0521f12): bch2_btree_update_start(): error journal_reclaim_would_deadlock
[ 43.961808] kernel BUG at fs/bcachefs/btree_trans_commit.c:935!
[ 43.961825] CPU: 8 PID: 414 Comm: mount.bcachefs Not tainted 6.8.2-arch2-1 #1 a430fb92f7ba43092b62bbe6bac995458d3d442d
[ 43.961837] RIP: 0010:bch2_trans_commit_error+0x381/0x500 [bcachefs]
[ 43.961946] ? bch2_trans_commit_error+0x381/0x500 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.961978] ? bch2_trans_commit_error+0x381/0x500 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962011] ? bch2_trans_commit_error+0x381/0x500 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962042] ? bch2_journal_replay+0x42c/0x550 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962078] ? bch2_journal_replay+0x42c/0x550 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962106] ? bch2_trans_commit_error+0x381/0x500 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962135] ? bch2_trans_commit_error+0x1ab/0x500 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962163] __bch2_trans_commit+0xbdf/0x1730 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962190] ? bch2_trans_iter_exit+0x71/0x90 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962221] bch2_journal_replay+0x42c/0x550 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962251] bch2_run_recovery_pass+0x38/0xa0 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962279] bch2_fs_recovery+0x175a/0x1aa0 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962312] ? bch2_printbuf_exit+0x20/0x30 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962345] ? print_mount_opts+0x2b5/0x3e0 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962378] bch2_fs_start+0x2cd/0x410 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962407] bch2_fs_open+0x10fc/0x1640 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962449] ? bch2_mount+0x57f/0x730 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962482] bch2_mount+0x57f/0x730 [bcachefs 46c358f386912ecd5a5656ed8d373795ac1e5800]
[ 43.962696] Modules linked in: ccm algif_aead crypto_null des3_ede_x86_64 cbc des_generic libdes algif_skcipher cmac md4 algif_hash af_alg ip6t_REJECT nf_reject_ipv6 intel_rapl_msr intel_rapl_common xt_hl ip6t_rt snd_hda_codec_realtek kvm_amd snd_hda_codec_generic mt7921e kvm mt7921_common snd_hda_codec_hdmi mt792x_lib irqbypass crct10dif_pclmul mt76_connac_lib snd_hda_intel ipt_REJECT crc32_pclmul polyval_clmulni snd_intel_dspcfg nf_reject_ipv4 mt76 polyval_generic snd_intel_sdw_acpi gf128mul snd_hda_codec btusb ghash_clmulni_intel snd_hda_core ext4 mac80211 sha512_ssse3 btrtl sha1_ssse3 snd_hwdep btintel aesni_intel snd_pcm btbcm r8169 crypto_simd mbcache btmtk vfat xt_LOG snd_timer libarc4 fat nf_log_syslog realtek jbd2 cryptd bluetooth sp5100_tco xt_comment snd mdio_devres rapl wmi_bmof pcspkr cfg80211 soundcore ccp xt_multiport libphy ecdh_generic i2c_piix4 joydev k10temp crc16 gpio_amdpt xt_recent bcachefs mac_hid gpio_generic rfkill nft_limit lz4hc_compress lz4_compress xt_limit xt_addrtype xt_tcpudp
[ 43.962848] RIP: 0010:bch2_trans_commit_error+0x381/0x500 [bcachefs]
[ 176.851509] bcachefs (/dev/sda): error reading superblock: (null)
dev-0/alloc_debug:
buckets sectors fragmented
free 39838753 0 0
sb 13 6152 504
journal 8192 4194304 0
btree 118357 60598784 0
user 6705591 3428202231 5063294
cached 6736834 1859999418 0
parity 0 0 0
stripe 0 0 0
need_gc_gens 0 0 0
need_discard 4 0 0
reserves:
stripe 1669048
normal 834552
copygc 56
btree 28
btree_copygc 0
reclaim 0
freelist_wait empty
open buckets allocated 988
open buckets this dev 158
open buckets total 1024
open_buckets_wait empty
open_buckets_btree 987
open_buckets_user 0
buckets_to_invalidate 0
btree reserve cache 0
internal/btree_updates:
00000000194a073f m 1 w 1 r 1 j 0
0000000023eeef38 m 1 w 1 r 1 j 0
0000000033a603f2 m 1 w 1 r 1 j 4494984
00000000b710ddb1 m 1 w 1 r 1 j 4495752
00000000ef329170 m 1 w 1 r 1 j 4494735
000000007c935588 m 1 w 1 r 1 j 0
000000008efd8759 m 1 w 1 r 1 j 0
00000000bf65462a m 1 w 1 r 1 j 4494202
00000000326b5ee7 m 1 w 1 r 1 j 0
000000001ac70bde m 1 w 1 r 1 j 4494453
000000000f51acf2 m 1 w 1 r 1 j 0
000000003bf6e6e4 m 1 w 1 r 1 j 4494202
00000000630f5c75 m 1 w 1 r 1 j 0
00000000e10784c3 m 1 w 1 r 1 j 0
000000005b78a96b m 1 w 1 r 1 j 0
00000000c74084e9 m 1 w 1 r 1 j 0
000000009eda92aa m 1 w 1 r 1 j 0
00000000a876575d m 1 w 1 r 1 j 0
000000006df89aef m 1 w 1 r 1 j 0
00000000962f539c m 1 w 0 r 1 j 0
000000004f7664e4 m 1 w 0 r 1 j 0
00000000b55e3dd3 m 1 w 0 r 1 j 4495756
00000000ef9a2888 m 1 w 0 r 1 j 4495755
000000003982e633 m 1 w 0 r 1 j 4495757
000000009f850f3c m 1 w 0 r 1 j 4495757
00000000f5efcf71 m 1 w 0 r 1 j 0
00000000286be7aa m 1 w 0 r 1 j 0
000000006f9255cd m 1 w 0 r 1 j 0
000000000c8b2b47 m 1 w 0 r 1 j 0
000000005ab4ec85 m 1 w 0 r 1 j 4495759
00000000380c3c1a m 1 w 0 r 1 j 4495759
000000002e4a250a m 1 w 0 r 1 j 0
00000000035e98ae m 1 w 0 r 1 j 0
00000000b7697e68 m 1 w 0 r 1 j 0
0000000021894e39 m 1 w 0 r 1 j 0
000000002977a250 m 1 w 0 r 1 j 4495759
0000000081d6e0b5 m 1 w 0 r 1 j 4495759
00000000fad38885 m 1 w 0 r 1 j 0
000000009f58380f m 1 w 0 r 1 j 0
000000005892f9f9 m 1 w 0 r 1 j 4495759
00000000a0559217 m 1 w 0 r 1 j 4495759
00000000997f1173 m 1 w 0 r 1 j 0
0000000003d25906 m 1 w 0 r 1 j 0
00000000aeab97cd m 1 w 0 r 1 j 4495759
0000000008a8e6a7 m 1 w 0 r 1 j 0
00000000f496193c m 1 w 0 r 1 j 0
000000004326ae0a m 1 w 0 r 1 j 0
00000000e2bffef0 m 1 w 0 r 1 j 0
000000001ae00149 m 1 w 0 r 1 j 4495759
00000000371ef815 m 1 w 0 r 1 j 0
000000001e808411 m 1 w 0 r 1 j 0
00000000fcd39749 m 1 w 0 r 1 j 4495759
00000000a15fdcd9 m 1 w 0 r 1 j 4495759
00000000d70e2a78 m 1 w 0 r 1 j 0
00000000930a5238 m 1 w 0 r 1 j 0
00000000869d6139 m 1 w 0 r 1 j 0
00000000bf318bd2 m 1 w 0 r 1 j 4495760
00000000b46a8731 m 1 w 0 r 1 j 0
00000000edccba69 m 1 w 0 r 1 j 4495760
0000000004f045a4 m 1 w 0 r 1 j 4495760
00000000591c700e m 1 w 0 r 1 j 0
00000000bc0020d4 m 1 w 0 r 1 j 4495760
000000000656e003 m 1 w 0 r 1 j 0
000000009eedfdf3 m 1 w 0 r 1 j 4495761
000000008d5a3887 m 1 w 0 r 1 j 4495761
000000001ebd3619 m 1 w 0 r 1 j 0
00000000cc24fdae m 1 w 0 r 1 j 4495761
0000000086f094a2 m 1 w 0 r 1 j 4495761
000000006cc78af8 m 1 w 0 r 1 j 4495761
00000000abbc371c m 1 w 0 r 1 j 4495761
0000000006f6cb05 m 1 w 0 r 1 j 4495761
0000000038235bcb m 1 w 0 r 1 j 4495761
00000000e8185be9 m 1 w 0 r 1 j 4495761
00000000a38ef1c1 m 1 w 0 r 1 j 0
00000000e820b447 m 1 w 0 r 1 j 0
000000003591af03 m 1 w 0 r 1 j 0
000000001f0ae9ae m 1 w 0 r 1 j 0
0000000087166620 m 1 w 0 r 1 j 0
00000000a17dbfc0 m 1 w 0 r 1 j 4495761
000000007376ffe6 m 1 w 0 r 1 j 4495761
00000000fd661e33 m 1 w 0 r 1 j 4495761
00000000bccca2ab m 1 w 0 r 1 j 4495761
0000000004801f49 m 1 w 0 r 1 j 0
00000000b2bdf2f1 m 1 w 0 r 1 j 0
000000007114c49f m 1 w 0 r 1 j 0
000000007faee1a2 m 1 w 0 r 1 j 0
000000009e1bb653 m 1 w 0 r 1 j 0
00000000621055d6 m 1 w 0 r 1 j 0
0000000051e733d3 m 1 w 0 r 1 j 0
000000003883df70 m 1 w 0 r 1 j 0
00000000d4229549 m 1 w 0 r 1 j 0
00000000b00ab471 m 1 w 0 r 1 j 0
0000000079531cdf m 1 w 0 r 1 j 4495761
0000000050fb659e m 1 w 0 r 1 j 4495761
000000007878143f m 1 w 0 r 1 j 4495761
00000000fc539c34 m 1 w 0 r 1 j 4495761
00000000de5162ae m 1 w 0 r 1 j 4495761
00000000e74472c1 m 1 w 0 r 1 j 4495761
000000000d7c5997 m 1 w 0 r 1 j 0
00000000e4d25975 m 1 w 0 r 1 j 0
0000000044e085b6 m 1 w 0 r 1 j 0
0000000006a0f5df m 1 w 0 r 1 j 0
0000000030ea4406 m 1 w 0 r 1 j 0
0000000012568f18 m 1 w 0 r 1 j 4495761
000000001d77544e m 1 w 0 r 1 j 4495762
00000000b12eb8af m 1 w 0 r 1 j 0
0000000030fcc564 m 1 w 0 r 1 j 0
00000000497e1351 m 1 w 0 r 1 j 0
00000000e0200d7f m 1 w 0 r 1 j 0
000000008b03ebf2 m 1 w 0 r 1 j 0
000000008a602d85 m 1 w 0 r 1 j 0
0000000069b1915d m 1 w 0 r 1 j 0
0000000067db49c0 m 1 w 0 r 1 j 0
00000000f9bba9b3 m 1 w 0 r 1 j 0
0000000036c1d5ec m 1 w 0 r 1 j 0
00000000fe1dbf07 m 1 w 0 r 1 j%
Is anyone still able to hit this with my master branch?
Kent, I just now got your master branch kernel running, and I have tools 1.6.4. I also have a split-brained FS handy for debugging.
IIRC, the initial issues started after my computer lost power back on 6.7. After trying to fix it a while back, I began to see split brains. Not sure if I made user errors or made things worse, or if I'm experiencing what you're aiming to fix, but I'll give it a try.
Hopefully, my stuff is useful. If not, I'll try and reformat a new FS and see if I can reproduce.
I ran this:
sudo fsck.bcachefs /dev/sda:/dev/sdb:/dev/nvme1n1:/dev/nvme2n1
Got this message:
bch2_dev_in_fs() Split brain detected between /dev/nvme2n1 and /dev/sda:
/dev/sda believes seq of /dev/nvme2n1 to be 36, but /dev/nvme2n1 has 39
Not using /dev/nvme2n1
bch2_dev_in_fs() Split brain detected between /dev/nvme1n1 and /dev/sda:
/dev/sda believes seq of /dev/nvme1n1 to be 36, but /dev/nvme1n1 has 39
Not using /dev/nvme1n1
bch2_dev_in_fs() Split brain detected between /dev/sdb and /dev/sda:
/dev/sda believes seq of /dev/sdb to be 36, but /dev/sdb has 39
Not using /dev/sdb
insufficient devices online (0) for replicas entry btree: 1/2 [0 1]
bch2_fs_open() bch_fs_open err opening /dev/sda: insufficient_devices_to_start
You need -o no_splitbrain_check - there was a bug that caused us to not downgrade correctly, so the sequence number for splitbrain detection didn't get updated; that option is the workaround.
I'm probably screwing this up.
I ran:
sudo fsck.bcachefs -o no_splitbrain_check /dev/sda:/dev/sdb:/dev/nvme1n1:/dev/nvme2n1
Got this message:
bch2_dev_in_fs() Split brain detected between /dev/nvme2n1 and /dev/sda:
/dev/sda believes seq of /dev/nvme2n1 to be 36, but /dev/nvme2n1 has 39
Not using /dev/nvme2n1
bch2_dev_in_fs() Split brain detected between /dev/nvme1n1 and /dev/sda:
/dev/sda believes seq of /dev/nvme1n1 to be 36, but /dev/nvme1n1 has 39
Not using /dev/nvme1n1
bch2_dev_in_fs() Split brain detected between /dev/sdb and /dev/sda:
/dev/sda believes seq of /dev/sdb to be 36, but /dev/sdb has 39
Not using /dev/sdb
insufficient devices online (0) for replicas entry btree: 1/2 [0 1]
bch2_fs_open() bch_fs_open err opening /dev/sda: insufficient_devices_to_start
What version are you using?
I'm running the master branch. It says 6.9-rc2. bcachefs-tools 1.6.4.
Let me know if you want me to run a print or debug command to confirm
Looks like tools 1.6.4 doesn't have no_splitbrain_check - my bad, I need to tag a new release.
Try mounting with -o no_splitbrain_check,fsck,fix_errors
I ran this:
sudo mount -t bcachefs -o fsck,no_splitbrain_check,fix_errors /dev/sda:/dev/sdb:/dev/nvme1n1:/dev/nvme2n1 /mnt/BigBoi
It spiked the CPUs for a minute or two. It then calmed down and hasn't yet mounted. It looks like the CPU and disks are not doing much according to iostat. I'll check on it now and then and see if it got mounted.
I also ran this:
dmesg:
[ 7486.730707] bcachefs (0d21127d-8c69-4c51-b44a-6ef94b8a1216): alloc_read... done
[ 7486.741119] bcachefs (0d21127d-8c69-4c51-b44a-6ef94b8a1216): stripes_read... done
[ 7486.741124] bcachefs (0d21127d-8c69-4c51-b44a-6ef94b8a1216): snapshots_read... done
[ 7486.741131] bcachefs (0d21127d-8c69-4c51-b44a-6ef94b8a1216): check_allocations...
[ 7500.813090] iwlwifi 0000:0e:00.0: WRT: Invalid buffer destination
[ 7500.965455] iwlwifi 0000:0e:00.0: WFPM_UMAC_PD_NOTIFICATION: 0x20
[ 7500.965469] iwlwifi 0000:0e:00.0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
[ 7500.965485] iwlwifi 0000:0e:00.0: WFPM_AUTH_KEY_0: 0x90
[ 7500.965497] iwlwifi 0000:0e:00.0: CNVI_SCU_SEQ_DATA_DW9: 0x0
[ 7549.366308] bucket 0:13751 gen 4 has wrong data_type: got btree, should be need_discard, fixing
[ 7549.366313] bucket 0:13751 gen 4 data type need_discard has wrong dirty_sectors: got 1024, should be 0, fixing
[ 7550.246114] bucket 1:30811 gen 2 has wrong data_type: got btree, should be need_discard, fixing
[ 7550.246119] bucket 1:30811 gen 2 data type need_discard has wrong dirty_sectors: got 1024, should be 0, fixing
[ 7562.470944] dev 0 has wrong btree buckets: got 91890, should be 91889, fixing
[ 7562.470948] dev 0 has wrong btree sectors: got 53235200, should be 53234176, fixing
[ 7562.470950] dev 0 has wrong need_discard buckets: got 10, should be 11, fixing
[ 7562.470953] dev 1 has wrong btree buckets: got 91890, should be 91889, fixing
[ 7562.470954] dev 1 has wrong btree sectors: got 53235200, should be 53234176, fixing
[ 7562.470955] dev 1 has wrong need_discard buckets: got 11, should be 12, fixing
[ 7562.470961] fs has wrong btree: got 106470400, should be 106468352, fixing
[ 7562.470964] fs has wrong btree: 1/2 [0 1]: got 106470400, should be 106468352, fixing
[ 7562.497474] done
[ 7562.544354] bcachefs (0d21127d-8c69-4c51-b44a-6ef94b8a1216): going read-write
echo w > /proc/sysrq-trigger; dmesg
top
perf top
perf top -e bcachefs:*
I think I understand some of these things. I ran them all, but I don't know how to read or share some of them. I'm not familiar with "perf" or "echo w > /proc/sysrq-trigger", for example.
Here are some things I saw when I ran that:
When I ran this:
perf top
5.55% [kernel] [k] perf_adjust_freq_unthr_context
2.75% [kernel] [k] srso_alias_return_thunk
2.20% libc.so.6 [.] memmove_avx512_unaligned_erms
1.86% [kernel] [k] srso_alias_safe_ret
1.31% [kernel] [k] perf_pmu_nop_void
1.03% [kernel] [k] read_tsc
0.93% [amdgpu] [k] amdgpu_cgs_read_register
0.82% [kernel] [k] io_idle
0.80% perf [.] rb_next
0.74% [kernel] [k] asm_sysvec_apic_timer_interrupt
0.61% [kernel] [k] menu_select
0.58% libQt6Core.so.6.6.2 [.] QtPrivate::compareStrings(QStringView, QStringView, Qt::CaseSensitivity)
0.52% [kernel] [k] perf_poll
0.51% [amdgpu] [k] gfx_v11_0_get_gpu_clock_counter
0.51% [kernel] [k] native_write_msr
0.51% [kernel] [k] _copy_to_user
0.49% [kernel] [k] native_sched_clock
0.45% [amdgpu] [k] amdgpu_device_rreg
0.44% libQuickCharts.so.6.0.0 [.] MapProxySource::item(int) const
0.41% [kernel] [k] native_read_msr
0.39% [kernel] [k] ktime_get
0.39% perf [.] evsel__parse_sample
0.35% [kernel] [k] do_idle
0.35% perf [.] perf_hppis_dynamic_entry
0.31% [kernel] [k] cpuidle_enter_state
0.31% [kernel] [k] ct_idle_exit
0.31% libQt6Core.so.6.6.2 [.] QVariant::~QVariant()
0.30% [kernel] [k] update_load_avg
0.29% [kernel] [k] perf_mux_hrtimer_handler
0.29% [kernel] [k] hrtimer_run_queues
0.28% [kernel] [k] update_blocked_averages
0.28% [kernel] [k] merge_sched_in
0.27% libc.so.6 [.] _int_malloc
0.25% [kernel] [k] get_next_timer_interrupt
0.25% libKSysGuardSensors.so.6.0.3 [.] KSysGuard::SensorDataModel::data(QModelIndex const&, int) const
0.25% [kernel] [k] perf_event_update_userpage
0.23% [kernel] [k] arch_perf_update_userpage
0.23% [kernel] [k] update_sd_lb_stats.constprop.0
0.23% [kernel] [k] psi_group_change
0.22% [kernel] [k] scheduler_tick
0.22% [kernel] [k] do_sys_poll
0.22% [kernel] [k] update_rq_clock
0.21% perf [.] dso__find_symbol
0.21% [kernel] [k] sched_clock_cpu
0.20% firefox [.] free
When I ran this:
Dmesg
[ 7549.366308] bucket 0:13751 gen 4 has wrong data_type: got btree, should be need_discard, fixing
[ 7549.366313] bucket 0:13751 gen 4 data type need_discard has wrong dirty_sectors: got 1024, should be 0, fixing
[ 7550.246114] bucket 1:30811 gen 2 has wrong data_type: got btree, should be need_discard, fixing
[ 7550.246119] bucket 1:30811 gen 2 data type need_discard has wrong dirty_sectors: got 1024, should be 0, fixing
[ 7562.470944] dev 0 has wrong btree buckets: got 91890, should be 91889, fixing
[ 7562.470948] dev 0 has wrong btree sectors: got 53235200, should be 53234176, fixing
[ 7562.470950] dev 0 has wrong need_discard buckets: got 10, should be 11, fixing
[ 7562.470953] dev 1 has wrong btree buckets: got 91890, should be 91889, fixing
[ 7562.470954] dev 1 has wrong btree sectors: got 53235200, should be 53234176, fixing
[ 7562.470955] dev 1 has wrong need_discard buckets: got 11, should be 12, fixing
[ 7562.470961] fs has wrong btree: got 106470400, should be 106468352, fixing
[ 7562.470964] fs has wrong btree: 1/2 [0 1]: got 106470400, should be 106468352, fixing
[ 7562.497474] done
[ 7562.544354] bcachefs (0d21127d-8c69-4c51-b44a-6ef94b8a1216): going read-write
[ 7562.552413] bcachefs (0d21127d-8c69-4c51-b44a-6ef94b8a1216): journal_replay...
And when I ran this:
perf top -e bcachefs:*
Available samples
0 bcachefs:btree_path_upgrade_fail
0 bcachefs:btree_path_relock_fail
0 bcachefs:btree_node_set_root
0 bcachefs:btree_node_rewrite
0 bcachefs:btree_node_split
0 bcachefs:btree_node_merge
0 bcachefs:btree_node_compact
0 bcachefs:btree_reserve_get_fail
0 bcachefs:btree_node_free
0 bcachefs:btree_node_alloc
0 bcachefs:btree_node_write
0 bcachefs:btree_node_read
725 bcachefs:btree_cache_cannibalize_unlock
0 bcachefs:btree_cache_cannibalize
725 bcachefs:btree_cache_cannibalize_lock
0 bcachefs:btree_cache_cannibalize_lock_fail
0 bcachefs:btree_cache_reap
0 bcachefs:btree_cache_scan
0 bcachefs:bkey_pack_pos_fail
725 bcachefs:journal_reclaim_finish
725 bcachefs:journal_reclaim_start
0 bcachefs:journal_write
0 bcachefs:journal_entry_close
0 bcachefs:journal_entry_full
0 bcachefs:journal_full
0 bcachefs:read_reuse_race
0 bcachefs:read_retry
0 bcachefs:read_split
0 bcachefs:read_bounce
0 bcachefs:read_nopromote
0 bcachefs:read_promote
0 bcachefs:write_super
Okay, after poking a bit, it looks like dmesg continues to print bcachefs stuff that I think you might find useful. If I understand it correctly, it is still working on something, so I'll let it run overnight and perhaps through tomorrow, and check on it now and then:
Here is a pastebin of dmesg: https://pastebin.com/gvWkATkd
I left it running since yesterday. The command is still in a running state and the filesystem remains unmounted. I left iostat on to log the I/O operations while I was away. It looks like the disks involved in the FS haven't done anything.
I'll go ahead and reboot to clear things up. Let me know if you'd like me to try any other commands or any new branches on this filesystem. I can also rejoin IRC and pick things up there if you'd like.
Similar issues here.
Arch Linux with the latest updates: kernel 6.8.5-arch1-1 and extra/bcachefs-tools 3:1.6.4-2. Two SSDs with two 128G partitions, ~10 6TB HDDs, 128G DDR4, 24-core Xeon.
Either
btree trans held srcu lock (delaying memory reclaim) for 38 seconds
or
kernel BUG at fs/bcachefs/journal_io.c:1809!
followed by a lot of
bch2_btree_update_start(): error journal_reclaim_would_deadlock
And then zero activity in iotop and several processes (bcachefs, bch kernel threads) in D state.
One time, after a couple of hours, part of the system became unresponsive (several systemd-journald processes and other unrelated daemons were in D state). So, each time this error appeared, I ended up forcefully power-cycling my server.
At the beginning there were about 6 HDDs and 2 SSDs. The other HDDs were added afterwards.
The initial setting was metadata_target=ssd and metadata_replicas=3 (it turns out this is the maximum; even 4 is too big). Then I started copying data in, and everything was fine. Shortly after, I saw that there was not enough space on the two 64G partitions for metadata. I resized them online to 128G. Everything worked fine. After adding a couple of HDDs, the error mentioned above appeared. I was able to fully recover by running mount with -o fsck,fix_errors. This error appeared again when I was trying to evacuate one of the HDDs. Again, fsck,fix_errors seemed to fix everything. I successfully removed one HDD.
After another batch of HDDs was added and more data copied in, I noticed that both SSDs had ~55G of user data on them. This was undesirable. data rebalance was silently doing nothing. I changed foreground_target from none to hdd. data rebalance was still silently doing nothing.
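(One way this kind of option change can be made at runtime, assuming the per-filesystem options directory that the kernel exposes under sysfs, is a sketch like the following:)
echo hdd | sudo tee /sys/fs/bcachefs/*/options/foreground_target   # set the option on the mounted fs
cat /sys/fs/bcachefs/*/options/foreground_target                   # confirm the new value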
I decided to evacuate one of the SSDs (the plan was to add it back and rebalance again later). During this operation similar errors appeared, and everything broke down completely.
Neither mounting with fsck,fix_errors nor fsck.bcachefs works. The first one hangs with bch2_btree_update_start(): error journal_reclaim_would_deadlock; the second tries to fix something and then aborts with fsck.bcachefs: libbcachefs/journal_io.c:1849: bch2_journal_write_prep: Assertion '!(u64s > j->entry_u64s_reserved)' failed. (Also, fsck.bcachefs runs about 3-4 times faster.)
I've tried to mount and to run fsck.bcachefs (-o for fsck is not documented in -h) with one SSD or none (it's 3 metadata replicas, right?), but had no luck.
I have a core dump of fsck.bcachefs, but it's 11G and "truncated" (whatever coredumpctl means by that).
fsck.bcachefs -R produces a more manageable core of around 70M.
What options do I have left to try to recover my data?
Try my master branch - we just got reedriley's filesystem working again last night
If it doesn't mount, jump on IRC and start sending me logs and sysfs/debugfs debug output
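A rough sketch of collecting that output, assuming debugfs at the usual /sys/kernel/debug mount point and a per-UUID bcachefs directory there:
sudo dmesg > dmesg.txt
sudo mount -t debugfs none /sys/kernel/debug 2>/dev/null || true   # usually already mounted
sudo ls /sys/kernel/debug/bcachefs/                                # per-filesystem debug dirs live here
for f in /sys/fs/bcachefs/*/dev-*/alloc_debug; do echo "== $f"; cat "$f"; done > sysfs-debug.txt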
I had success mounting a filesystem after an unclean shutdown, and it looks to be working well for me.
I built the latest master branch (commit hash d82346f).
Kernel: 6.9.0-rc2+, bcachefs-tools: 1.6.4
Thanks for your work!
I have an Arch Linux system that is installed to bcachefs. It boots using EFISTUB with the kernel on a small EFI partition; everything else is bcachefs. It is a "home media server" running some Docker containers, Kodi, and a few other things...
Just to give some relevant info about things I have done in the configuration:
It has two SSDs (/dev/sda1, /dev/sdd1) for foreground+promote and two HDDs (/dev/sdb, /dev/sdc) for background. Replicas=2 at the FS level. Replicas=1 set selectively on some directories. xxhash checksums, no compression.
At one point, I decided that I don't want the rootfs to go on the background devices anymore, so I did
bcachefs setattr --background_target=ssd /
(set the background target to ssd on the root of the filesystem), and then set it back on specific directories like
bcachefs setattr --background_target=hdd /srv
I am not sure if this is a good idea, but oh well, it seemed to work.
Kernel: 6.7.6-arch1-2
The issue is that I can no longer mount my filesystem, because it hangs during journal replay at mount time. I am now trying to mount it from a live ISO to at least collect dmesg output.
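(A simple way to capture that from the live ISO is to follow the kernel log in one terminal while attempting the mount in another; a sketch, using the device list from above:)
sudo dmesg --follow | tee /tmp/bcachefs-mount.log                    # terminal 1: stream kernel messages
sudo mount -t bcachefs /dev/sda1:/dev/sdb:/dev/sdc:/dev/sdd1 /mnt    # terminal 2: attempt the mount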
Relevant dmesg output:
bcachefs show-super output:
Running bcachefs fsck -v /dev/sda1:/dev/sdb:/dev/sdc:/dev/sdd1:
Please let me know what I could do, if anything, to try to safely recover my filesystem? Or if you'd like anything else from me to help you diagnose and fix this bug?