koverstreet / bcachefs

bch-rebalance thread crash and hang - unable to handle page fault (LZ4HC implicated) #658

Open Wingar opened 6 months ago

Wingar commented 6 months ago

I'm running a Gentoo machine on kernel 6.7.6 (from gentoo-sources) with bcachefs-tools 1.6.4 and a two-tier bcachefs filesystem (erasure coding, 2 replicas, encryption, lz4 compression). Every now and then the bch-rebalance kernel thread for this filesystem crashes, after which writes to the background target stop working.

However, reads from all devices still work, and writes to the foreground target work fine (all functions, even clearing cache to free space for incoming data). There is no loss of btree, journal, or other functionality; only the automated flush from the foreground target to the background target stops.

I can somewhat reliably reproduce this when multiple processes are writing to the filesystem at the same time (multiple Samba connections, lftp, etc.). It tends to happen when I'm pushing the write speed of the foreground SSDs to their limit (700 MB/s or so per SSD). It does not happen when only one or a few processes are writing at once, only when there are many.
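For anyone trying to reproduce this without Samba or lftp, the "many concurrent writers" condition can be approximated with a small script. This is only a sketch: the function names, file names, and data pattern are made up for illustration, threads stand in for separate processes, and it has not been confirmed to trigger the crash.

```python
import os
import threading
from typing import List


def write_file(path: str, total_bytes: int, chunk_bytes: int = 1 << 20) -> int:
    """Write total_bytes of repetitive data to path and return the byte count."""
    # Repetitive text is compressible, so a compressing filesystem has real work to do.
    pattern = b"bcachefs rebalance stress line\n"
    chunk = (pattern * (chunk_bytes // len(pattern) + 1))[:chunk_bytes]
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # push the data out of the page cache
    return written


def run_writers(target_dir: str, writers: int, bytes_per_writer: int) -> List[int]:
    """Start `writers` concurrent writers in target_dir; return bytes written by each."""
    results = [0] * writers

    def worker(index: int) -> None:
        path = os.path.join(target_dir, f"stress-{index}.bin")
        results[index] = write_file(path, bytes_per_writer)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A call such as `run_writers("/mnt/bcachefs/stress", writers=16, bytes_per_writer=8 << 30)` would start 16 writers of 8 GiB each; the mount path and both numbers are hypothetical and would need tuning until the foreground SSDs are saturated.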

After this happens, any attempt to unmount the filesystem (it is not the root filesystem) results in a hung umount and a zombified process. The best way I have found to recover is to reboot the VM and wait for systemd to forcibly kill the processes and threads itself. This is the cleanest way to shut down, but even then the occasional fsck turns up some small but fixable errors.

Aside from this, I have not suffered any data loss or corruption (verified with file-level checksums).

The machine itself is a VM running under KVM/libvirt on a Rocky Linux 9.3 host. Each device is passed through as a raw block device on the virtio bus (no caching, type raw, io native). I've included a segment of the VM's libvirt configuration XML to show exactly how it is configured.

My configuration is 2 foreground SSDs (200 GB each) and 14 background HDDs (1.2 TB each). Filesystem/mount options: metadata_replicas=2,data_replicas=2,background_compression=lz4:7,metadata_target=ssd,foreground_target=ssd,background_target=hdd,promote_target=ssd,erasure_code,verbose
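Since the issue title points at LZ4HC, the option most relevant here is background_compression=lz4:7. As a rough illustration, the option string above can be split into key/value pairs like this; the helper name is hypothetical and this is not how bcachefs itself parses options.

```python
def parse_mount_options(options: str) -> dict:
    """Split a comma-separated option string into a dict; bare flags map to True."""
    parsed = {}
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Option string copied from the report.
OPTIONS = (
    "metadata_replicas=2,data_replicas=2,background_compression=lz4:7,"
    "metadata_target=ssd,foreground_target=ssd,background_target=hdd,"
    "promote_target=ssd,erasure_code,verbose"
)

opts = parse_mount_options(OPTIONS)
# "lz4:7" is an algorithm name plus a level, separated by a colon.
algorithm, _, level = opts["background_compression"].partition(":")
print(algorithm, level)
print(opts["foreground_target"], "->", opts["background_target"])
```

Only background compression is set, and data moves from the ssd target to the hdd target, so the compression step runs while data is being moved rather than in the foreground write path.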

Dmesg announcing thread crash:

[ 2435.684750] #PF: supervisor write access in kernel mode
[ 2435.684788] #PF: error_code(0x0002) - not-present page
[ 2435.684820] PGD 100000067 P4D 100000067 PUD 1001ee067 PMD 16c7b1067 PTE 0
[ 2435.684886] Oops: 0002 [#1] PREEMPT SMP PTI
[ 2435.684920] CPU: 0 PID: 788 Comm: bch-rebalance/2 Not tainted 6.7.6-gentoo-x86_64 #2
[ 2435.684957] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20230524-4.el9_3 05/24/2023
[ 2435.684995] RIP: 0010:LZ4HC_compress_generic+0x37f/0x1ac0 [lz4hc_compress]
[ 2435.685038] Code: 00 83 f9 0e 0f 8f cb 15 00 00 48 8b 7c 24 10 89 ca c1 e2 04 88 17 48 8b 4c 24 38 48 8d 14 30 48 8b 31 48 83 c0 08 48 83 c1 08 <48> 89 70 f8 48 39 d0 72 ec 48 8b 44 24 20 48 8b 7c 24 50 48 83 c2
[ 2435.685103] RSP: 0018:ffffb9cb8114f6e0 EFLAGS: 00010296
[ 2435.685137] RAX: ffffb9cb8dc2d001 RBX: ffffb9cb8dc1d000 RCX: ffffb9cb8dc3df00
[ 2435.685169] RDX: ffffb9cb8dc2cffa RSI: fbc66285b18817ff RDI: ffffb9cb8dc1d000
[ 2435.685201] RBP: 0000000000010000 R08: ffff97f6b4620000 R09: 0000000000010000
[ 2435.685233] R10: ffffb9cb8dc1d000 R11: 00000000c66285b1 R12: ffff97f6b4620000
[ 2435.685265] R13: 0000000000000000 R14: ffffb9cb8dc3def9 R15: ffffb9cb8dc3defd
[ 2435.685297] FS:  0000000000000000(0000) GS:ffff97ff3fa00000(0000) knlGS:0000000000000000
[ 2435.685332] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2435.685367] CR2: ffffb9cb8dc2d000 CR3: 00000001be806002 CR4: 0000000000170ef0
[ 2435.685402] Call Trace:
[ 2435.685429]  <TASK>
[ 2435.685458]  ? __die+0x23/0x70
[ 2435.685496]  ? page_fault_oops+0x15d/0x440
[ 2435.685529]  ? fixup_exception+0x26/0x310
[ 2435.685580]  ? exc_page_fault+0x16a/0x170
[ 2435.685626]  ? asm_exc_page_fault+0x26/0x30
[ 2435.685661]  ? LZ4HC_compress_generic+0x37f/0x1ac0 [lz4hc_compress]
[ 2435.685698]  LZ4_compress_HC+0x7b/0x90 [lz4hc_compress]
[ 2435.685734]  attempt_compress+0x1e6/0x200 [bcachefs]
[ 2435.685944]  ? __get_free_pages+0x11/0x40
[ 2435.685978]  ? mempool_alloc_vp+0x2f/0x50 [bcachefs]
[ 2435.686062]  ? mempool_alloc+0x66/0x1a0
[ 2435.686097]  bch2_bio_compress+0x22c/0x4c0 [bcachefs]
[ 2435.686183]  __bch2_write+0x122f/0x1340 [bcachefs]
[ 2435.686281]  ? __bch2_increment_clock+0x2d/0x140 [bcachefs]
[ 2435.686365]  ? _raw_spin_unlock+0xe/0x30
[ 2435.686397]  ? bch2_write+0x2c4/0x450 [bcachefs]
[ 2435.686486]  ? bch2_moving_ctxt_do_pending_writes+0xea/0x120 [bcachefs]
[ 2435.686600]  bch2_moving_ctxt_do_pending_writes+0xea/0x120 [bcachefs]
[ 2435.686698]  bch2_move_ratelimit+0x1b4/0x410 [bcachefs]
[ 2435.686791]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 2435.686839]  do_rebalance+0x13a/0x830 [bcachefs]
[ 2435.686959]  ? kvm_sched_clock_read+0x11/0x20
[ 2435.687448]  ? local_clock_noinstr+0xd/0xb0
[ 2435.687809]  ? __bch2_trans_get+0x303/0x360 [bcachefs]
[ 2435.688171]  ? __pfx_bch2_rebalance_thread+0x10/0x10 [bcachefs]
[ 2435.688523]  bch2_rebalance_thread+0x57/0xa0 [bcachefs]
[ 2435.688913]  ? bch2_rebalance_thread+0x4d/0xa0 [bcachefs]
[ 2435.689275]  ? __pfx_closure_sync_fn+0x10/0x10
[ 2435.689583]  kthread+0xe8/0x120
[ 2435.689868]  ? __pfx_kthread+0x10/0x10
[ 2435.690125]  ret_from_fork+0x34/0x50
[ 2435.690416]  ? __pfx_kthread+0x10/0x10
[ 2435.690735]  ret_from_fork_asm+0x1b/0x30
[ 2435.691018]  </TASK>
[ 2435.691255] Modules linked in: poly1305_generic libpoly1305 poly1305_x86_64 chacha_generic chacha_x86_64 libchacha bcachefs crc64 lz4hc_compress lz4_compress xor raid6_pq intel_rapl_msr intel_rapl_common vfat rapl fat iTCO_wdt iTCO_vendor_support i2c_i801 pcspkr lpc_ich i2c_smbus mfd_core virtio_balloon joydev drm backlight fuse loop efi_pstore i2c_core dm_mod configfs nfnetlink xfs sr_mod cdrom crct10dif_pclmul crc32_pclmul libcrc32c crc32c_intel ghash_clmulni_intel sha512_ssse3 xhci_pci xhci_pci_renesas virtio_net ahci sha256_ssse3 net_failover xhci_hcd sha1_ssse3 libahci failover serio_raw efivarfs qemu_fw_cfg virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio_rng aesni_intel crypto_simd cryptd
[ 2435.692732] CR2: ffffb9cb8dc2d000
[ 2435.693016] ---[ end trace 0000000000000000 ]---
[ 2437.015754] RIP: 0010:LZ4HC_compress_generic+0x37f/0x1ac0 [lz4hc_compress]
[ 2437.017693] Code: 00 83 f9 0e 0f 8f cb 15 00 00 48 8b 7c 24 10 89 ca c1 e2 04 88 17 48 8b 4c 24 38 48 8d 14 30 48 8b 31 48 83 c0 08 48 83 c1 08 <48> 89 70 f8 48 39 d0 72 ec 48 8b 44 24 20 48 8b 7c 24 50 48 83 c2
[ 2437.018456] RSP: 0018:ffffb9cb8114f6e0 EFLAGS: 00010296
[ 2437.018868] RAX: ffffb9cb8dc2d001 RBX: ffffb9cb8dc1d000 RCX: ffffb9cb8dc3df00
[ 2437.019185] RDX: ffffb9cb8dc2cffa RSI: fbc66285b18817ff RDI: ffffb9cb8dc1d000
[ 2437.019485] RBP: 0000000000010000 R08: ffff97f6b4620000 R09: 0000000000010000
[ 2437.019797] R10: ffffb9cb8dc1d000 R11: 00000000c66285b1 R12: ffff97f6b4620000
[ 2437.020095] R13: 0000000000000000 R14: ffffb9cb8dc3def9 R15: ffffb9cb8dc3defd
[ 2437.020421] FS:  0000000000000000(0000) GS:ffff97ff3fa00000(0000) knlGS:0000000000000000
[ 2437.020843] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2437.021268] CR2: ffffb9cb8dc2d000 CR3: 00000001be806002 CR4: 0000000000170ef0
[ 2437.021735] note: bch-rebalance/2[788] exited with irqs disabled
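The register values in the oops can be cross-checked with a little arithmetic. The sketch below decodes the page-fault error code and works out which bytes the faulting instruction was storing to. The marked bytes `<48> 89 70 f8` in the Code line decode to `mov %rsi,-0x8(%rax)`, an 8-byte store. Treating RBX as the start of the output buffer and RBP as its length is an assumption made here for illustration; the trace does not confirm it.

```python
PAGE_SIZE = 4096

# Values copied from the oops above.
CR2 = 0xFFFFB9CB8DC2D000         # faulting address
RAX = 0xFFFFB9CB8DC2D001
RBX = 0xFFFFB9CB8DC1D000         # ASSUMPTION: start of the output buffer
RDX = 0xFFFFB9CB8DC2CFFA         # compared against RAX right after the store
RBP = 0x10000                    # ASSUMPTION: output buffer length (64 KiB)
ERROR_CODE = 0x0002


def decode_error_code(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x1),  # 0 means the page was not present
        "write": bool(code & 0x2),    # 1 means the access was a write
        "user": bool(code & 0x4),     # 0 means supervisor (kernel) mode
    }


# mov %rsi,-0x8(%rax) stores 8 bytes starting at RAX - 8.
store_start = RAX - 8
store_last_byte = store_start + 7
assumed_buf_end = RBX + RBP      # first byte past the assumed buffer

print(decode_error_code(ERROR_CODE))
print(f"store covers {store_start:#x}..{store_last_byte:#x}")
print(f"last stored byte equals CR2: {store_last_byte == CR2}")
print(f"assumed buffer end equals CR2: {assumed_buf_end == CR2}")
print(f"bytes between RDX and assumed buffer end: {assumed_buf_end - RDX}")
```

The last byte of the store lands exactly on CR2, which is page-aligned. If the assumption about RBX and RBP holds, that address is also the first byte past a 64 KiB region, the same size as encoded_extent_max in the superblock. That would be consistent with an 8-bytes-at-a-time copy running one byte past the end of the output buffer, but it does not by itself show which code is at fault.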

bcachefs fs usage:

Filesystem: 2353ad4f-f54a-4a6d-b838-596270f9eebc
Size:                       14.4 TiB
Used:                       5.40 TiB
Online reserved:            23.0 KiB

Data type       Required/total  Durability    Devices
btree:          1/2             2             [vde vdq]           2.25 GiB
btree:          1/2             2             [vdq vdp]           44.5 GiB
user:           1/2             2             [vdi vdk]            581 MiB
user:           1/2             2             [vdd vdb]           5.31 GiB
user:           1/2             2             [vdo vdc]            518 MiB
user:           1/2             2             [vde vdm]           1.83 GiB
user:           1/2             2             [vdg vdl]           2.80 GiB
user:           1/2             2             [vdk vdn]            319 MiB
user:           1/2             2             [vde vdd]           3.27 GiB
user:           1/2             2             [vdd vdi]           2.41 GiB
user:           1/2             2             [vdf vdn]           2.09 GiB
user:           1/2             2             [vdh vdk]           3.72 GiB
user:           1/2             2             [vdj vdl]           3.79 GiB
user:           1/2             2             [vdl vdb]            336 MiB
user:           1/1             1             [vde]                  512 B
user:           1/2             2             [vde vdi]           1.96 GiB
user:           1/2             2             [vde vdb]           5.41 GiB
user:           1/2             2             [vdd vdm]           1.49 GiB
user:           1/2             2             [vdf vdj]           3.52 GiB
user:           1/2             2             [vdg vdh]           3.43 GiB
user:           1/2             2             [vdg vdc]           4.93 GiB
user:           1/2             2             [vdh vdo]           9.17 GiB
user:           1/2             2             [vdi vdo]           5.20 GiB
user:           1/2             2             [vdj vdc]            459 MiB
user:           1/2             2             [vdl vdm]           9.40 GiB
user:           1/2             2             [vdm vdb]           2.69 GiB
user:           1/1             1             [vdj]               30.5 KiB
user:           1/2             2             [vde vdg]           3.86 GiB
user:           1/2             2             [vde vdk]           2.43 GiB
user:           1/2             2             [vde vdo]           1.51 GiB
user:           1/2             2             [vdd vdg]           5.60 GiB
user:           1/2             2             [vdd vdk]           3.53 GiB
user:           1/2             2             [vdd vdo]           1.20 GiB
user:           1/2             2             [vdf vdh]            501 MiB
user:           1/2             2             [vdf vdl]           1.19 GiB
user:           1/2             2             [vdf vdc]           4.80 GiB
user:           1/2             2             [vdg vdj]           3.66 GiB
user:           1/2             2             [vdg vdn]            309 MiB
user:           1/2             2             [vdh vdi]            628 MiB
user:           1/2             2             [vdh vdm]           1.22 GiB
user:           1/2             2             [vdh vdb]           5.19 GiB
user:           1/2             2             [vdi vdm]            556 MiB
user:           1/2             2             [vdi vdb]            322 MiB
user:           1/2             2             [vdj vdn]           4.51 GiB
user:           1/2             2             [vdk vdl]            591 MiB
user:           1/2             2             [vdk vdc]           7.16 GiB
user:           1/2             2             [vdl vdo]           1.59 GiB
user:           1/2             2             [vdm vdo]           8.98 GiB
user:           1/2             2             [vdn vdc]            616 MiB
user:           1/2             2             [vdc vdb]           2.39 GiB
user:           1/1             1             [vdd]               64.0 KiB
user:           1/1             1             [vdm]               1.50 KiB
user:           1/2             2             [vde vdf]           3.56 GiB
user:           1/2             2             [vde vdh]           2.97 GiB
user:           1/2             2             [vde vdj]           2.80 GiB
user:           1/2             2             [vde vdl]           2.29 GiB
user:           1/2             2             [vde vdn]            987 MiB
user:           1/2             2             [vde vdc]           4.71 GiB
user:           1/2             2             [vdd vdf]           3.88 GiB
user:           1/2             2             [vdd vdh]           3.38 GiB
user:           1/2             2             [vdd vdj]           2.88 GiB
user:           1/2             2             [vdd vdl]           1.80 GiB
user:           1/2             2             [vdd vdn]           1.43 GiB
user:           1/2             2             [vdd vdc]           1.67 GiB
user:           1/2             2             [vdf vdg]           2.69 GiB
user:           1/2             2             [vdf vdi]           3.98 GiB
user:           1/2             2             [vdf vdk]           4.74 GiB
user:           1/2             2             [vdf vdm]           3.45 GiB
user:           1/2             2             [vdf vdo]            583 MiB
user:           1/2             2             [vdf vdb]           2.83 GiB
user:           1/2             2             [vdg vdi]            629 MiB
user:           1/2             2             [vdg vdk]           3.03 GiB
user:           1/2             2             [vdg vdm]           1.28 GiB
user:           1/2             2             [vdg vdo]           1.56 GiB
user:           1/2             2             [vdg vdb]           4.16 GiB
user:           1/2             2             [vdh vdj]           1.82 GiB
user:           1/2             2             [vdh vdl]           3.93 GiB
user:           1/2             2             [vdh vdn]           1.45 GiB
user:           1/2             2             [vdh vdc]            334 MiB
user:           1/2             2             [vdi vdj]            562 MiB
user:           1/2             2             [vdi vdl]            511 MiB
user:           1/2             2             [vdi vdn]           14.4 GiB
user:           1/2             2             [vdi vdc]           6.07 GiB
user:           1/2             2             [vdj vdk]           5.03 GiB
user:           1/2             2             [vdj vdm]            827 MiB
user:           1/2             2             [vdj vdo]           5.56 GiB
user:           1/2             2             [vdj vdb]           2.32 GiB
user:           1/2             2             [vdk vdm]            632 MiB
user:           1/2             2             [vdk vdo]            464 MiB
user:           1/2             2             [vdk vdb]           5.56 GiB
user:           1/2             2             [vdl vdn]           9.27 GiB
user:           1/2             2             [vdl vdc]            260 MiB
user:           1/2             2             [vdm vdn]           1.52 GiB
user:           1/2             2             [vdm vdc]           3.90 GiB
user:           1/2             2             [vdn vdo]            509 MiB
user:           1/2             2             [vdn vdb]            337 MiB
user:           1/2             2             [vdo vdb]            960 MiB
user:           1/2             2             [vdq vdp]            135 GiB
user:           13/14           14            [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb] 4.58 TiB
user:           14/15           15            [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdp] 2.75 MiB
user:           14/15           15            [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdq] 6.50 MiB
cached:         1/1             1             [vdi]               76.3 MiB
cached:         1/1             1             [vdb]               77.0 MiB
cached:         1/1             1             [vdd]               73.3 MiB
cached:         1/1             1             [vdm]               73.2 MiB
parity:         14/15           15            [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdq] 512 KiB
cached:         1/1             1             [vdg]               83.3 MiB
cached:         1/1             1             [vdk]               71.8 MiB
cached:         1/1             1             [vdo]               76.1 MiB
cached:         1/1             1             [vdp]               74.2 GiB
parity:         14/15           15            [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdp] 256 KiB
cached:         1/1             1             [vde]                953 MiB
cached:         1/1             1             [vdf]               87.5 MiB
cached:         1/1             1             [vdh]               76.3 MiB
cached:         1/1             1             [vdj]               76.1 MiB
cached:         1/1             1             [vdl]               76.6 MiB
cached:         1/1             1             [vdn]               75.4 MiB
cached:         1/1             1             [vdc]               73.9 MiB
cached:         1/1             1             [vdq]               86.5 GiB
parity:         13/14           14            [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb] 361 GiB

hdd.hdd01 (device 0):            vde              rw
                                data         buckets    fragmented
  free:                      733 GiB         3002065
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                    1.13 GiB            4608
  user:                     18.7 GiB           77129       103 MiB
  cached:                    953 MiB            6109
  parity:                   25.7 GiB          105112
  stripe:                    335 GiB         1375338       328 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd02 (device 1):            vdd              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011666
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.9 GiB           77657       106 MiB
  cached:                   73.3 MiB             588
  parity:                   25.9 GiB          106092
  stripe:                    335 GiB         1374358       331 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd03 (device 2):            vdf              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011677
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77544       104 MiB
  cached:                   87.5 MiB             690
  parity:                   25.9 GiB          106131
  stripe:                    335 GiB         1374319       334 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd03 (device 3):            vdg              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011500
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.9 GiB           77766       103 MiB
  cached:                   83.3 MiB             645
  parity:                   25.9 GiB          106147
  stripe:                    335 GiB         1374303       337 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd04 (device 4):            vdh              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011914
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77357       102 MiB
  cached:                   76.3 MiB             640
  parity:                   25.8 GiB          105698
  stripe:                    335 GiB         1374752       337 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd05 (device 5):            vdi              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011954
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77367       103 MiB
  cached:                   76.3 MiB             590
  parity:                   25.8 GiB          105733
  stripe:                    335 GiB         1374717       326 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd06 (device 6):            vdj              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011931
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77375       111 MiB
  cached:                   76.1 MiB             605
  parity:                   25.8 GiB          105714
  stripe:                    335 GiB         1374736       324 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd07 (device 7):            vdk              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011987
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77350       106 MiB
  cached:                   71.8 MiB             574
  parity:                   25.8 GiB          105700
  stripe:                    335 GiB         1374750       335 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd08 (device 8):            vdl              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011957
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77368       107 MiB
  cached:                   76.6 MiB             586
  parity:                   25.8 GiB          105657
  stripe:                    335 GiB         1374793       332 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd09 (device 9):            vdm              rw
                                data         buckets    fragmented
  free:                      735 GiB         3012001
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77328       102 MiB
  cached:                   73.2 MiB             582
  parity:                   25.8 GiB          105672
  stripe:                    335 GiB         1374778       333 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd10 (device 10):           vdn              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011925
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77372       108 MiB
  cached:                   75.4 MiB             614
  parity:                   25.8 GiB          105678
  stripe:                    335 GiB         1374772       337 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd11 (device 11):           vdo              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011920
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77357       108 MiB
  cached:                   76.1 MiB             634
  parity:                   25.8 GiB          105670
  stripe:                    335 GiB         1374780       339 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd12 (device 12):           vdc              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011981
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77335       102 MiB
  cached:                   73.9 MiB             595
  parity:                   25.8 GiB          105735
  stripe:                    335 GiB         1374715       330 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

hdd.hdd13 (device 13):           vdb              rw
                                data         buckets    fragmented
  free:                      735 GiB         3011863
  sb:                       3.00 MiB              13       252 KiB
  journal:                  2.00 GiB            8192
  btree:                         0 B               0
  user:                     18.8 GiB           77422     100.0 MiB
  cached:                   77.0 MiB             626
  parity:                   25.8 GiB          105711
  stripe:                    335 GiB         1374739       340 MiB
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                 1.09 TiB         4578566

ssd.ssd01 (device 14):           vdq              rw
                                data         buckets    fragmented
  free:                     7.29 GiB           29863
  sb:                       3.00 MiB              13       252 KiB
  journal:                  1.46 GiB            5961
  btree:                    23.4 GiB           95731
  user:                     67.6 GiB          276783      10.1 MiB
  cached:                   86.5 GiB          354775
  parity:                        0 B               0
  stripe:                        0 B               2
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                  186 GiB          763128

ssd.ssd02 (device 15):           vdp              rw
                                data         buckets    fragmented
  free:                     7.29 GiB           14930
  sb:                       3.00 MiB               7       508 KiB
  journal:                  1.46 GiB            2980
  btree:                    22.2 GiB           72978      13.4 GiB
  user:                     67.6 GiB          138410      19.4 MiB
  cached:                   74.2 GiB          152258
  parity:                        0 B               0
  stripe:                        0 B               1
  need_gc_gens:                  0 B               0
  need_discard:                  0 B               0
  capacity:                  186 GiB          381564
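The usage figures above can be sanity-checked with some quick arithmetic. The sketch below multiplies bucket counts by bucket size and compares parity against user data in the 13/14 stripes. Assumptions: the 256 KiB bucket size reported by show-super for the HDDs also applies to vdq, vdp uses 512 KiB buckets (inferred from it having exactly half as many buckets for the same capacity), and "13/14" means 13 data blocks plus 1 parity block per stripe.

```python
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4


def capacity_bytes(buckets: int, bucket_size_bytes: int) -> int:
    """Device capacity implied by a bucket count and a bucket size."""
    return buckets * bucket_size_bytes


# Bucket counts copied from the per-device sections above.
hdd = capacity_bytes(4578566, 256 * KIB)      # each HDD
ssd_vdq = capacity_bytes(763128, 256 * KIB)   # ASSUMPTION: 256 KiB buckets
ssd_vdp = capacity_bytes(381564, 512 * KIB)   # ASSUMPTION: 512 KiB buckets

print(f"HDD capacity: {hdd / TIB:.2f} TiB")
print(f"vdq capacity: {ssd_vdq / GIB:.1f} GiB")
print(f"vdp capacity: {ssd_vdp / GIB:.1f} GiB")

# Erasure-coded user data in the 13/14 stripes, as reported by fs usage.
user_13_of_14 = 4.58 * TIB
# With 13 data blocks per parity block, parity should be about 1/13 of the data.
expected_parity = user_13_of_14 / 13
print(f"expected parity: {expected_parity / GIB:.0f} GiB")
```

These match the reported 1.09 TiB and 186 GiB capacities and the 361 GiB parity line, so the accounting looks internally consistent and the erasure-coded stripes on the HDDs appear to be laid out as configured.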

bcachefs show-super

External UUID:                              2353ad4f-f54a-4a6d-b838-596270f9eebc
Internal UUID:                              d649b677-ad45-46e0-8203-4259fb360d13
Magic number:                               c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index:                               0
Label:
Version:                                    1.3: rebalance_work
Version upgrade complete:                   1.3: rebalance_work
Oldest version on disk:                     1.3: rebalance_work
Created:                                    Tue Feb 27 20:45:42 2024
Sequence number:                            293
Time of last write:                         Wed Mar  6 12:40:27 2024
Superblock size:                            16.2 KiB/1.00 MiB
Clean:                                      0
Devices:                                    16
Sections:                                   members_v1,crypt,disk_groups,clean,replicas,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features:                                   lz4,gzip,zstd,ec,journal_seq_blacklist_v3,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features:                            alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done

Options:
  block_size:                               512 B
  btree_node_size:                          256 KiB
  errors:                                   continue [ro] panic
  metadata_replicas:                        2
  data_replicas:                            2
  metadata_replicas_required:               1
  data_replicas_required:                   1
  encoded_extent_max:                       64.0 KiB
  metadata_checksum:                        none [crc32c] crc64 xxhash
  data_checksum:                            none [crc32c] crc64 xxhash
  compression:                              none
  background_compression:                   lz4:7
  str_hash:                                 crc32c crc64 [siphash]
  metadata_target:                          ssd
  foreground_target:                        ssd
  background_target:                        hdd
  promote_target:                           ssd
  erasure_code:                             1
  inodes_32bit:                             1
  shard_inode_numbers:                      1
  inodes_use_key_cache:                     1
  gc_reserve_percent:                       8
  gc_reserve_bytes:                         0 B
  root_reserve_percent:                     0
  wide_macs:                                0
  acl:                                      1
  usrquota:                                 0
  grpquota:                                 0
  prjquota:                                 0
  journal_flush_delay:                      1000
  journal_flush_disabled:                   0
  journal_reclaim_delay:                    100
  journal_transaction_names:                1
  version_upgrade:                          [compatible] incompatible none
  nocow:                                    0

members_v2 (size 2064):
Device:                                     0
  Label:                                    hdd01 (1)
  UUID:                                     338f5d85-7d64-40c5-bf64-d0132b315a94
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 btree,user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     1
  Label:                                    hdd02 (2)
  UUID:                                     48390588-ea0f-408c-8f45-a4cb1f5548da
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     2
  Label:                                    hdd03 (3)
  UUID:                                     9d41fd7b-e025-4d02-a27f-eb975483b1a6
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     3
  Label:                                    hdd03 (3)
  UUID:                                     3e98488a-8dcc-4cd0-a4b7-1a6bcfd1a586
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     4
  Label:                                    hdd04 (4)
  UUID:                                     06d8c0a7-e7e8-4308-9cd3-abcdd1e2f81e
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     5
  Label:                                    hdd05 (5)
  UUID:                                     524d274b-5fde-4e34-93da-92503887fdab
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     6
  Label:                                    hdd06 (6)
  UUID:                                     e1e75a95-66f7-47c9-a060-035f1d123166
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     7
  Label:                                    hdd07 (7)
  UUID:                                     e6f48e08-33df-4cfa-b97d-cc28c37d5485
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     8
  Label:                                    hdd08 (8)
  UUID:                                     6307e9ae-d9fd-4c55-bbfa-6620c1af19ba
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     9
  Label:                                    hdd09 (9)
  UUID:                                     e31223ef-3cbf-45b8-a6b7-8ad233149f17
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     10
  Label:                                    hdd10 (10)
  UUID:                                     c5da352b-5702-4f75-b278-4ac1c6e9a489
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     11
  Label:                                    hdd11 (11)
  UUID:                                     c2ac602d-5981-44be-bd7e-459a053d3bda
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     12
  Label:                                    hdd12 (12)
  UUID:                                     669b7c16-8b12-45c6-a13d-ebe396ce4b04
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     13
  Label:                                    hdd13 (13)
  UUID:                                     4fbad3dc-1fcf-4c7e-b9d1-357b9d692dfa
  Size:                                     1.09 TiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  4578566
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     14
  Label:                                    ssd01 (15)
  UUID:                                     f3420167-2aec-4bd4-945e-a86ba21ef3a9
  Size:                                     186 GiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              256 KiB
  First bucket:                             0
  Buckets:                                  763128
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,btree,user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1
Device:                                     15
  Label:                                    ssd02 (16)
  UUID:                                     23476d10-959e-497c-905a-f41ea700f3da
  Size:                                     186 GiB
  read errors:                              0
  write errors:                             0
  checksum errors:                          0
  seqread iops:                             0
  seqwrite iops:                            0
  randread iops:                            0
  randwrite iops:                           0
  Bucket size:                              512 KiB
  First bucket:                             0
  Buckets:                                  381564
  Last mount:                               Sun Mar 10 16:53:57 2024
  Last superblock write:                    270
  State:                                    rw
  Data allowed:                             journal,btree,user
  Has data:                                 journal,btree,user,cached,parity
  Durability:                               1
  Discard:                                  0
  Freespace initialized:                    1

errors (size 248):
fs_usage_data_wrong                         3               Wed Mar  6 12:20:52 2024
fs_usage_cached_wrong                       3               Wed Mar  6 12:20:54 2024
fs_usage_replicas_wrong                     17              Wed Mar  6 12:20:57 2024
dev_usage_buckets_wrong                     8               Sun Mar  3 12:52:48 2024
dev_usage_sectors_wrong                     25              Wed Mar  6 12:20:52 2024
dev_usage_fragmented_wrong                  21              Wed Mar  6 12:20:52 2024
dev_usage_buckets_ec_wrong                  16              Thu Mar  7 17:27:26 2024
alloc_key_data_type_wrong                   4               Sun Mar  3 12:52:39 2024
alloc_key_dirty_sectors_wrong               9809            Wed Mar  6 12:20:45 2024
alloc_key_cached_sectors_wrong              8887            Wed Mar  6 12:20:50 2024
lru_entry_bad                               4               Sun Mar  3 12:53:47 2024
ptr_to_missing_backpointer                  11737           Thu Mar  7 17:30:58 2024
ptr_to_missing_replicas_entry               5               Sun Mar  3 12:49:50 2024
stale_dirty_ptr                             13              Sun Mar  3 12:49:50 2024
stripe_sector_count_wrong                   2476            Thu Mar  7 17:27:14 2024
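
As a quick sanity check on the member listing above, each device's reported size should equal its bucket count times its bucket size. The sketch below copies three rows from the listing (the first HDD and both SSDs); the helper name `capacity_bytes` is made up for illustration and is not part of any bcachefs tool.

```python
KIB = 1024

# (label, bucket size in KiB, bucket count), copied from the members_v2 listing
devices = [
    ("hdd01", 256, 4578566),
    ("ssd01", 256, 763128),
    ("ssd02", 512, 381564),
]


def capacity_bytes(bucket_size_kib, buckets):
    """Return the device capacity implied by bucket size and bucket count."""
    return bucket_size_kib * KIB * buckets


for label, bucket_kib, buckets in devices:
    size = capacity_bytes(bucket_kib, buckets)
    print(f"{label}: {size / 1024**3:.2f} GiB = {size / 1024**4:.2f} TiB")
```

The two SSDs come out at the same capacity even though ssd02 uses 512 KiB buckets: it has half as many of them.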

dev-0/alloc_debug (HDD0)

                 buckets         sectors      fragmented
free             3002065               0               0
sb                    13            6152             504
journal             8192         4194304               0
btree               4608         2359296               0
user               77129        39279660          210451
cached              6109         1951769               0
parity            105112        53817344               0
stripe           1375338       703463461          672380
need_gc_gens           0               0               0
need_discard           0               0               0
ec               1480450

reserves:
stripe            143136
normal             71596
copygc                56
btree                 28
btree_copygc           0
reclaim                0

freelist_wait           empty
open buckets allocated  36
open buckets this dev   0
open buckets total      1024
open_buckets_wait       empty
open_buckets_btree      2
open_buckets_user       32
buckets_to_invalidate   0
btree reserve cache     1

dev-14/alloc_debug (SSD0)

                 buckets         sectors      fragmented
free               29863               0               0
sb                    13            6152             504
journal             5961         3052032               0
btree              95731        49014272               0
user              276783       141692181           20715
cached            354775       181470149               0
parity                 0               0               0
stripe                 2               0               0
need_gc_gens           0               0               0
need_discard           0               0               0
ec                     2

reserves:
stripe             23902
normal             11979
copygc                56
btree                 28
btree_copygc           0
reclaim                0

freelist_wait           empty
open buckets allocated  36
open buckets this dev   8
open buckets total      1024
open_buckets_wait       empty
open_buckets_btree      2
open_buckets_user       32
buckets_to_invalidate   0
btree reserve cache     1
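
For reading the alloc_debug tables above: assuming "sectors" are 512-byte units (an assumption, though it matches the journal showing as 2.00 GiB in `bcachefs fs usage`), a 256 KiB bucket holds 512 sectors. The sketch below checks which rows are completely full, i.e. have `buckets * 512 == sectors`. The rows are copied from the two tables; the helper name `is_fully_used` is made up for illustration.

```python
SECTOR_BYTES = 512           # assumption: sectors are 512-byte units
BUCKET_BYTES = 256 * 1024    # "Bucket size: 256 KiB" on dev-0 and dev-14
SECTORS_PER_BUCKET = BUCKET_BYTES // SECTOR_BYTES

# (device, data type, buckets, sectors), copied from the alloc_debug tables
rows = [
    ("dev-0", "journal", 8192, 4194304),
    ("dev-0", "btree", 4608, 2359296),
    ("dev-0", "parity", 105112, 53817344),
    ("dev-14", "journal", 5961, 3052032),
    ("dev-14", "btree", 95731, 49014272),
]


def is_fully_used(buckets, sectors):
    """True when every sector of every bucket is accounted for."""
    return buckets * SECTORS_PER_BUCKET == sectors


for dev, kind, buckets, sectors in rows:
    gib = sectors * SECTOR_BYTES / 1024**3
    print(f"{dev} {kind}: full={is_fully_used(buckets, sectors)} ({gib:.2f} GiB)")
```

The `user` and `stripe` rows are left out on purpose: their buckets are only partly filled, which is what the `fragmented` column accounts for.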

Kernel config bcachefs options

CONFIG_BCACHEFS_FS=m
CONFIG_BCACHEFS_QUOTA=y
CONFIG_BCACHEFS_ERASURE_CODING=y
CONFIG_BCACHEFS_POSIX_ACL=y
CONFIG_BCACHEFS_DEBUG_TRANSACTIONS=y
# CONFIG_BCACHEFS_DEBUG is not set
# CONFIG_BCACHEFS_TESTS is not set
# CONFIG_BCACHEFS_LOCK_TIME_STATS is not set
# CONFIG_BCACHEFS_NO_LATENCY_ACCT is not set

Libvirt device configuration sample

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/disk/by-id/scsi-SHGST_HUC101212CSS600_L0JWX1KJ' index='17'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </disk>
koverstreet commented 6 months ago

Can you faddr2line this?

LZ4HC_compress_generic+0x37f/0x1ac0
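
For anyone unfamiliar with the notation: the string above is `symbol+offset/size`, the form the kernel prints in stack traces and the form `scripts/faddr2line` in the kernel tree accepts together with the object file (here that would be the `lz4hc_compress` module). A minimal sketch of what the fields mean; the function name `parse_symbol_ref` is made up for illustration.

```python
import re

SYMBOL_REF = re.compile(
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)


def parse_symbol_ref(ref):
    """Split 'func+0xOFF/0xSIZE' into (symbol, offset, size) with integer fields."""
    match = SYMBOL_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a symbol+offset/size reference: {ref!r}")
    return match["symbol"], int(match["offset"], 16), int(match["size"], 16)


symbol, offset, size = parse_symbol_ref("LZ4HC_compress_generic+0x37f/0x1ac0")
print(f"{symbol}: byte {offset} of {size}")
```

So the faulting instruction sits 895 bytes into a 6848-byte function, which is why a source line from faddr2line is needed to tell which copy loop it belongs to.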

On Sun, Mar 10, 2024 at 3:54 AM Emily Scarlett @.***> wrote:

I'm running a Gentoo machine on Kernel 6.7.6 (from Gentoo-sources) with a 2-tier bcachefs filesystem (erasure coding, 2-replicas, encryption, lz4 compression), and every now and then the bch-rebalance kernel thread on my fs crashes, which renders writing to the background target inoperable.

However, reads from all devices are perfectly functional and writes to the foreground target work fine (all functions, even clearing cache to free space for incoming data). There is no loss of btree, journal, or other functionality; only the automated flush from the foreground to the background target stops.

I can somewhat reliably replicate this when I have multiple processes writing to the FS at one time (multiple samba connections, lftp, et al). It tends to happen when I'm pushing the write speed of the foreground SSDs to their limit (700MB/s or so per SSD). However, it does not happen when one or a few processes are writing at once, only when there are many.

After this happens, any attempt to unmount the filesystem (It is not root) results in a hung umount and zombified process. The best way to resolve the issue is to reboot the VM, waiting for systemd to forcibly kill the processes and threads itself. This is the cleanest way to shut down, but even then the occasional fsck shows up some small but fixable errors.

The machine itself is a VM running under KVM/Libvirt on a Rocky Linux 9.3 host, with each device passed through as a raw block device on the virtio bus (no caching, type raw, io native). I've included a segment of the VM's libvirt configuration XML to show how it is configured.

My configuration is 2 (200GB) foreground SSDs, 14 (1.2TB) background HDDs. Filesystem/Mount options: metadata_replicas=2,data_replicas=2,background_compression=lz4:7,metadata_target=ssd,foreground_target=ssd,background_target=hdd,promote_target=ssd,erasure_code,verbose

Dmesg announcing thread crash:

[ 2435.684750] #PF: supervisor write access in kernel mode
[ 2435.684788] #PF: error_code(0x0002) - not-present page
[ 2435.684820] PGD 100000067 P4D 100000067 PUD 1001ee067 PMD 16c7b1067 PTE 0
[ 2435.684886] Oops: 0002 [#1] PREEMPT SMP PTI
[ 2435.684920] CPU: 0 PID: 788 Comm: bch-rebalance/2 Not tainted 6.7.6-gentoo-x86_64 #2
[ 2435.684957] Hardware name: Red Hat KVM/RHEL, BIOS edk2-20230524-4.el9_3 05/24/2023
[ 2435.684995] RIP: 0010:LZ4HC_compress_generic+0x37f/0x1ac0 [lz4hc_compress]
[ 2435.685038] Code: 00 83 f9 0e 0f 8f cb 15 00 00 48 8b 7c 24 10 89 ca c1 e2 04 88 17 48 8b 4c 24 38 48 8d 14 30 48 8b 31 48 83 c0 08 48 83 c1 08 <48> 89 70 f8 48 39 d0 72 ec 48 8b 44 24 20 48 8b 7c 24 50 48 83 c2
[ 2435.685103] RSP: 0018:ffffb9cb8114f6e0 EFLAGS: 00010296
[ 2435.685137] RAX: ffffb9cb8dc2d001 RBX: ffffb9cb8dc1d000 RCX: ffffb9cb8dc3df00
[ 2435.685169] RDX: ffffb9cb8dc2cffa RSI: fbc66285b18817ff RDI: ffffb9cb8dc1d000
[ 2435.685201] RBP: 0000000000010000 R08: ffff97f6b4620000 R09: 0000000000010000
[ 2435.685233] R10: ffffb9cb8dc1d000 R11: 00000000c66285b1 R12: ffff97f6b4620000
[ 2435.685265] R13: 0000000000000000 R14: ffffb9cb8dc3def9 R15: ffffb9cb8dc3defd
[ 2435.685297] FS: 0000000000000000(0000) GS:ffff97ff3fa00000(0000) knlGS:0000000000000000
[ 2435.685332] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2435.685367] CR2: ffffb9cb8dc2d000 CR3: 00000001be806002 CR4: 0000000000170ef0
[ 2435.685402] Call Trace:
[ 2435.685429] <TASK>
[ 2435.685458] ? __die+0x23/0x70
[ 2435.685496] ? page_fault_oops+0x15d/0x440
[ 2435.685529] ? fixup_exception+0x26/0x310
[ 2435.685580] ? exc_page_fault+0x16a/0x170
[ 2435.685626] ? asm_exc_page_fault+0x26/0x30
[ 2435.685661] ? LZ4HC_compress_generic+0x37f/0x1ac0 [lz4hc_compress]
[ 2435.685698] LZ4_compress_HC+0x7b/0x90 [lz4hc_compress]
[ 2435.685734] attempt_compress+0x1e6/0x200 [bcachefs]
[ 2435.685944] ? __get_free_pages+0x11/0x40
[ 2435.685978] ? mempool_alloc_vp+0x2f/0x50 [bcachefs]
[ 2435.686062] ? mempool_alloc+0x66/0x1a0
[ 2435.686097] bch2_bio_compress+0x22c/0x4c0 [bcachefs]
[ 2435.686183] bch2_write+0x122f/0x1340 [bcachefs]
[ 2435.686281] ? bch2_increment_clock+0x2d/0x140 [bcachefs]
[ 2435.686365] ? _raw_spin_unlock+0xe/0x30
[ 2435.686397] ? bch2_write+0x2c4/0x450 [bcachefs]
[ 2435.686486] ? bch2_moving_ctxt_do_pending_writes+0xea/0x120 [bcachefs]
[ 2435.686600] bch2_moving_ctxt_do_pending_writes+0xea/0x120 [bcachefs]
[ 2435.686698] bch2_move_ratelimit+0x1b4/0x410 [bcachefs]
[ 2435.686791] ? __pfx_autoremove_wake_function+0x10/0x10
[ 2435.686839] do_rebalance+0x13a/0x830 [bcachefs]
[ 2435.686959] ? kvm_sched_clock_read+0x11/0x20
[ 2435.687448] ? local_clock_noinstr+0xd/0xb0
[ 2435.687809] ? bch2_trans_get+0x303/0x360 [bcachefs]
[ 2435.688171] ? __pfx_bch2_rebalance_thread+0x10/0x10 [bcachefs]
[ 2435.688523] bch2_rebalance_thread+0x57/0xa0 [bcachefs]
[ 2435.688913] ? bch2_rebalance_thread+0x4d/0xa0 [bcachefs]
[ 2435.689275] ? __pfx_closure_sync_fn+0x10/0x10
[ 2435.689583] kthread+0xe8/0x120
[ 2435.689868] ? __pfx_kthread+0x10/0x10
[ 2435.690125] ret_from_fork+0x34/0x50
[ 2435.690416] ? __pfx_kthread+0x10/0x10
[ 2435.690735] ret_from_fork_asm+0x1b/0x30
[ 2435.691018] </TASK>
[ 2435.691255] Modules linked in: poly1305_generic libpoly1305 poly1305_x86_64 chacha_generic chacha_x86_64 libchacha bcachefs crc64 lz4hc_compress lz4_compress xor raid6_pq intel_rapl_msr intel_rapl_common vfat rapl fat iTCO_wdt iTCO_vendor_support i2c_i801 pcspkr lpc_ich i2c_smbus mfd_core virtio_balloon joydev drm backlight fuse loop efi_pstore i2c_core dm_mod configfs nfnetlink xfs sr_mod cdrom crct10dif_pclmul crc32_pclmul libcrc32c crc32c_intel ghash_clmulni_intel sha512_ssse3 xhci_pci xhci_pci_renesas virtio_net ahci sha256_ssse3 net_failover xhci_hcd sha1_ssse3 libahci failover serio_raw efivarfs qemu_fw_cfg virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio_rng aesni_intel crypto_simd cryptd
[ 2435.692732] CR2: ffffb9cb8dc2d000
[ 2435.693016] ---[ end trace 0000000000000000 ]---
[ 2437.015754] RIP: 0010:LZ4HC_compress_generic+0x37f/0x1ac0 [lz4hc_compress]
[ 2437.017693] Code: 00 83 f9 0e 0f 8f cb 15 00 00 48 8b 7c 24 10 89 ca c1 e2 04 88 17 48 8b 4c 24 38 48 8d 14 30 48 8b 31 48 83 c0 08 48 83 c1 08 <48> 89 70 f8 48 39 d0 72 ec 48 8b 44 24 20 48 8b 7c 24 50 48 83 c2
[ 2437.018456] RSP: 0018:ffffb9cb8114f6e0 EFLAGS: 00010296
[ 2437.018868] RAX: ffffb9cb8dc2d001 RBX: ffffb9cb8dc1d000 RCX: ffffb9cb8dc3df00
[ 2437.019185] RDX: ffffb9cb8dc2cffa RSI: fbc66285b18817ff RDI: ffffb9cb8dc1d000
[ 2437.019485] RBP: 0000000000010000 R08: ffff97f6b4620000 R09: 0000000000010000
[ 2437.019797] R10: ffffb9cb8dc1d000 R11: 00000000c66285b1 R12: ffff97f6b4620000
[ 2437.020095] R13: 0000000000000000 R14: ffffb9cb8dc3def9 R15: ffffb9cb8dc3defd
[ 2437.020421] FS: 0000000000000000(0000) GS:ffff97ff3fa00000(0000) knlGS:0000000000000000
[ 2437.020843] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2437.021268] CR2: ffffb9cb8dc2d000 CR3: 00000001be806002 CR4: 0000000000170ef0
[ 2437.021735] note: bch-rebalance/2[788] exited with irqs disabled
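
The register values in the oops above already say a fair amount about the faulting store. The bytes marked in the `Code:` line, `<48> 89 70 f8`, decode to `mov %rsi,-0x8(%rax)`, an 8-byte store. The sketch below redoes the address arithmetic with the values copied from the dump. Treating RDI as the start of the compression output buffer and RBP as its length is only a guess on my part (0x10000 is 64 KiB, the same as `encoded_extent_max`), not something the trace proves.

```python
# Register values copied from the oops
RAX = 0xFFFFB9CB8DC2D001
RDX = 0xFFFFB9CB8DC2CFFA
RDI = 0xFFFFB9CB8DC1D000   # guess: start of the output buffer
RBP = 0x10000              # guess: output buffer length (64 KiB)
CR2 = 0xFFFFB9CB8DC2D000   # faulting address reported by the CPU

STORE_WIDTH = 8            # mov %rsi,-0x8(%rax) writes 8 bytes

store_first = RAX - 8                        # first byte written by the store
store_last = store_first + STORE_WIDTH - 1   # last byte written by the store
guessed_buffer_end = RDI + RBP               # one past the guessed buffer

print(f"store covers {store_first:#x}..{store_last:#x}")
print(f"last byte == CR2:      {store_last == CR2}")
print(f"guessed end == CR2:    {guessed_buffer_end == CR2}")
print(f"RDX inside the buffer: {RDI <= RDX < guessed_buffer_end}")
```

If the guess about RDI and RBP holds, the store starts 7 bytes before the end of a 64 KiB buffer and its last byte lands on the first byte of the next, unmapped page, which is exactly the CR2 address. The loop bound in RDX is still inside the buffer, so this looks like an 8-bytes-at-a-time copy running slightly past its target rather than a wild pointer. faddr2line should confirm which copy it is.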

bcachefs fs usage:

Filesystem: 2353ad4f-f54a-4a6d-b838-596270f9eebc Size: 14.4 TiB Used: 5.40 TiB Online reserved: 23.0 KiB

Data type Required/total Durability Devices
btree: 1/2 2 [vde vdq] 2.25 GiB
btree: 1/2 2 [vdq vdp] 44.5 GiB
user: 1/2 2 [vdi vdk] 581 MiB
user: 1/2 2 [vdd vdb] 5.31 GiB
user: 1/2 2 [vdo vdc] 518 MiB
user: 1/2 2 [vde vdm] 1.83 GiB
user: 1/2 2 [vdg vdl] 2.80 GiB
user: 1/2 2 [vdk vdn] 319 MiB
user: 1/2 2 [vde vdd] 3.27 GiB
user: 1/2 2 [vdd vdi] 2.41 GiB
user: 1/2 2 [vdf vdn] 2.09 GiB
user: 1/2 2 [vdh vdk] 3.72 GiB
user: 1/2 2 [vdj vdl] 3.79 GiB
user: 1/2 2 [vdl vdb] 336 MiB
user: 1/1 1 [vde] 512 B
user: 1/2 2 [vde vdi] 1.96 GiB
user: 1/2 2 [vde vdb] 5.41 GiB
user: 1/2 2 [vdd vdm] 1.49 GiB
user: 1/2 2 [vdf vdj] 3.52 GiB
user: 1/2 2 [vdg vdh] 3.43 GiB
user: 1/2 2 [vdg vdc] 4.93 GiB
user: 1/2 2 [vdh vdo] 9.17 GiB
user: 1/2 2 [vdi vdo] 5.20 GiB
user: 1/2 2 [vdj vdc] 459 MiB
user: 1/2 2 [vdl vdm] 9.40 GiB
user: 1/2 2 [vdm vdb] 2.69 GiB
user: 1/1 1 [vdj] 30.5 KiB
user: 1/2 2 [vde vdg] 3.86 GiB
user: 1/2 2 [vde vdk] 2.43 GiB
user: 1/2 2 [vde vdo] 1.51 GiB
user: 1/2 2 [vdd vdg] 5.60 GiB
user: 1/2 2 [vdd vdk] 3.53 GiB
user: 1/2 2 [vdd vdo] 1.20 GiB
user: 1/2 2 [vdf vdh] 501 MiB
user: 1/2 2 [vdf vdl] 1.19 GiB
user: 1/2 2 [vdf vdc] 4.80 GiB
user: 1/2 2 [vdg vdj] 3.66 GiB
user: 1/2 2 [vdg vdn] 309 MiB
user: 1/2 2 [vdh vdi] 628 MiB
user: 1/2 2 [vdh vdm] 1.22 GiB
user: 1/2 2 [vdh vdb] 5.19 GiB
user: 1/2 2 [vdi vdm] 556 MiB
user: 1/2 2 [vdi vdb] 322 MiB
user: 1/2 2 [vdj vdn] 4.51 GiB
user: 1/2 2 [vdk vdl] 591 MiB
user: 1/2 2 [vdk vdc] 7.16 GiB
user: 1/2 2 [vdl vdo] 1.59 GiB
user: 1/2 2 [vdm vdo] 8.98 GiB
user: 1/2 2 [vdn vdc] 616 MiB
user: 1/2 2 [vdc vdb] 2.39 GiB
user: 1/1 1 [vdd] 64.0 KiB
user: 1/1 1 [vdm] 1.50 KiB
user: 1/2 2 [vde vdf] 3.56 GiB
user: 1/2 2 [vde vdh] 2.97 GiB
user: 1/2 2 [vde vdj] 2.80 GiB
user: 1/2 2 [vde vdl] 2.29 GiB
user: 1/2 2 [vde vdn] 987 MiB
user: 1/2 2 [vde vdc] 4.71 GiB
user: 1/2 2 [vdd vdf] 3.88 GiB
user: 1/2 2 [vdd vdh] 3.38 GiB
user: 1/2 2 [vdd vdj] 2.88 GiB
user: 1/2 2 [vdd vdl] 1.80 GiB
user: 1/2 2 [vdd vdn] 1.43 GiB
user: 1/2 2 [vdd vdc] 1.67 GiB
user: 1/2 2 [vdf vdg] 2.69 GiB
user: 1/2 2 [vdf vdi] 3.98 GiB
user: 1/2 2 [vdf vdk] 4.74 GiB
user: 1/2 2 [vdf vdm] 3.45 GiB
user: 1/2 2 [vdf vdo] 583 MiB
user: 1/2 2 [vdf vdb] 2.83 GiB
user: 1/2 2 [vdg vdi] 629 MiB
user: 1/2 2 [vdg vdk] 3.03 GiB
user: 1/2 2 [vdg vdm] 1.28 GiB
user: 1/2 2 [vdg vdo] 1.56 GiB
user: 1/2 2 [vdg vdb] 4.16 GiB
user: 1/2 2 [vdh vdj] 1.82 GiB
user: 1/2 2 [vdh vdl] 3.93 GiB
user: 1/2 2 [vdh vdn] 1.45 GiB
user: 1/2 2 [vdh vdc] 334 MiB
user: 1/2 2 [vdi vdj] 562 MiB
user: 1/2 2 [vdi vdl] 511 MiB
user: 1/2 2 [vdi vdn] 14.4 GiB
user: 1/2 2 [vdi vdc] 6.07 GiB
user: 1/2 2 [vdj vdk] 5.03 GiB
user: 1/2 2 [vdj vdm] 827 MiB
user: 1/2 2 [vdj vdo] 5.56 GiB
user: 1/2 2 [vdj vdb] 2.32 GiB
user: 1/2 2 [vdk vdm] 632 MiB
user: 1/2 2 [vdk vdo] 464 MiB
user: 1/2 2 [vdk vdb] 5.56 GiB
user: 1/2 2 [vdl vdn] 9.27 GiB
user: 1/2 2 [vdl vdc] 260 MiB
user: 1/2 2 [vdm vdn] 1.52 GiB
user: 1/2 2 [vdm vdc] 3.90 GiB
user: 1/2 2 [vdn vdo] 509 MiB
user: 1/2 2 [vdn vdb] 337 MiB
user: 1/2 2 [vdo vdb] 960 MiB
user: 1/2 2 [vdq vdp] 135 GiB
user: 13/14 14 [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb] 4.58 TiB
user: 14/15 15 [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdp] 2.75 MiB
user: 14/15 15 [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdq] 6.50 MiB
cached: 1/1 1 [vdi] 76.3 MiB
cached: 1/1 1 [vdb] 77.0 MiB
cached: 1/1 1 [vdd] 73.3 MiB
cached: 1/1 1 [vdm] 73.2 MiB
parity: 14/15 15 [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdq] 512 KiB
cached: 1/1 1 [vdg] 83.3 MiB
cached: 1/1 1 [vdk] 71.8 MiB
cached: 1/1 1 [vdo] 76.1 MiB
cached: 1/1 1 [vdp] 74.2 GiB
parity: 14/15 15 [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb vdp] 256 KiB
cached: 1/1 1 [vde] 953 MiB
cached: 1/1 1 [vdf] 87.5 MiB
cached: 1/1 1 [vdh] 76.3 MiB
cached: 1/1 1 [vdj] 76.1 MiB
cached: 1/1 1 [vdl] 76.6 MiB
cached: 1/1 1 [vdn] 75.4 MiB
cached: 1/1 1 [vdc] 73.9 MiB
cached: 1/1 1 [vdq] 86.5 GiB
parity: 13/14 14 [vde vdd vdf vdg vdh vdi vdj vdk vdl vdm vdn vdo vdc vdb] 361 GiB
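
The 13/14 rows in this listing can be cross-checked with a little arithmetic. The following is only a rough sketch: it assumes that `13/14` means each stripe holds 13 data blocks plus 1 parity block (an interpretation of the Required/total column, not something the output states), and the function name is made up for illustration.

```python
TIB_IN_GIB = 1024

def expected_parity_gib(user_tib: float, data_blocks: int, parity_blocks: int) -> float:
    """Parity space in GiB implied by the striped user data, assuming full stripes."""
    return user_tib * TIB_IN_GIB * parity_blocks / data_blocks

# Figures taken from the 13/14 rows: 4.58 TiB of user data, 361 GiB of parity.
estimate = expected_parity_gib(4.58, data_blocks=13, parity_blocks=1)
print(f"expected parity: {estimate:.0f} GiB")
```

Under that assumption the estimate lands within rounding of the reported 361 GiB, which suggests the striped data and parity accounting agree with each other.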

hdd.hdd01 (device 0): vde rw
                data        buckets   fragmented
free:           733 GiB     3002065
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          1.13 GiB    4608
user:           18.7 GiB    77129     103 MiB
cached:         953 MiB     6109
parity:         25.7 GiB    105112
stripe:         335 GiB     1375338   328 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd02 (device 1): vdd rw
                data        buckets   fragmented
free:           735 GiB     3011666
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.9 GiB    77657     106 MiB
cached:         73.3 MiB    588
parity:         25.9 GiB    106092
stripe:         335 GiB     1374358   331 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd03 (device 2): vdf rw
                data        buckets   fragmented
free:           735 GiB     3011677
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77544     104 MiB
cached:         87.5 MiB    690
parity:         25.9 GiB    106131
stripe:         335 GiB     1374319   334 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd03 (device 3): vdg rw
                data        buckets   fragmented
free:           735 GiB     3011500
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.9 GiB    77766     103 MiB
cached:         83.3 MiB    645
parity:         25.9 GiB    106147
stripe:         335 GiB     1374303   337 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd04 (device 4): vdh rw
                data        buckets   fragmented
free:           735 GiB     3011914
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77357     102 MiB
cached:         76.3 MiB    640
parity:         25.8 GiB    105698
stripe:         335 GiB     1374752   337 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd05 (device 5): vdi rw
                data        buckets   fragmented
free:           735 GiB     3011954
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77367     103 MiB
cached:         76.3 MiB    590
parity:         25.8 GiB    105733
stripe:         335 GiB     1374717   326 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd06 (device 6): vdj rw
                data        buckets   fragmented
free:           735 GiB     3011931
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77375     111 MiB
cached:         76.1 MiB    605
parity:         25.8 GiB    105714
stripe:         335 GiB     1374736   324 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd07 (device 7): vdk rw
                data        buckets   fragmented
free:           735 GiB     3011987
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77350     106 MiB
cached:         71.8 MiB    574
parity:         25.8 GiB    105700
stripe:         335 GiB     1374750   335 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd08 (device 8): vdl rw
                data        buckets   fragmented
free:           735 GiB     3011957
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77368     107 MiB
cached:         76.6 MiB    586
parity:         25.8 GiB    105657
stripe:         335 GiB     1374793   332 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd09 (device 9): vdm rw
                data        buckets   fragmented
free:           735 GiB     3012001
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77328     102 MiB
cached:         73.2 MiB    582
parity:         25.8 GiB    105672
stripe:         335 GiB     1374778   333 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd10 (device 10): vdn rw
                data        buckets   fragmented
free:           735 GiB     3011925
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77372     108 MiB
cached:         75.4 MiB    614
parity:         25.8 GiB    105678
stripe:         335 GiB     1374772   337 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd11 (device 11): vdo rw
                data        buckets   fragmented
free:           735 GiB     3011920
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77357     108 MiB
cached:         76.1 MiB    634
parity:         25.8 GiB    105670
stripe:         335 GiB     1374780   339 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd12 (device 12): vdc rw
                data        buckets   fragmented
free:           735 GiB     3011981
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77335     102 MiB
cached:         73.9 MiB    595
parity:         25.8 GiB    105735
stripe:         335 GiB     1374715   330 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

hdd.hdd13 (device 13): vdb rw
                data        buckets   fragmented
free:           735 GiB     3011863
sb:             3.00 MiB    13        252 KiB
journal:        2.00 GiB    8192
btree:          0 B         0
user:           18.8 GiB    77422     100.0 MiB
cached:         77.0 MiB    626
parity:         25.8 GiB    105711
stripe:         335 GiB     1374739   340 MiB
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       1.09 TiB    4578566

ssd.ssd01 (device 14): vdq rw
                data        buckets   fragmented
free:           7.29 GiB    29863
sb:             3.00 MiB    13        252 KiB
journal:        1.46 GiB    5961
btree:          23.4 GiB    95731
user:           67.6 GiB    276783    10.1 MiB
cached:         86.5 GiB    354775
parity:         0 B         0
stripe:         0 B         2
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       186 GiB     763128

ssd.ssd02 (device 15): vdp rw
                data        buckets   fragmented
free:           7.29 GiB    14930
sb:             3.00 MiB    7         508 KiB
journal:        1.46 GiB    2980
btree:          22.2 GiB    72978     13.4 GiB
user:           67.6 GiB    138410    19.4 MiB
cached:         74.2 GiB    152258
parity:         0 B         0
stripe:         0 B         1
need_gc_gens:   0 B         0
need_discard:   0 B         0
capacity:       186 GiB     381564

bcachefs show-super

External UUID: 2353ad4f-f54a-4a6d-b838-596270f9eebc
Internal UUID: d649b677-ad45-46e0-8203-4259fb360d13
Magic number: c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index: 0
Label:
Version: 1.3: rebalance_work
Version upgrade complete: 1.3: rebalance_work
Oldest version on disk: 1.3: rebalance_work
Created: Tue Feb 27 20:45:42 2024
Sequence number: 293
Time of last write: Wed Mar 6 12:40:27 2024
Superblock size: 16.2 KiB/1.00 MiB
Clean: 0
Devices: 16
Sections: members_v1,crypt,disk_groups,clean,replicas,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features: lz4,gzip,zstd,ec,journal_seq_blacklist_v3,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features: alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done

Options:
  block_size: 512 B
  btree_node_size: 256 KiB
  errors: continue [ro] panic
  metadata_replicas: 2
  data_replicas: 2
  metadata_replicas_required: 1
  data_replicas_required: 1
  encoded_extent_max: 64.0 KiB
  metadata_checksum: none [crc32c] crc64 xxhash
  data_checksum: none [crc32c] crc64 xxhash
  compression: none
  background_compression: lz4:7
  str_hash: crc32c crc64 [siphash]
  metadata_target: ssd
  foreground_target: ssd
  background_target: hdd
  promote_target: ssd
  erasure_code: 1
  inodes_32bit: 1
  shard_inode_numbers: 1
  inodes_use_key_cache: 1
  gc_reserve_percent: 8
  gc_reserve_bytes: 0 B
  root_reserve_percent: 0
  wide_macs: 0
  acl: 1
  usrquota: 0
  grpquota: 0
  prjquota: 0
  journal_flush_delay: 1000
  journal_flush_disabled: 0
  journal_reclaim_delay: 100
  journal_transaction_names: 1
  version_upgrade: [compatible] incompatible none
  nocow: 0

members_v2 (size 2064):
Device: 0
  Label: hdd01 (1)
  UUID: 338f5d85-7d64-40c5-bf64-d0132b315a94
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: btree,user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 1
  Label: hdd02 (2)
  UUID: 48390588-ea0f-408c-8f45-a4cb1f5548da
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 2
  Label: hdd03 (3)
  UUID: 9d41fd7b-e025-4d02-a27f-eb975483b1a6
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 3
  Label: hdd03 (3)
  UUID: 3e98488a-8dcc-4cd0-a4b7-1a6bcfd1a586
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 4
  Label: hdd04 (4)
  UUID: 06d8c0a7-e7e8-4308-9cd3-abcdd1e2f81e
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 5
  Label: hdd05 (5)
  UUID: 524d274b-5fde-4e34-93da-92503887fdab
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 6
  Label: hdd06 (6)
  UUID: e1e75a95-66f7-47c9-a060-035f1d123166
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 7
  Label: hdd07 (7)
  UUID: e6f48e08-33df-4cfa-b97d-cc28c37d5485
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 8
  Label: hdd08 (8)
  UUID: 6307e9ae-d9fd-4c55-bbfa-6620c1af19ba
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 9
  Label: hdd09 (9)
  UUID: e31223ef-3cbf-45b8-a6b7-8ad233149f17
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 10
  Label: hdd10 (10)
  UUID: c5da352b-5702-4f75-b278-4ac1c6e9a489
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 11
  Label: hdd11 (11)
  UUID: c2ac602d-5981-44be-bd7e-459a053d3bda
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 12
  Label: hdd12 (12)
  UUID: 669b7c16-8b12-45c6-a13d-ebe396ce4b04
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 13
  Label: hdd13 (13)
  UUID: 4fbad3dc-1fcf-4c7e-b9d1-357b9d692dfa
  Size: 1.09 TiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 4578566
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 14
  Label: ssd01 (15)
  UUID: f3420167-2aec-4bd4-945e-a86ba21ef3a9
  Size: 186 GiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 256 KiB
  First bucket: 0
  Buckets: 763128
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: journal,btree,user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
Device: 15
  Label: ssd02 (16)
  UUID: 23476d10-959e-497c-905a-f41ea700f3da
  Size: 186 GiB
  read errors: 0
  write errors: 0
  checksum errors: 0
  seqread iops: 0
  seqwrite iops: 0
  randread iops: 0
  randwrite iops: 0
  Bucket size: 512 KiB
  First bucket: 0
  Buckets: 381564
  Last mount: Sun Mar 10 16:53:57 2024
  Last superblock write: 270
  State: rw
  Data allowed: journal,btree,user
  Has data: journal,btree,user,cached,parity
  Durability: 1
  Discard: 0
  Freespace initialized: 1
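
As a quick sanity check on the member entries, each device's reported size should equal its bucket count times its bucket size. A minimal sketch using only the figures in this listing; the helper name is hypothetical:

```python
KIB = 1024

def capacity_bytes(buckets: int, bucket_size_kib: int) -> int:
    """Device capacity implied by its bucket count and bucket size."""
    return buckets * bucket_size_kib * KIB

hdd = capacity_bytes(4578566, 256)    # any of the HDD members
ssd01 = capacity_bytes(763128, 256)   # ssd01, 256 KiB buckets
ssd02 = capacity_bytes(381564, 512)   # ssd02, 512 KiB buckets

print(f"hdd:   {hdd / 2**40:.2f} TiB")
print(f"ssd01: {ssd01 / 2**30:.0f} GiB")
print(f"ssd02: {ssd02 / 2**30:.0f} GiB")
```

The results match the reported 1.09 TiB and 186 GiB sizes. The two SSDs come out at exactly the same capacity even though ssd02 uses 512 KiB buckets and therefore has half as many of them.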

errors (size 248):
fs_usage_data_wrong 3 Wed Mar 6 12:20:52 2024
fs_usage_cached_wrong 3 Wed Mar 6 12:20:54 2024
fs_usage_replicas_wrong 17 Wed Mar 6 12:20:57 2024
dev_usage_buckets_wrong 8 Sun Mar 3 12:52:48 2024
dev_usage_sectors_wrong 25 Wed Mar 6 12:20:52 2024
dev_usage_fragmented_wrong 21 Wed Mar 6 12:20:52 2024
dev_usage_buckets_ec_wrong 16 Thu Mar 7 17:27:26 2024
alloc_key_data_type_wrong 4 Sun Mar 3 12:52:39 2024
alloc_key_dirty_sectors_wrong 9809 Wed Mar 6 12:20:45 2024
alloc_key_cached_sectors_wrong 8887 Wed Mar 6 12:20:50 2024
lru_entry_bad 4 Sun Mar 3 12:53:47 2024
ptr_to_missing_backpointer 11737 Thu Mar 7 17:30:58 2024
ptr_to_missing_replicas_entry 5 Sun Mar 3 12:49:50 2024
stale_dirty_ptr 13 Sun Mar 3 12:49:50 2024
stripe_sector_count_wrong 2476 Thu Mar 7 17:27:14 2024

dev-0/alloc_debug (HDD0)

                buckets     sectors     fragmented
free            3002065     0           0
sb              13          6152        504
journal         8192        4194304     0
btree           4608        2359296     0
user            77129       39279660    210451
cached          6109        1951769     0
parity          105112      53817344    0
stripe          1375338     703463461   672380
need_gc_gens    0           0           0
need_discard    0           0           0
ec              1480450

reserves:
stripe 143136
normal 71596
copygc 56
btree 28
btree_copygc 0
reclaim 0

freelist_wait empty
open buckets allocated 36
open buckets this dev 0
open buckets total 1024
open_buckets_wait empty
open_buckets_btree 2
open_buckets_user 32
buckets_to_invalidate 0
btree reserve cache 1
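
The bucket and sector counts in this dump can be related to each other with simple arithmetic. A rough sketch, assuming the counts are in 512-byte sectors and that this device uses 256 KiB buckets (as its superblock entry says); the helper name is made up for illustration:

```python
SECTOR_BYTES = 512                            # assumption: 512-byte sectors
BUCKET_SECTORS = 256 * 1024 // SECTOR_BYTES   # 256 KiB bucket = 512 sectors

def unused_sectors(buckets: int, sectors: int) -> int:
    """Sectors in the given buckets that are not covered by the sector count."""
    return buckets * BUCKET_SECTORS - sectors

print(unused_sectors(13, 6152))         # sb row
print(unused_sectors(8192, 4194304))    # journal row
print(unused_sectors(4608, 2359296))    # btree row
```

For the sb, journal and btree rows this reproduces the fragmented column (504, 0 and 0). The user and stripe rows do not match this simple formula exactly, so treat it as a rough consistency check rather than a description of how the counters are maintained.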

dev-14/alloc_debug (SSD0)

                buckets     sectors     fragmented
free            29863       0           0
sb              13          6152        504
journal         5961        3052032     0
btree           95731       49014272    0
user            276783      141692181   20715
cached          354775      181470149   0
parity          0           0           0
stripe          2           0           0
need_gc_gens    0           0           0
need_discard    0           0           0
ec              2

reserves:
stripe 23902
normal 11979
copygc 56
btree 28
btree_copygc 0
reclaim 0

freelist_wait empty
open buckets allocated 36
open buckets this dev 8
open buckets total 1024
open_buckets_wait empty
open_buckets_btree 2
open_buckets_user 32
buckets_to_invalidate 0
btree reserve cache 1

Kernel config bcachefs options

CONFIG_BCACHEFS_FS=m
CONFIG_BCACHEFS_QUOTA=y
CONFIG_BCACHEFS_ERASURE_CODING=y
CONFIG_BCACHEFS_POSIX_ACL=y
CONFIG_BCACHEFS_DEBUG_TRANSACTIONS=y
# CONFIG_BCACHEFS_DEBUG is not set
# CONFIG_BCACHEFS_TESTS is not set
# CONFIG_BCACHEFS_LOCK_TIME_STATS is not set
# CONFIG_BCACHEFS_NO_LATENCY_ACCT is not set

Libvirt device configuration sample

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/disk/by-id/scsi-SHGST_HUC101212CSS600_L0JWX1KJ' index='17'/>
  <backingStore/>
  <target dev='vdb' bus='virtio'/>
  <alias name='virtio-disk1'/>
  <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
</disk>

Wingar commented 6 months ago

It appears I don't have CONFIG_DEBUG_INFO enabled, so I can't run faddr2line right now. I'm rebuilding the kernel with it (no other changes). I had actually restarted between posting and your message, so I had to induce the failure again (thankfully I can reproduce it with some effort). The call trace is identical to the one I posted, so the fault appears to be the same each time.

I'll get the faddr2line output once the new kernel's up.

Wingar commented 6 months ago

Okay!

scripts/faddr2line ./lib/lz4/lz4hc_compress.ko LZ4HC_compress_generic+0x37f/0x1ac0

LZ4HC_compress_generic+0x37f/0x1ac0:
LZ4_copy8 at /usr/src/linux/lib/lz4/lz4defs.h:158
(inlined by) LZ4_wildCopy at /usr/src/linux/lib/lz4/lz4defs.h:180
(inlined by) LZ4HC_encodeSequence at /usr/src/linux/lib/lz4/lz4hc_compress.c:296
(inlined by) LZ4HC_compress_generic at /usr/src/linux/lib/lz4/lz4hc_compress.c:402

koverstreet commented 6 months ago

This looks like an LZ4HC bug: something's off with their output buffer length checking. I'll have to forward it to them.
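
For readers unfamiliar with why a copy helper would care about buffer length checks: the trace ends in `LZ4_copy8`, inlined via `LZ4_wildCopy`, and copy routines of that style move fixed 8-byte chunks, so they can touch bytes past the requested end and rely on the caller leaving slack. The following is a toy model of that bounds arithmetic only. It is not the kernel's code and not a diagnosis of the actual bug, and all names in it are hypothetical.

```python
CHUNK = 8  # bytes moved per step by an 8-byte-at-a-time copy loop

def wild_copy_span(length: int) -> int:
    """Bytes touched by a do/while loop that copies CHUNK bytes per iteration."""
    chunks = max(1, -(-length // CHUNK))  # ceiling division; the body runs at least once
    return chunks * CHUNK

def fits(buf_size: int, pos: int, length: int) -> bool:
    """True if a chunked copy of `length` bytes starting at `pos` stays inside the buffer."""
    return pos + wild_copy_span(length) <= buf_size

print(wild_copy_span(3))   # a 3-byte request still touches a full chunk
print(fits(64, 60, 3))     # 60 + 3 <= 64, but the chunked copy runs past the end
print(fits(64, 56, 3))     # leaving 8 bytes of room keeps it in bounds
```

The point of the sketch is that a check of the form `pos + length <= buf_size` is not sufficient for this kind of copy; the check has to account for the rounded-up span.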

Wingar commented 6 months ago

So, for the time being, it sounds best to just disable lz4 or use a different compression algorithm?

koverstreet commented 6 months ago

Just use it with the default compression level.