koverstreet / bcachefs

Other
662 stars 70 forks source link

Crash in LZ4_compress_destSize_generic during writeback #710

Open g2p opened 1 month ago

g2p commented 1 month ago

The kernel is the same as for #708, d11e1164d85f6d6e08942edc0f5d14f9e9c6a8a0 rebased over linus/master (e091caf99f3a5006c95baec24330bac6f7f17193).

lz4 is used for foreground compression.

<1>[29330.458921] BUG: unable to handle page fault for address: ffffbc39c22e8000
<1>[29330.458959] #PF: supervisor read access in kernel mode
<1>[29330.458978] #PF: error_code(0x0000) - not-present page
<6>[29330.458996] PGD 100000067 P4D 100000067 PUD 10029c067 PMD 106dc7067 PTE 0
<4>[29330.459026] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[29330.459046] CPU: 4 PID: 37872 Comm: kworker/u50:9 Tainted: G        W   E      6.10.0-rc7-g2p #26
<4>[29330.459074] Hardware name: To Be Filled By O.E.M. X570 Phantom Gaming 4/X570 Phantom Gaming 4, BIOS P5.61 02/22/2024
<4>[29330.459103] Workqueue: writeback wb_workfn (flush-bcachefs-2)
<4>[29330.459130] RIP: 0010:LZ4_compress_destSize_generic+0x360/0x630
<4>[29330.459154] Code: 60 02 00 00 41 c6 00 00 48 89 c8 4c 8d 71 04 48 83 c6 02 48 29 d0 48 83 c2 04 66 89 46 fe 4c 89 f0 4d 39 ce 0f 83 28 01 00 00 <48> 8b 3a 4c 8b 38 4c 39 ff 0f 84 08 01 00 00 4c 31 ff f3 48 0f bc
<4>[29330.459200] RSP: 0018:ffffbc39c3553598 EFLAGS: 00010297
<4>[29330.459219] RAX: ffffbc39c22e7ff3 RBX: ffffbc39c22e0000 RCX: ffffbc39c22e6f4f
<4>[29330.459241] RDX: ffffbc39c22e7ffa RSI: ffffbc39c1e57fc2 RDI: 0000000000000000
<4>[29330.459263] RBP: ffffbc39c3553600 R08: ffffbc39c1e57fad R09: ffffbc39c22e7ff4
<4>[29330.459284] R10: ffff9ca959d80000 R11: ffffbc39c1e58ff4 R12: 0000000000000002
<4>[29330.459305] R13: ffffbc39c1e58ffa R14: ffffbc39c22e6f53 R15: 0000000000000000
<4>[29330.459327] FS:  0000000000000000(0000) GS:ffff9cad6ee00000(0000) knlGS:0000000000000000
<4>[29330.459351] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[29330.459371] CR2: ffffbc39c22e8000 CR3: 00000001092dc000 CR4: 0000000000350ef0
<4>[29330.459392] Call Trace:
<4>[29330.459405]  <TASK>
<4>[29330.459417]  ? show_regs+0x70/0x90
<4>[29330.459435]  ? __die+0x28/0x80
<4>[29330.459452]  ? page_fault_oops+0x290/0x5c0
<4>[29330.459472]  ? srso_return_thunk+0x5/0x5f
<4>[29330.459491]  ? LZ4_compress_destSize_generic+0x360/0x630
<4>[29330.459515]  ? kernelmode_fixup_or_oops.isra.0+0x69/0x90
<4>[29330.459537]  ? __bad_area_nosemaphore+0x191/0x2c0
<4>[29330.459560]  ? bad_area_nosemaphore+0x16/0x30
<4>[29330.459579]  ? do_kern_addr_fault+0x77/0xa0
<4>[29330.459598]  ? exc_page_fault+0x192/0x1a0
<4>[29330.459616]  ? asm_exc_page_fault+0x2b/0x30
<4>[29330.459641]  ? LZ4_compress_destSize_generic+0x360/0x630
<4>[29330.459667]  LZ4_compress_destSize+0x82/0xe0
<4>[29330.459687]  attempt_compress+0x1e9/0x290
<4>[29330.459706]  ? srso_return_thunk+0x5/0x5f
<4>[29330.459723]  ? kvmalloc_node_noprof+0x61/0x100
<4>[29330.459744]  ? srso_return_thunk+0x5/0x5f
<4>[29330.459761]  ? mempool_kvmalloc+0x1e/0x30
<4>[29330.459778]  ? srso_return_thunk+0x5/0x5f
<4>[29330.459796]  ? mempool_alloc_noprof+0x87/0x1c0
<4>[29330.459818]  bch2_bio_compress+0x262/0x620
<4>[29330.459844]  __bch2_write+0x1362/0x1660
<4>[29330.459877]  bch2_write+0x17f/0x4d0
<4>[29330.459892]  ? srso_return_thunk+0x5/0x5f
<4>[29330.459909]  ? bch2_write+0x17f/0x4d0
<4>[29330.459924]  ? bch2_writepage_io_done+0x510/0x510
<4>[29330.460443]  ? srso_return_thunk+0x5/0x5f
<4>[29330.460943]  ? write_cache_pages+0x50/0xc0
<4>[29330.461442]  bch2_writepage_do_io+0x3f/0x50
<4>[29330.461932]  bch2_writepages+0x83/0xd0
<4>[29330.462417]  do_writepages+0x82/0x270
<4>[29330.462894]  ? __enqueue_entity+0x109/0x150
<4>[29330.463370]  ? enqueue_entity+0xd9/0x550
<4>[29330.463842]  ? srso_return_thunk+0x5/0x5f
<4>[29330.464309]  ? pick_eevdf+0x177/0x1b0
<4>[29330.464768]  __writeback_single_inode+0x44/0x360
<4>[29330.465217]  ? srso_return_thunk+0x5/0x5f
<4>[29330.465666]  writeback_sb_inodes+0x254/0x560
<4>[29330.466115]  __writeback_inodes_wb+0x54/0x100
<4>[29330.466520]  ? queue_io+0x113/0x120
<4>[29330.466922]  wb_writeback+0x1a0/0x320
<4>[29330.467318]  ? srso_return_thunk+0x5/0x5f
<4>[29330.467717]  wb_workfn+0x357/0x420
<4>[29330.468114]  process_one_work+0x17b/0x3d0
<4>[29330.468506]  worker_thread+0x2f6/0x430
<4>[29330.468895]  ? rescuer_thread+0x410/0x410
<4>[29330.469277]  kthread+0xe8/0x120
<4>[29330.469652]  ? kthread_park+0xb0/0xb0
<4>[29330.470023]  ret_from_fork+0x4b/0x70
<4>[29330.470388]  ? kthread_park+0xb0/0xb0
<4>[29330.470755]  ret_from_fork_asm+0x11/0x20
<4>[29330.471133]  </TASK>
<4>[29330.477264] CR2: ffffbc39c22e8000
<4>[29330.477783] ---[ end trace 0000000000000000 ]---
<4>[29330.675643] RIP: 0010:LZ4_compress_destSize_generic+0x360/0x630
<4>[29330.675952] Code: 60 02 00 00 41 c6 00 00 48 89 c8 4c 8d 71 04 48 83 c6 02 48 29 d0 48 83 c2 04 66 89 46 fe 4c 89 f0 4d 39 ce 0f 83 28 01 00 00 <48> 8b 3a 4c 8b 38 4c 39 ff 0f 84 08 01 00 00 4c 31 ff f3 48 0f bc
<4>[29330.676600] RSP: 0018:ffffbc39c3553598 EFLAGS: 00010297
<4>[29330.676928] RAX: ffffbc39c22e7ff3 RBX: ffffbc39c22e0000 RCX: ffffbc39c22e6f4f
<4>[29330.677259] RDX: ffffbc39c22e7ffa RSI: ffffbc39c1e57fc2 RDI: 0000000000000000
<4>[29330.677594] RBP: ffffbc39c3553600 R08: ffffbc39c1e57fad R09: ffffbc39c22e7ff4
<4>[29330.677925] R10: ffff9ca959d80000 R11: ffffbc39c1e58ff4 R12: 0000000000000002
<4>[29330.678257] R13: ffffbc39c1e58ffa R14: ffffbc39c22e6f53 R15: 0000000000000000
<4>[29330.678591] FS:  0000000000000000(0000) GS:ffff9cad6ee00000(0000) knlGS:0000000000000000
<4>[29330.678930] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[29330.679268] CR2: ffffbc39c22e8000 CR3: 00000001092dc000 CR4: 0000000000350ef0
<0>[29330.679620] Kernel panic - not syncing: Fatal exception
<0>[29330.680960] Kernel Offset: 0x35a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
<0>[29330.872869] Rebooting in 45 seconds..

With scripts/decode_stacktrace.sh:

<4>[29330.459417] ? show_regs (arch/x86/kernel/dumpstack.c:479) 
<4>[29330.459435] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) 
<4>[29330.459452] ? page_fault_oops (arch/x86/mm/fault.c:715) 
<4>[29330.459472] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.459491] ? LZ4_compress_destSize_generic (lib/lz4/lz4defs.h:119 lib/lz4/lz4defs.h:203 lib/lz4/lz4_compress.c:641) 
<4>[29330.459515] ? kernelmode_fixup_or_oops.isra.0 (arch/x86/mm/fault.c:739) 
<4>[29330.459537] ? __bad_area_nosemaphore (arch/x86/mm/fault.c:828) 
<4>[29330.459560] ? bad_area_nosemaphore (arch/x86/mm/fault.c:835) 
<4>[29330.459579] ? do_kern_addr_fault (arch/x86/mm/fault.c:1199) 
<4>[29330.459598] ? exc_page_fault (arch/x86/mm/fault.c:1479 arch/x86/mm/fault.c:1539) 
<4>[29330.459616] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[29330.459641] ? LZ4_compress_destSize_generic (lib/lz4/lz4defs.h:119 lib/lz4/lz4defs.h:203 lib/lz4/lz4_compress.c:641) 
<4>[29330.459667] LZ4_compress_destSize (lib/lz4/lz4_compress.c:764) 
<4>[29330.459687] attempt_compress (fs/bcachefs/compress.c:312) 
<4>[29330.459706] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.459723] ? kvmalloc_node_noprof (mm/util.c:640) 
<4>[29330.459744] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.459761] ? mempool_kvmalloc (mm/mempool.c:591) 
<4>[29330.459778] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.459796] ? mempool_alloc_noprof (mm/mempool.c:402) 
<4>[29330.459818] bch2_bio_compress (fs/bcachefs/compress.c:426 fs/bcachefs/compress.c:498) 
<4>[29330.459844] __bch2_write (fs/bcachefs/io_write.c:965 (discriminator 1) fs/bcachefs/io_write.c:1463 (discriminator 1)) 
<4>[29330.459877] bch2_write (fs/bcachefs/io_write.c:1644) 
<4>[29330.459892] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.459909] ? bch2_write (fs/bcachefs/io_write.c:1644) 
<4>[29330.459924] ? bch2_writepage_io_done (fs/bcachefs/fs-io-buffered.c:500) 
<4>[29330.460443] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.460943] ? write_cache_pages (mm/page-writeback.c:2612) 
<4>[29330.461442] bch2_writepage_do_io (fs/bcachefs/fs-io-buffered.c:461) 
<4>[29330.461932] bch2_writepages (fs/bcachefs/fs-io-buffered.c:653) 
<4>[29330.462417] do_writepages (mm/page-writeback.c:2663) 
<4>[29330.462894] ? __enqueue_entity (kernel/sched/fair.c:829) 
<4>[29330.463370] ? enqueue_entity (kernel/sched/fair.c:5331) 
<4>[29330.463842] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.464309] ? pick_eevdf (kernel/sched/fair.c:881) 
<4>[29330.464768] __writeback_single_inode (fs/fs-writeback.c:1660) 
<4>[29330.465217] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.465666] writeback_sb_inodes (fs/fs-writeback.c:1949) 
<4>[29330.466115] __writeback_inodes_wb (fs/fs-writeback.c:2019) 
<4>[29330.466520] ? queue_io (fs/fs-writeback.c:1487) 
<4>[29330.466922] wb_writeback (fs/fs-writeback.c:2129) 
<4>[29330.467318] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[29330.467717] wb_workfn (fs/fs-writeback.c:2286 (discriminator 1) fs/fs-writeback.c:2314 (discriminator 1)) 
<4>[29330.468114] process_one_work (kernel/workqueue.c:3253) 
<4>[29330.468506] worker_thread (kernel/workqueue.c:3323 (discriminator 2) kernel/workqueue.c:3409 (discriminator 2)) 
<4>[29330.468895] ? rescuer_thread (kernel/workqueue.c:3356) 
<4>[29330.469277] kthread (kernel/kthread.c:389) 
<4>[29330.469652] ? kthread_park (kernel/kthread.c:342) 
<4>[29330.470023] ret_from_fork (arch/x86/kernel/process.c:153) 
<4>[29330.470388] ? kthread_park (kernel/kthread.c:342) 
<4>[29330.470755] ret_from_fork_asm (arch/x86/entry/entry_64.S:257) 
g2p commented 1 month ago

I looked at #658. I did not customize the compression level. Also, it seems like the crash might be an error reading from the source buffer?

Confirmed by looking at objdump -rSl lib/lz4/lz4_compress.o:

/home/g2p/src/evilpiepirate.org/git/bcachefs/lib/lz4/lz4_compress.c:641
                        size_t matchLength = LZ4_count(ip + MINMATCH,
    1c9c:       48 83 c2 04             add    $0x4,%rdx
/home/g2p/src/evilpiepirate.org/git/bcachefs/lib/lz4/lz4_compress.c:637
                LZ4_writeLE16(op, (U16)(ip - match)); op += 2;
    1ca0:       66 89 46 fe             mov    %ax,-0x2(%rsi)
LZ4_count():
/home/g2p/src/evilpiepirate.org/git/bcachefs/lib/lz4/lz4defs.h:202 (discriminator 1)
        while (likely(pIn < pInLimit - (STEPSIZE - 1))) {
    1ca4:       4c 89 f0                mov    %r14,%rax
    1ca7:       4d 39 ce                cmp    %r9,%r14
    1caa:       0f 83 28 01 00 00       jae    1dd8 <LZ4_compress_destSize_generic+0x488>
LZ4_read_ARCH():
/home/g2p/src/evilpiepirate.org/git/bcachefs/lib/lz4/lz4defs.h:119
        return get_unaligned((const size_t *)ptr);
    1cb0:       48 8b 3a                mov    (%rdx),%rdi
    1cb3:       4c 8b 38                mov    (%rax),%r15

Where the bad address is in %rdx according to decodecode:

Code: 60 02 00 00 41 c6 00 00 48 89 c8 4c 8d 71 04 48 83 c6 02 48 29 d0 48 83 c2 04 66 89 46 fe 4c 89 f0 4d 39 ce 0f 83 28 01 00 00 <48> 8b12:21:46 [1/873]
39 ff 0f 84 08 01 00 00 4c 31 ff f3 48 0f bc          
All code      
========
   0:   60                      (bad)
   1:   02 00                   add    (%rax),%al
   3:   00 41 c6                add    %al,-0x3a(%rcx)
   6:   00 00                   add    %al,(%rax)
   8:   48 89 c8                mov    %rcx,%rax
   b:   4c 8d 71 04             lea    0x4(%rcx),%r14
   f:   48 83 c6 02             add    $0x2,%rsi
  13:   48 29 d0                sub    %rdx,%rax
  16:   48 83 c2 04             add    $0x4,%rdx
  1a:   66 89 46 fe             mov    %ax,-0x2(%rsi)
  1e:   4c 89 f0                mov    %r14,%rax
  21:   4d 39 ce                cmp    %r9,%r14
  24:   0f 83 28 01 00 00       jae    0x152
  2a:*  48 8b 3a                mov    (%rdx),%rdi              <-- trapping instruction
  2d:   4c 8b 38                mov    (%rax),%r15
  30:   4c 39 ff                cmp    %r15,%rdi
  33:   0f 84 08 01 00 00       je     0x141
  39:   4c 31 ff                xor    %r15,%rdi
  3c:   f3                      repz
  3d:   48                      rex.W
  3e:   0f                      .byte 0xf
  3f:   bc                      .byte 0xbc
g2p commented 1 month ago

From a quick grep bcachefs is the only user of LZ4_compress_destSize. Quite possibly the bug is in the in-kernel LZ4 library; it was updated last to upstream LZ4 1.8.3 in 2018 which has known bugs (documented upstream).

koverstreet commented 1 month ago

We can switch away from LZ4_compress_destsize as a workaround if we really need to