koverstreet / bcachefs

Kernel panic in shrinker_to_text -> strlen #731

Closed: g2p closed this issue 3 weeks ago

g2p commented 2 months ago

Could shrinker_to_text end up being called with a NULL shrinker? I don't see exactly how, though.

CONFIG_SHRINKER_DEBUG was off.

This is from commit 1d875e4e9cf2ba1542c7279a97f64e5971804d1c (bcachefs-testing). The relevant part of the call trace comes first, followed by the full log.

<4>[ 3422.835918] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.835934] ? strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.835944] ? seq_buf_puts (lib/seq_buf.c:186) 
<4>[ 3422.835954] shrinker_to_text (mm/shrinker.c:829) 
<4>[ 3422.835967] shrinkers_to_text (mm/shrinker.c:897 (discriminator 1)) 
<4>[ 3422.835976] ? prb_read_valid (kernel/printk/printk_ringbuffer.c:2183) 
<4>[ 3422.835985] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.835995] ? console_unlock (kernel/printk/printk.c:3137 (discriminator 1)) 
<4>[ 3422.836018] __show_mem (./include/linux/seq_buf.h:100 mm/show_mem.c:490) 
<4>[ 3422.836030] dump_header (mm/oom_kill.c:445 (discriminator 1)) 
<4>[ 3422.836040] oom_kill_process (mm/oom_kill.c:424 mm/oom_kill.c:1013) 
<4>[ 3422.836049] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836061] out_of_memory (mm/oom_kill.c:1152) 
<4>[ 3422.833539] gcc invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=0
<4>[ 3422.833570] CPU: 9 UID: 1000 PID: 81403 Comm: gcc Tainted: G        W   E      6.11.0-rc4-g2p #57
<4>[ 3422.833589] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
<4>[ 3422.833600] Hardware name: To Be Filled By O.E.M. X570 Phantom Gaming 4/X570 Phantom Gaming 4, BIOS P5.61 02/22/2024
<4>[ 3422.833619] Call Trace:
<4>[ 3422.833627]  <TASK>
<4>[ 3422.833637] dump_stack_lvl (lib/dump_stack.c:122) 
<4>[ 3422.833652] dump_stack (lib/dump_stack.c:129) 
<4>[ 3422.833662] dump_header (mm/oom_kill.c:74 mm/oom_kill.c:442) 
<4>[ 3422.833676] oom_kill_process (mm/oom_kill.c:424 mm/oom_kill.c:1013) 
<4>[ 3422.833687] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.833702] out_of_memory (mm/oom_kill.c:1152) 
<4>[ 3422.833717] __alloc_pages_noprof (mm/page_alloc.c:3609 mm/page_alloc.c:4371 mm/page_alloc.c:4708) 
<4>[ 3422.833734] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.833760] alloc_pages_mpol_noprof (mm/mempolicy.c:2265 (discriminator 1)) 
<4>[ 3422.833773] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.833787] alloc_pages_noprof (mm/mempolicy.c:2345) 
<4>[ 3422.833800] pte_alloc_one (./include/asm-generic/pgalloc.h:71 arch/x86/mm/pgtable.c:33) 
<4>[ 3422.833812] __do_fault (mm/memory.c:4650 (discriminator 1)) 
<4>[ 3422.833825] do_fault (mm/memory.c:5091 mm/memory.c:5193) 
<4>[ 3422.833836] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.833850] __handle_mm_fault (mm/memory.c:3947 mm/memory.c:5521 mm/memory.c:5664) 
<4>[ 3422.833859] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.833869] ? lock_release (./include/trace/events/lock.h:69 (discriminator 2) kernel/locking/lockdep.c:5770 (discriminator 2)) 
<4>[ 3422.833890] handle_mm_fault (mm/memory.c:5832) 
<4>[ 3422.833903] do_user_addr_fault (arch/x86/mm/fault.c:1389) 
<4>[ 3422.833919] exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) 
<4>[ 3422.833933] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.833945] RIP: 0010:rep_stos_alternative (arch/x86/lib/clear_page_64.S:96) 
<4>[ 3422.833958] Code: ff c7 48 ff c9 75 f6 e9 ce fd 0c 00 48 89 07 48 83 c7 08 83 e9 08 74 ef 83 f9 08 73 ef eb de 66 66 2e 0f 1f 84 00 00 00 00 00 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
All code
========
   0:   ff c7                   inc    %edi
   2:   48 ff c9                dec    %rcx
   5:   75 f6                   jne    0xfffffffffffffffd
   7:   e9 ce fd 0c 00          jmp    0xcfdda
   c:   48 89 07                mov    %rax,(%rdi)
   f:   48 83 c7 08             add    $0x8,%rdi
  13:   83 e9 08                sub    $0x8,%ecx
  16:   74 ef                   je     0x7
  18:   83 f9 08                cmp    $0x8,%ecx
  1b:   73 ef                   jae    0xc
  1d:   eb de                   jmp    0xfffffffffffffffd
  1f:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
  26:   00 00 00 00 
  2a:*  48 89 07                mov    %rax,(%rdi)      <-- trapping instruction
  2d:   48 89 47 08             mov    %rax,0x8(%rdi)
  31:   48 89 47 10             mov    %rax,0x10(%rdi)
  35:   48 89 47 18             mov    %rax,0x18(%rdi)
  39:   48 89 47 20             mov    %rax,0x20(%rdi)
  3d:   48                      rex.W
  3e:   89                      .byte 0x89
  3f:   47                      rex.RXB

Code starting with the faulting instruction
===========================================
   0:   48 89 07                mov    %rax,(%rdi)
   3:   48 89 47 08             mov    %rax,0x8(%rdi)
   7:   48 89 47 10             mov    %rax,0x10(%rdi)
   b:   48 89 47 18             mov    %rax,0x18(%rdi)
   f:   48 89 47 20             mov    %rax,0x20(%rdi)
  13:   48                      rex.W
  14:   89                      .byte 0x89
  15:   47                      rex.RXB
<4>[ 3422.833991] RSP: 0018:ffffb4c842f47d00 EFLAGS: 00050202
<4>[ 3422.834004] RAX: 0000000000000000 RBX: 00007f33d0ab8104 RCX: 0000000000000efc
<4>[ 3422.834019] RDX: 00007f33d0ab5360 RSI: 00000000000000a5 RDI: 00007f33d0ab8104
<4>[ 3422.834033] RBP: ffffb4c842f47d40 R08: 00007f33d0ab5000 R09: 0000000000000000
<4>[ 3422.834047] R10: 0000000000000000 R11: 0000000000000000 R12: ffff95d3d87764a8
<4>[ 3422.834061] R13: 0000000000000003 R14: 00007f33d0ab82d8 R15: 0000000000000104
<4>[ 3422.834085] ? elf_load (./arch/x86/include/asm/smap.h:33 ./arch/x86/include/asm/uaccess_64.h:181 ./arch/x86/include/asm/uaccess_64.h:189 fs/binfmt_elf.c:125 fs/binfmt_elf.c:421) 
<4>[ 3422.834100] load_elf_binary (fs/binfmt_elf.c:679 (discriminator 2) fs/binfmt_elf.c:1235 (discriminator 2)) 
<4>[ 3422.834123] bprm_execve (fs/exec.c:1829 fs/exec.c:1869 fs/exec.c:1920 fs/exec.c:1896) 
<4>[ 3422.834140] do_execveat_common.isra.0 (fs/exec.c:2027) 
<4>[ 3422.834156] __x64_sys_execve (fs/exec.c:2172) 
<4>[ 3422.834168] x64_sys_call (arch/x86/entry/syscall_64.c:36) 
<4>[ 3422.834181] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) 
<4>[ 3422.834191] ? exc_page_fault (arch/x86/mm/fault.c:1543) 
<4>[ 3422.834205] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
<4>[ 3422.834217] RIP: 0033:0x7fb0476eef3b
<4>[ 3422.834231] Code: Unable to access opcode bytes at 0x7fb0476eef11.

Code starting with the faulting instruction
===========================================
<4>[ 3422.834243] RSP: 002b:00007fff1d690918 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
<4>[ 3422.834260] RAX: ffffffffffffffda RBX: 00005592d2d6b550 RCX: 00007fb0476eef3b
<4>[ 3422.834274] RDX: 00005592d2d6ae18 RSI: 00005592d2d6aa30 RDI: 00005592d2d6b550
<4>[ 3422.834288] RBP: 00007fff1d6909f0 R08: 00000000000007f0 R09: 00000000000003f0
<4>[ 3422.834302] R10: 00007fb047803ac0 R11: 0000000000000246 R12: 00005592d2d6aa30
<4>[ 3422.834316] R13: 00005592a5e90004 R14: 0000000000000002 R15: 00005592d2d69120
<4>[ 3422.834341]  </TASK>
<4>[ 3422.834348] Mem-Info:
<4>[ 3422.834356] active_anon:595 inactive_anon:1731 isolated_anon:0
<4>[ 3422.834356]  active_file:1302 inactive_file:1550 isolated_file:0
<4>[ 3422.834356]  unevictable:52 dirty:0 writeback:0
<4>[ 3422.834356]  slab_reclaimable:16140 slab_unreclaimable:3065596
<4>[ 3422.834356]  mapped:1388 shmem:150 pagetables:4070
<4>[ 3422.834356]  sec_pagetables:0 bounce:0
<4>[ 3422.834356]  kernel_misc_reclaimable:0
<4>[ 3422.834356]  free:61244 free_pcp:252 free_cma:0
<4>[ 3422.834424] Node 0 active_anon:2380kB inactive_anon:6924kB active_file:5208kB inactive_file:6200kB unevictable:208kB isolated(anon):0kB isolated(file):0kB mapped:5552kB dirty:0kB writeback:0kB shmem:600kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:10352kB pagetables:16280kB sec_pagetables:0kB all_unreclaimable? no
<4>[ 3422.834478] Node 0 DMA free:9612kB boost:0kB min:60kB low:72kB high:84kB reserved_highatomic:0KB active_anon:100kB inactive_anon:24kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15368kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
<4>[ 3422.834527] lowmem_reserve[]: 0 2904 15838 0 0
<4>[ 3422.834548] Node 0 DMA32 free:78296kB boost:0kB min:12380kB low:15472kB high:18564kB reserved_highatomic:30720KB active_anon:88kB inactive_anon:60kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3064752kB managed:2998772kB mlocked:0kB bounce:0kB free_pcp:984kB local_pcp:0kB free_cma:0kB
<4>[ 3422.834599] lowmem_reserve[]: 0 0 12933 0 0
<4>[ 3422.834618] Node 0 Normal free:157068kB boost:73728kB min:128864kB low:142648kB high:156432kB reserved_highatomic:49152KB active_anon:2364kB inactive_anon:6880kB active_file:5232kB inactive_file:6896kB unevictable:208kB writepending:0kB present:13618688kB managed:13251192kB mlocked:68kB bounce:0kB free_pcp:92kB local_pcp:0kB free_cma:0kB
<4>[ 3422.834672] lowmem_reserve[]: 0 0 0 0 0
<4>[ 3422.834691] Node 0 DMA: 16*4kB (M) 10*8kB (UM) 8*16kB (M) 10*32kB (M) 5*64kB (M) 6*128kB (M) 5*256kB (UM) 1*512kB (M) 2*1024kB (UM) 2*2048kB (M) 0*4096kB = 9616kB
<4>[ 3422.834758] Node 0 DMA32: 75*4kB (U) 87*8kB (UM) 84*16kB (UM) 487*32kB (UM) 401*64kB (UM) 272*128kB (UM) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 78660kB
<4>[ 3422.834820] Node 0 Normal: 3039*4kB (UME) 2320*8kB (UME) 1449*16kB (UME) 1152*32kB (UME) 574*64kB (UME) 232*128kB (UME) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 157196kB
<4>[ 3422.834881] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
<4>[ 3422.834899] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
<4>[ 3422.834916] 4205 total pagecache pages
<4>[ 3422.834925] 863 pages in swap cache
<4>[ 3422.834934] Free swap  = 14962428kB
<4>[ 3422.834943] Total swap = 16777212kB
<4>[ 3422.834952] 4174859 pages RAM
<4>[ 3422.834960] 0 pages HighMem/MovableOnly
<4>[ 3422.834969] 108526 pages reserved
<4>[ 3422.834977] 0 pages hwpoisoned
<4>[ 3422.834986] Unreclaimable slab info:
<5>[ 3422.835469] kmalloc-rnd-06-8k total: 456 MiB active: 456 MiB
<5>[ 3422.835471] kmalloc-rnd-11-1k total: 82.1 MiB active: 82.1 MiB
<5>[ 3422.835472] kernfs_node_cache total: 11.6 MiB active: 11.6 MiB
<5>[ 3422.835474] page->ptl         total: 7.29 MiB active: 2.68 MiB
<5>[ 3422.835475] kmalloc-rnd-10-512 total: 6.84 MiB active: 6.84 MiB
<5>[ 3422.835476] task_struct       total: 6.84 MiB active: 6.84 MiB
<5>[ 3422.835478] shmem_inode_cache total: 6.29 MiB active: 6.29 MiB
<5>[ 3422.835479] vm_area_struct    total: 3.10 MiB active: 2.94 MiB
<5>[ 3422.835480] vma_lock          total: 2.86 MiB active: 2.71 MiB
<5>[ 3422.835482] filp              total: 2.63 MiB active: 2.46 MiB
<5>[ 3422.835483]
<4>[ 3422.835571] Shrinkers:
<1>[ 3422.835608] BUG: kernel NULL pointer dereference, address: 0000000000000000
<1>[ 3422.835619] #PF: supervisor read access in kernel mode
<1>[ 3422.835629] #PF: error_code(0x0000) - not-present page
<6>[ 3422.835638] PGD 12c909067 P4D 12c909067 PUD 12c90a067 PMD 0
<4>[ 3422.835654] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 3422.835664] CPU: 9 UID: 1000 PID: 81403 Comm: gcc Tainted: G        W   E      6.11.0-rc4-g2p #57
<4>[ 3422.835680] Tainted: [W]=WARN, [E]=UNSIGNED_MODULE
<4>[ 3422.835689] Hardware name: To Be Filled By O.E.M. X570 Phantom Gaming 4/X570 Phantom Gaming 4, BIOS P5.61 02/22/2024
<4>[ 3422.835705] RIP: 0010:strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.835714] Code: f7 75 ec 31 c0 31 d2 31 f6 31 ff e9 56 e2 0d 00 48 89 f8 31 d2 31 f6 31 ff e9 48 e2 0d 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <80> 3f 00 74 16 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31 ff
All code
========
   0:   f7 75 ec                divl   -0x14(%rbp)
   3:   31 c0                   xor    %eax,%eax
   5:   31 d2                   xor    %edx,%edx
   7:   31 f6                   xor    %esi,%esi
   9:   31 ff                   xor    %edi,%edi
   b:   e9 56 e2 0d 00          jmp    0xde266
  10:   48 89 f8                mov    %rdi,%rax
  13:   31 d2                   xor    %edx,%edx
  15:   31 f6                   xor    %esi,%esi
  17:   31 ff                   xor    %edi,%edi
  19:   e9 48 e2 0d 00          jmp    0xde266
  1e:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  25:   00 
  26:   f3 0f 1e fa             endbr64
  2a:*  80 3f 00                cmpb   $0x0,(%rdi)      <-- trapping instruction
  2d:   74 16                   je     0x45
  2f:   48 89 f8                mov    %rdi,%rax
  32:   48 83 c0 01             add    $0x1,%rax
  36:   80 38 00                cmpb   $0x0,(%rax)
  39:   75 f7                   jne    0x32
  3b:   48 29 f8                sub    %rdi,%rax
  3e:   31 ff                   xor    %edi,%edi

Code starting with the faulting instruction
===========================================
   0:   80 3f 00                cmpb   $0x0,(%rdi)
   3:   74 16                   je     0x1b
   5:   48 89 f8                mov    %rdi,%rax
   8:   48 83 c0 01             add    $0x1,%rax
   c:   80 38 00                cmpb   $0x0,(%rax)
   f:   75 f7                   jne    0x8
  11:   48 29 f8                sub    %rdi,%rax
  14:   31 ff                   xor    %edi,%edi
<4>[ 3422.835741] RSP: 0018:ffffb4c842f475a8 EFLAGS: 00010246
<4>[ 3422.835751] RAX: 0000000000000000 RBX: ffffb4c842f47770 RCX: fffffffffffffffe
<4>[ 3422.835763] RDX: ffffb4c842f47680 RSI: 0000000000000000 RDI: 0000000000000000
<4>[ 3422.835775] RBP: ffffb4c842f475c8 R08: 0000000000000000 R09: 0000000000000000
<4>[ 3422.835786] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb4c842f47770
<4>[ 3422.835798] R13: 0000000000000000 R14: 20c49ba5e353f7cf R15: 0000000000000008
<4>[ 3422.835809] FS:  0000000000000000(0000) GS:ffff95d6ee600000(0000) knlGS:0000000000000000
<4>[ 3422.835823] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 3422.835833] CR2: 0000000000000000 CR3: 000000021ba66000 CR4: 0000000000350ef0
<4>[ 3422.835845] Call Trace:
<4>[ 3422.835851]  <TASK>
<4>[ 3422.835857] ? show_regs (arch/x86/kernel/dumpstack.c:479) 
<4>[ 3422.835867] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) 
<4>[ 3422.835876] ? page_fault_oops (arch/x86/mm/fault.c:715) 
<4>[ 3422.835893] ? do_user_addr_fault (arch/x86/mm/fault.c:1236) 
<4>[ 3422.835906] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) 
<4>[ 3422.835918] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.835934] ? strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.835944] ? seq_buf_puts (lib/seq_buf.c:186) 
<4>[ 3422.835954] shrinker_to_text (mm/shrinker.c:829) 
<4>[ 3422.835967] shrinkers_to_text (mm/shrinker.c:897 (discriminator 1)) 
<4>[ 3422.835976] ? prb_read_valid (kernel/printk/printk_ringbuffer.c:2183) 
<4>[ 3422.835985] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.835995] ? console_unlock (kernel/printk/printk.c:3137 (discriminator 1)) 
<4>[ 3422.836018] __show_mem (./include/linux/seq_buf.h:100 mm/show_mem.c:490) 
<4>[ 3422.836030] dump_header (mm/oom_kill.c:445 (discriminator 1)) 
<4>[ 3422.836040] oom_kill_process (mm/oom_kill.c:424 mm/oom_kill.c:1013) 
<4>[ 3422.836049] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836061] out_of_memory (mm/oom_kill.c:1152) 
<4>[ 3422.836073] __alloc_pages_noprof (mm/page_alloc.c:3609 mm/page_alloc.c:4371 mm/page_alloc.c:4708) 
<4>[ 3422.836086] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.836105] alloc_pages_mpol_noprof (mm/mempolicy.c:2265 (discriminator 1)) 
<4>[ 3422.836116] ? lock_acquire (./include/trace/events/lock.h:24 (discriminator 2) kernel/locking/lockdep.c:5730 (discriminator 2)) 
<4>[ 3422.836127] alloc_pages_noprof (mm/mempolicy.c:2345) 
<4>[ 3422.836138] pte_alloc_one (./include/asm-generic/pgalloc.h:71 arch/x86/mm/pgtable.c:33) 
<4>[ 3422.836147] __do_fault (mm/memory.c:4650 (discriminator 1)) 
<4>[ 3422.836158] do_fault (mm/memory.c:5091 mm/memory.c:5193) 
<4>[ 3422.836166] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836178] __handle_mm_fault (mm/memory.c:3947 mm/memory.c:5521 mm/memory.c:5664) 
<4>[ 3422.836187] ? srso_return_thunk (arch/x86/lib/retpoline.S:224) 
<4>[ 3422.836196] ? lock_release (./include/trace/events/lock.h:69 (discriminator 2) kernel/locking/lockdep.c:5770 (discriminator 2)) 
<4>[ 3422.836216] handle_mm_fault (mm/memory.c:5832) 
<4>[ 3422.836227] do_user_addr_fault (arch/x86/mm/fault.c:1389) 
<4>[ 3422.836241] exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539) 
<4>[ 3422.836252] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623) 
<4>[ 3422.836261] RIP: 0010:rep_stos_alternative (arch/x86/lib/clear_page_64.S:96) 
<4>[ 3422.836271] Code: ff c7 48 ff c9 75 f6 e9 ce fd 0c 00 48 89 07 48 83 c7 08 83 e9 08 74 ef 83 f9 08 73 ef eb de 66 66 2e 0f 1f 84 00 00 00 00 00 <48> 89 07 48 89 47 08 48 89 47 10 48 89 47 18 48 89 47 20 48 89 47
All code
========
   0:   ff c7                   inc    %edi
   2:   48 ff c9                dec    %rcx
   5:   75 f6                   jne    0xfffffffffffffffd
   7:   e9 ce fd 0c 00          jmp    0xcfdda
   c:   48 89 07                mov    %rax,(%rdi)
   f:   48 83 c7 08             add    $0x8,%rdi
  13:   83 e9 08                sub    $0x8,%ecx
  16:   74 ef                   je     0x7
  18:   83 f9 08                cmp    $0x8,%ecx
  1b:   73 ef                   jae    0xc
  1d:   eb de                   jmp    0xfffffffffffffffd
  1f:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
  26:   00 00 00 00 
  2a:*  48 89 07                mov    %rax,(%rdi)      <-- trapping instruction
  2d:   48 89 47 08             mov    %rax,0x8(%rdi)
  31:   48 89 47 10             mov    %rax,0x10(%rdi)
  35:   48 89 47 18             mov    %rax,0x18(%rdi)
  39:   48 89 47 20             mov    %rax,0x20(%rdi)
  3d:   48                      rex.W
  3e:   89                      .byte 0x89
  3f:   47                      rex.RXB

Code starting with the faulting instruction
===========================================
   0:   48 89 07                mov    %rax,(%rdi)
   3:   48 89 47 08             mov    %rax,0x8(%rdi)
   7:   48 89 47 10             mov    %rax,0x10(%rdi)
   b:   48 89 47 18             mov    %rax,0x18(%rdi)
   f:   48 89 47 20             mov    %rax,0x20(%rdi)
  13:   48                      rex.W
  14:   89                      .byte 0x89
  15:   47                      rex.RXB
<4>[ 3422.836298] RSP: 0018:ffffb4c842f47d00 EFLAGS: 00050202
<4>[ 3422.836308] RAX: 0000000000000000 RBX: 00007f33d0ab8104 RCX: 0000000000000efc
<4>[ 3422.836320] RDX: 00007f33d0ab5360 RSI: 00000000000000a5 RDI: 00007f33d0ab8104
<4>[ 3422.836331] RBP: ffffb4c842f47d40 R08: 00007f33d0ab5000 R09: 0000000000000000
<4>[ 3422.836343] R10: 0000000000000000 R11: 0000000000000000 R12: ffff95d3d87764a8
<4>[ 3422.836354] R13: 0000000000000003 R14: 00007f33d0ab82d8 R15: 0000000000000104
<4>[ 3422.836374] ? elf_load (./arch/x86/include/asm/smap.h:33 ./arch/x86/include/asm/uaccess_64.h:181 ./arch/x86/include/asm/uaccess_64.h:189 fs/binfmt_elf.c:125 fs/binfmt_elf.c:421) 
<4>[ 3422.836386] load_elf_binary (fs/binfmt_elf.c:679 (discriminator 2) fs/binfmt_elf.c:1235 (discriminator 2)) 
<4>[ 3422.836405] bprm_execve (fs/exec.c:1829 fs/exec.c:1869 fs/exec.c:1920 fs/exec.c:1896) 
<4>[ 3422.836418] do_execveat_common.isra.0 (fs/exec.c:2027) 
<4>[ 3422.836431] __x64_sys_execve (fs/exec.c:2172) 
<4>[ 3422.836441] x64_sys_call (arch/x86/entry/syscall_64.c:36) 
<4>[ 3422.836451] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) 
<4>[ 3422.836459] ? exc_page_fault (arch/x86/mm/fault.c:1543) 
<4>[ 3422.836471] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
<4>[ 3422.836481] RIP: 0033:0x7fb0476eef3b
<4>[ 3422.836490] Code: Unable to access opcode bytes at 0x7fb0476eef11.

Code starting with the faulting instruction
===========================================
<4>[ 3422.836501] RSP: 002b:00007fff1d690918 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
<4>[ 3422.836515] RAX: ffffffffffffffda RBX: 00005592d2d6b550 RCX: 00007fb0476eef3b
<4>[ 3422.836526] RDX: 00005592d2d6ae18 RSI: 00005592d2d6aa30 RDI: 00005592d2d6b550
<4>[ 3422.836538] RBP: 00007fff1d6909f0 R08: 00000000000007f0 R09: 00000000000003f0
<4>[ 3422.836549] R10: 00007fb047803ac0 R11: 0000000000000246 R12: 00005592d2d6aa30
<4>[ 3422.836561] R13: 00005592a5e90004 R14: 0000000000000002 R15: 00005592d2d69120
<4>[ 3422.836581]  </TASK>
<4>[ 3422.836586] Modules linked in: cpuid(E) simpledrm(E) drm_shmem_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) drm_kms_helper(E) fb(E) input_leds(E) snd_seq_dummy(E) snd_hrtimer(E) xfs(E) dm_crypt(E) cmac(E) ccm(E) kyber_iosched(E) nls_utf8(E) wireguard(E) curve25519_x86_64(E) libcurve25519_generic(E) libchacha20poly1305(E) chacha_x86_64(E) poly1305_x86_64(E) ip6_udp_tunnel(E) udp_tunnel(E) ip6t_REJECT(E) nf_reject_ipv6(E) xt_hl(E) ip6t_rt(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_multiport(E) xt_recent(E) nft_limit(E) xt_limit(E) xt_addrtype(E) xt_tcpudp(E) xt_conntrack(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nft_compat(E) nf_tables(E) binfmt_misc(E) btrfs(E) blake2b_generic(E) nls_iso8859_1(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_scodec_component(E) intel_rapl_msr(E) snd_hda_intel(E) snd_intel_dspcfg(E) intel_rapl_common(E) snd_hda_codec(E) kvm_amd(E) snd_hwdep(E) snd_hda_core(E) kvm(E) snd_pcm(E) iwlmvm(E) snd_seq(E) snd_seq_device(E) rapl(E)
<4>[ 3422.836679]  wmi_bmof(E) snd_timer(E) mac80211(E) snd(E) soundcore(E) libarc4(E) i2c_piix4(E) k10temp(E) i2c_smbus(E) iwlwifi(E) cfg80211(E) wmi(E) mac_hid(E) auth_rpcgss(E) drm(E) sunrpc(E) drm_panel_orientation_quirks(E) efi_pstore(E) nfnetlink(E) dmi_sysfs(E) ip_tables(E) x_tables(E) bcache(E) crct10dif_pclmul(E) crc32_pclmul(E) polyval_clmulni(E) bridge(E) polyval_generic(E) ghash_clmulni_intel(E) pata_acpi(E) stp(E) llc(E) sha512_ssse3(E) nvme(E) sha256_ssse3(E) ahci(E) igb(E) sha1_ssse3(E) xhci_pci(E) i2c_algo_bit(E) xhci_pci_renesas(E) nvme_core(E) ccp(E) libahci(E) dca(E) pata_jmicron(E) nvme_auth(E) dm_mirror(E) dm_region_hash(E) dm_log(E) msr(E) autofs4(E) aesni_intel(E) crypto_simd(E) cryptd(E)
<4>[ 3422.836948] CR2: 0000000000000000
<4>[ 3422.836956] ---[ end trace 0000000000000000 ]---
<4>[ 3422.845269] workqueue: cache_lookup [bcache] hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
<4>[ 3422.990630] RIP: 0010:strlen (lib/string.c:402 (discriminator 1)) 
<4>[ 3422.990651] Code: f7 75 ec 31 c0 31 d2 31 f6 31 ff e9 56 e2 0d 00 48 89 f8 31 d2 31 f6 31 ff e9 48 e2 0d 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <80> 3f 00 74 16 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31 ff
All code
========
   0:   f7 75 ec                divl   -0x14(%rbp)
   3:   31 c0                   xor    %eax,%eax
   5:   31 d2                   xor    %edx,%edx
   7:   31 f6                   xor    %esi,%esi
   9:   31 ff                   xor    %edi,%edi
   b:   e9 56 e2 0d 00          jmp    0xde266
  10:   48 89 f8                mov    %rdi,%rax
  13:   31 d2                   xor    %edx,%edx
  15:   31 f6                   xor    %esi,%esi
  17:   31 ff                   xor    %edi,%edi
  19:   e9 48 e2 0d 00          jmp    0xde266
  1e:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  25:   00 
  26:   f3 0f 1e fa             endbr64
  2a:*  80 3f 00                cmpb   $0x0,(%rdi)      <-- trapping instruction
  2d:   74 16                   je     0x45
  2f:   48 89 f8                mov    %rdi,%rax
  32:   48 83 c0 01             add    $0x1,%rax
  36:   80 38 00                cmpb   $0x0,(%rax)
  39:   75 f7                   jne    0x32
  3b:   48 29 f8                sub    %rdi,%rax
  3e:   31 ff                   xor    %edi,%edi

Code starting with the faulting instruction
===========================================
   0:   80 3f 00                cmpb   $0x0,(%rdi)
   3:   74 16                   je     0x1b
   5:   48 89 f8                mov    %rdi,%rax
   8:   48 83 c0 01             add    $0x1,%rax
   c:   80 38 00                cmpb   $0x0,(%rax)
   f:   75 f7                   jne    0x8
  11:   48 29 f8                sub    %rdi,%rax
  14:   31 ff                   xor    %edi,%edi
<4>[ 3422.990678] RSP: 0018:ffffb4c842f475a8 EFLAGS: 00010246
<4>[ 3422.990690] RAX: 0000000000000000 RBX: ffffb4c842f47770 RCX: fffffffffffffffe
<4>[ 3422.990702] RDX: ffffb4c842f47680 RSI: 0000000000000000 RDI: 0000000000000000
<4>[ 3422.990714] RBP: ffffb4c842f475c8 R08: 0000000000000000 R09: 0000000000000000
<4>[ 3422.990725] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb4c842f47770
<4>[ 3422.990737] R13: 0000000000000000 R14: 20c49ba5e353f7cf R15: 0000000000000008
<4>[ 3422.990749] FS:  0000000000000000(0000) GS:ffff95d6ee600000(0000) knlGS:0000000000000000
<4>[ 3422.990763] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 3422.990773] CR2: 00007fb0476eef11 CR3: 000000021ba66000 CR4: 0000000000350ef0
<0>[ 3422.990785] Kernel panic - not syncing: Fatal exception
<0>[ 3422.991792] Kernel Offset: 0x16400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
<0>[ 3423.147299] Rebooting in 45 seconds..

jpsollie commented 2 months ago

I fail to see what the link to bcachefs is. The mm folks may be interested in this, since rc4 is a testing kernel. Try bugzilla.kernel.org.

g2p commented 2 months ago

This is on bcachefs-testing, which adds changes to the OOM report so that it displays the top 10 shrinkers. I suspect that, with some configs, some of those ten slots are empty, and that this causes the crash.

This does not happen on rc4 without those commits near the top of bcachefs-testing.
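
To illustrate the suspicion above, here is a minimal userspace sketch of a top-10 list whose unused slots stay NULL, together with a formatter that skips empty slots and tolerates a missing name. It is not the code from bcachefs-testing: `demo_shrinker`, `top_n_insert` and `top_n_to_text` are hypothetical names invented for this example, and whether the real crash came from an empty slot or from a NULL name string is an assumption, not something the log proves. What the log does show is `strlen` faulting with RDI and CR2 both zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOP_N 10

/* Hypothetical stand-in for a shrinker; NOT the kernel's struct shrinker. */
struct demo_shrinker {
	const char *name;     /* may be NULL (assumption: name not always set) */
	unsigned long count;  /* freeable objects reported by this shrinker */
};

/*
 * Insert s into an array kept sorted by descending count.
 * Slots that have never been filled remain NULL.
 */
static void top_n_insert(const struct demo_shrinker *top[TOP_N],
			 const struct demo_shrinker *s)
{
	for (size_t i = 0; i < TOP_N; i++) {
		if (!top[i] || top[i]->count < s->count) {
			/* Shift the lower entries down by one; the last is dropped. */
			memmove(&top[i + 1], &top[i],
				(TOP_N - 1 - i) * sizeof(top[0]));
			top[i] = s;
			return;
		}
	}
}

/*
 * Format the list into buf. Empty slots are skipped and a NULL name is
 * replaced by a placeholder, so no NULL pointer reaches the string code.
 * Returns the number of entries written.
 */
static size_t top_n_to_text(char *buf, size_t len,
			    const struct demo_shrinker *top[TOP_N])
{
	size_t printed = 0, off = 0;

	if (len)
		buf[0] = '\0';

	for (size_t i = 0; i < TOP_N; i++) {
		if (!top[i])
			continue;  /* fewer than TOP_N shrinkers were recorded */

		const char *name = top[i]->name ? top[i]->name : "(unnamed)";
		int n = snprintf(buf + off, len - off, "%s: %lu\n",
				 name, top[i]->count);
		if (n < 0 || (size_t)n >= len - off)
			break;     /* output error or buffer full */
		off += (size_t)n;
		printed++;
	}
	return printed;
}
```

With only two shrinkers inserted, eight slots stay NULL. An unguarded loop over all ten would pass a NULL pointer to the formatting code, which is the kind of failure the trace is consistent with.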

g2p commented 3 weeks ago

Closing since those patches aren't in bcachefs-testing at the moment. There was some review feedback that might be relevant to the crash: https://lore.kernel.org/all/Zs6aRZrjqPXQue6r@dread.disaster.area/