openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Large kmem_alloc(73728, 0x1000) Crashes the Linux Kernel #16550

Open haruki3hhh opened 1 day ago

haruki3hhh commented 1 day ago

Hi, I'm using a bare-metal machine running kernel 5.15.0-122-generic #132-Ubuntu.

But I found that the ZFS module crashes while loading, for some reason:

[  142.218849] intel_rapl_common: Found RAPL domain package
[  142.218856] intel_rapl_common: Found RAPL domain core
[  142.218883] XFS (sde1): Mounting V5 Filesystem
[  142.219134] XFS (dm-1): Mounting V5 Filesystem
[  142.219786] intel_rapl_common: Found RAPL domain package
[  142.219791] intel_rapl_common: Found RAPL domain core
[  142.248794] XFS (sde1): Starting recovery (logdev: internal)
[  142.269322] XFS (sde1): Ending recovery (logdev: internal)
[  142.276507] xfs filesystem being mounted at /data supports timestamps until 2038 (0x7fffffff)
[  142.361299] ipmi_si IPI0001:00: The BMC does not support setting the recv irq bit, compensating, but the BMC needs to be fixed.
[  142.424505] XFS (dm-1): Starting recovery (logdev: internal)
[  142.429482] ipmi_si IPI0001:00: Using irq 10
[  142.468173] ipmi_si IPI0001:00: IPMI message handler: Found new BMC (man_id: 0x0002a2, prod_id: 0x0100, dev_id: 0x20)
[  142.556406] ipmi_si IPI0001:00: IPMI kcs interface initialized
[  142.559837] ipmi_ssif: IPMI SSIF Interface driver
[  142.599400] znvpair: module license 'CDDL' taints kernel.
[  142.599404] Disabling lock debugging due to kernel taint
[  142.637917] Large kmem_alloc(73728, 0x1000), please file an issue at:
               https://github.com/openzfs/zfs/issues/new
[  142.637923] CPU: 146 PID: 3813 Comm: modprobe Tainted: P           OE     5.15.0-122-generic #132-Ubuntu
[  142.637927] Hardware name: Dell Inc. PowerEdge R7525/0H3K7P, BIOS 2.8.4 06/23/2022
[  142.637930] Call Trace:
[  142.637934]  <TASK>
[  142.637938]  show_stack+0x52/0x5c
[  142.637949]  dump_stack_lvl+0x4a/0x63
[  142.637956]  dump_stack+0x10/0x16
[  142.637958]  spl_kmem_alloc_impl.cold+0x16/0x1b [spl]
[  142.637967]  spl_kmem_zalloc+0x19/0x20 [spl]
[  142.637974]  zstd_mempool_init+0x2d/0xf1 [zzstd]
[  142.637981]  ? zstd_meminit+0x2e/0x2e [zzstd]
[  142.637988]  zstd_meminit+0xe/0x2e [zzstd]
[  142.637995]  init_module+0x1c/0xe81 [zzstd]
[  142.638001]  do_one_initcall+0x49/0x1e0
[  142.638009]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638015]  ? kmem_cache_alloc_trace+0x19e/0x2e0
[  142.638022]  do_init_module+0x52/0x260
[  142.638028]  load_module+0xb45/0xbe0
[  142.638032]  __do_sys_finit_module+0xbf/0x120
[  142.638037]  __x64_sys_finit_module+0x18/0x20
[  142.638039]  x64_sys_call+0x1ac3/0x1fa0
[  142.638042]  do_syscall_64+0x56/0xb0
[  142.638047]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638049]  ? syscall_exit_to_user_mode+0x2c/0x50
[  142.638053]  ? x64_sys_call+0x1a81/0x1fa0
[  142.638055]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638057]  ? do_syscall_64+0x63/0xb0
[  142.638059]  ? do_syscall_64+0x63/0xb0
[  142.638061]  ? syscall_exit_to_user_mode+0x2c/0x50
[  142.638063]  ? x64_sys_call+0x1ac3/0x1fa0
[  142.638065]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638068]  ? do_syscall_64+0x63/0xb0
[  142.638069]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638071]  ? do_syscall_64+0x63/0xb0
[  142.638073]  ? do_syscall_64+0x63/0xb0
[  142.638075]  entry_SYSCALL_64_after_hwframe+0x6c/0xd6
[  142.638080] RIP: 0033:0x7ff2de70488d
[  142.638084] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
[  142.638086] RSP: 002b:00007ffce434e9e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  142.638090] RAX: ffffffffffffffda RBX: 00005574826b0130 RCX: 00007ff2de70488d
[  142.638092] RDX: 0000000000000000 RSI: 0000557480b93cd2 RDI: 0000000000000009
[  142.638093] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000002
[  142.638095] R10: 0000000000000009 R11: 0000000000000246 R12: 0000557480b93cd2
[  142.638096] R13: 00005574826b00f0 R14: 00005574826b5130 R15: 00005574826b6a40
[  142.638100]  </TASK>
[  142.638105] Large kmem_alloc(73728, 0x1000), please file an issue at:
               https://github.com/openzfs/zfs/issues/new
[  142.638107] CPU: 146 PID: 3813 Comm: modprobe Tainted: P           OE     5.15.0-122-generic #132-Ubuntu
[  142.638109] Hardware name: Dell Inc. PowerEdge R7525/0H3K7P, BIOS 2.8.4 06/23/2022
[  142.638110] Call Trace:
[  142.638111]  <TASK>
[  142.638112]  show_stack+0x52/0x5c
[  142.638115]  dump_stack_lvl+0x4a/0x63
[  142.638118]  dump_stack+0x10/0x16
[  142.638120]  spl_kmem_alloc_impl.cold+0x16/0x1b [spl]
[  142.638129]  spl_kmem_zalloc+0x19/0x20 [spl]
[  142.638135]  zstd_mempool_init+0x52/0xf1 [zzstd]
[  142.638142]  ? zstd_meminit+0x2e/0x2e [zzstd]
[  142.638148]  zstd_meminit+0xe/0x2e [zzstd]
[  142.638155]  init_module+0x1c/0xe81 [zzstd]
[  142.638161]  do_one_initcall+0x49/0x1e0
[  142.638164]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638166]  ? kmem_cache_alloc_trace+0x19e/0x2e0
[  142.638170]  do_init_module+0x52/0x260
[  142.638173]  load_module+0xb45/0xbe0
[  142.638177]  __do_sys_finit_module+0xbf/0x120
[  142.638182]  __x64_sys_finit_module+0x18/0x20
[  142.638184]  x64_sys_call+0x1ac3/0x1fa0
[  142.638187]  do_syscall_64+0x56/0xb0
[  142.638189]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638191]  ? syscall_exit_to_user_mode+0x2c/0x50
[  142.638193]  ? x64_sys_call+0x1a81/0x1fa0
[  142.638196]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638198]  ? do_syscall_64+0x63/0xb0
[  142.638199]  ? do_syscall_64+0x63/0xb0
[  142.638201]  ? syscall_exit_to_user_mode+0x2c/0x50
[  142.638203]  ? x64_sys_call+0x1ac3/0x1fa0
[  142.638206]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638208]  ? do_syscall_64+0x63/0xb0
[  142.638210]  ? srso_alias_return_thunk+0x5/0x7f
[  142.638212]  ? do_syscall_64+0x63/0xb0
[  142.638214]  ? do_syscall_64+0x63/0xb0
[  142.638216]  entry_SYSCALL_64_after_hwframe+0x6c/0xd6
[  142.638218] RIP: 0033:0x7ff2de70488d
[  142.638220] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
[  142.638222] RSP: 002b:00007ffce434e9e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  142.638224] RAX: ffffffffffffffda RBX: 00005574826b0130 RCX: 00007ff2de70488d
[  142.638226] RDX: 0000000000000000 RSI: 0000557480b93cd2 RDI: 0000000000000009
[  142.638227] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000002
[  142.638228] R10: 0000000000000009 R11: 0000000000000246 R12: 0000557480b93cd2
[  142.638230] R13: 00005574826b00f0 R14: 00005574826b5130 R15: 00005574826b6a40
[  142.638233]  </TASK>
[  142.809820] ZFS: Loaded module v2.1.5-1ubuntu6~22.04.4, ZFS pool version 5000, ZFS filesystem version 5
[  142.840778] loop0: detected capacity change from 0 to 635672
[  142.842757] loop1: detected capacity change from 0 to 643872
[  142.847100] loop2: detected capacity change from 0 to 642480
[  142.850036] loop3: detected capacity change from 0 to 642448
[  142.854395] loop4: detected capacity change from 0 to 113992
[  142.857920] loop5: detected capacity change from 0 to 113992
[  142.861698] loop6: detected capacity change from 0 to 130960
[  142.864809] loop7: detected capacity change from 0 to 131016
[  142.876078] loop8: detected capacity change from 0 to 152112
[  142.883401] loop9: detected capacity change from 0 to 152112
[  142.883800] loop10: detected capacity change from 0 to 135520
[  142.888345] loop11: detected capacity change from 0 to 135512
[  142.892263] loop12: detected capacity change from 0 to 37816
[  142.898449] loop13: detected capacity change from 0 to 37816
[  142.903791] loop14: detected capacity change from 0 to 19912
[  142.908324] loop15: detected capacity change from 0 to 19912
[  142.912219] loop16: detected capacity change from 0 to 22552
[  142.921654] loop17: detected capacity change from 0 to 178240
[  142.925781] loop18: detected capacity change from 0 to 178256
[  142.931514] loop19: detected capacity change from 0 to 79328
[  142.935040] loop20: detected capacity change from 0 to 79520
[  144.353219] XFS (dm-1): Ending recovery (logdev: internal)

rincebrain commented 1 day ago

That's not a crash; it's just a warning that the large allocation happened. I promise, you'd notice if it crashed.
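
For anyone who wants to check a dmesg dump for the same thing, here is a minimal sketch that separates the "Large kmem_alloc" warning from lines that usually accompany a real crash. The marker list (`CRASH_MARKERS`), the 4 KiB page size, and the `summarize` helper are assumptions made for illustration, not anything taken from the ZFS sources. In the log above, the module load also goes on to print "ZFS: Loaded module" at 142.809820, which is consistent with a warning rather than a crash.

```python
import re

# Matches the SPL warning line, e.g.
# "Large kmem_alloc(73728, 0x1000), please file an issue at:"
WARN_RE = re.compile(r"Large kmem_alloc\((\d+), (0x[0-9a-fA-F]+)\)")

# Assumption: a heuristic, non-exhaustive list of strings that tend to
# show up in dmesg when the kernel really did crash.
CRASH_MARKERS = ("Kernel panic", "Oops:", "BUG:", "general protection fault")

PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86_64


def summarize(log_text):
    """Return (warnings, crash_lines) found in a dmesg dump."""
    warnings = []
    crash_lines = []
    for line in log_text.splitlines():
        match = WARN_RE.search(line)
        if match:
            size = int(match.group(1))
            warnings.append({
                "bytes": size,
                "flags": int(match.group(2), 16),
                # Round up to whole pages.
                "pages": -(-size // PAGE_SIZE),
            })
        if any(marker in line for marker in CRASH_MARKERS):
            crash_lines.append(line)
    return warnings, crash_lines


if __name__ == "__main__":
    sample = (
        "[  142.637917] Large kmem_alloc(73728, 0x1000), please file an issue at:\n"
        "[  142.638105] Large kmem_alloc(73728, 0x1000), please file an issue at:\n"
        "[  142.809820] ZFS: Loaded module v2.1.5-1ubuntu6~22.04.4, "
        "ZFS pool version 5000, ZFS filesystem version 5\n"
    )
    warnings, crash_lines = summarize(sample)
    for w in warnings:
        print(f"warning: {w['bytes']} bytes ({w['pages']} pages), flags={w['flags']:#x}")
    print("crash markers found:", len(crash_lines))
```

An empty `crash_lines` list does not prove the system is healthy, but warnings with no crash markers and a later "Loaded module" line match what is described here.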