clang 16 built kernel crashes w. "BUG: kernel NULL pointer dereference, address: 00000007", gcc 13 built kernel with same config boots fine (6.7-rc1, x86_32)

ernsteiswuerfel commented 11 months ago

Hello, it's-a me again with my ye-olde crashing x86_32 box. 😉 CONFIG_STACKPROTECTOR is not set this time.

I gave kernel 6.7-rc1 a test ride and it crashes at boot with:

BUG: kernel NULL pointer dereference, address: 00000007
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
*pdpt = 0000000002398001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP PTI
CPU: 1 PID: 1 Comm: systemd Not tainted 6.7.0-rc1-P3 #1
Hardware name: LENOVO 2007F2G/2007F2G, BIOS 79ETE7WW (2.27 ) 03/21/2011
EIP: obj_cgroup_charge_pages+0xc/0xa8
Code: 75 ee eb cf 31 db 4b eb a0 e8 34 fe ff ff 89 c3 eb 93 8b 43 04 f0 83 00 01 eb b0 90 90 90 55 89 e5 53 57 56 83 ec 08 8b 7d 08 <8b> 71 08 f6 46 2c 01 75 38 8b 46 08 a8 03 74 2e 8b 46 0c 89 45 ec
EAX: 00000001 EBX: 00000000 ECX: ffffffff EDX: 00400cc0
ESI: ffffffff EDI: 00000001 EBP: c1155ce8 ESP: c1155cd4
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210286
CR0: 80050033 CR2: 00000007 CR3: 0204e000 CR4: 000006f0
Call Trace:
 ? show_regs+0x4e/0x5c
 ? __die_body+0x11/0x4c
 ? __die+0x21/0x30
 ? page_fault_oops+0x20f/0x238
 ? mt_find+0x94/0x15c
 ? kernelmode_fixup_or_oops+0x92/0xa8
 ? __bad_area_nosemaphore+0x40/0x168
 ? bad_area_nosemaphore+0xd/0x14
 ? exc_page_fault+0x277/0x32c
 ? doublefault_shim+0x100/0x100
 ? handle_exception+0x101/0x101
 ? add_swap_count_continuation+0x1af/0x204
 ? doublefault_shim+0x100/0x100
 ? obj_cgroup_charge_pages+0xc/0xa8
 ? doublefault_shim+0x100/0x100
 ? obj_cgroup_charge_pages+0xc/0xa8
 obj_cgroup_charge+0x8d/0xcc
 pcpu_alloc+0x107/0x5c0
 ? cgroup_apply_control_enable+0xb1/0x250
 __alloc_percpu_gfp+0x10/0x18
 mem_cgroup_css_alloc+0xea/0x498
 cgroup_apply_control_enable+0xb1/0x250
 ? css_populate_dir+0xb5/0xd0
 cgroup_mkdir+0x1a2/0x2f4
 ? css_task_iter_end+0xbc/0xbc
 kernfs_iop_mkdir+0x52/0x68
 ? kernfs_iop_lookup+0xc0/0xc0
 vfs_mkdir+0x149/0x198
 do_mkdirat+0x72/0xb4
 __ia32_sys_mkdir+0x23/0x2c
 __do_fast_syscall_32+0x86/0xb0
 ? kmem_cache_free+0x2c3/0x2f0
 ? putname+0x3c/0x48
 ? putname+0x3c/0x48
 ? putname+0x3c/0x48
 ? syscall_exit_to_user_mode+0x1d/0x90
 ? __do_fast_syscall_32+0x92/0xb0
 ? syscall_exit_to_user_mode+0x1d/0x90
 ? __do_fast_syscall_32+0x92/0xb0
 ? __ia32_sys_clock_gettime+0x86/0xa0
 ? syscall_exit_to_user_mode+0x1d/0x90
 ? __do_fast_syscall_32+0x92/0xb0
 ? irqentry_exit_to_user_mode+0xa/0x1c
 ? irqentry_exit+0x12/0x2c
 ? exc_page_fault+0x112/0x32c
 do_fast_syscall_32+0x29/0x54
 do_SYSENTER_32+0x12/0x18
 entry_SYSENTER_32+0x98/0xf1
EIP: 0xb7fc8539
Code: 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 0f 1f 00 58 b8 77 00 00 00 cd 80 90 0f 1f
EAX: ffffffda EBX: 00a89d50 ECX: 000001ed EDX: b79f9e4c
ESI: b7ab3614 EDI: 00ad7dc0 EBP: bfea7578 ESP: bfea7508
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
 ? asm_exc_nmi+0xb0/0x10d
Modules linked in: dmi_sysfs
CR2: 0000000000000007
---[ end trace 0000000000000000 ]---
EIP: obj_cgroup_charge_pages+0xc/0xa8
Code: 75 ee eb cf 31 db 4b eb a0 e8 34 fe ff ff 89 c3 eb 93 8b 43 04 f0 83 00 01 eb b0 90 90 90 55 89 e5 53 57 56 83 ec 08 8b 7d 08 <8b> 71 08 f6 46 2c 01 75 38 8b 46 08 a8 03 74 2e 8b 46 0c 89 45 ec
EAX: 00000001 EBX: 00000000 ECX: ffffffff EDX: 00400cc0
ESI: ffffffff EDI: 00000001 EBP: c1155ce8 ESP: c1155cd4
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00210286
CR0: 80050033 CR2: 00000007 CR3: 0204e000 CR4: 000006f0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
Kernel Offset: disabled
Rebooting in 40 seconds..

dmesg was captured via netconsole. The gcc-13 built kernel with the same .config boots just fine.
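A note on reading the oops above: the faulting bytes `<8b> 71 08` in the Code line decode to a 32-bit load from `8(%ecx)` into `%esi`, and the register dump shows ECX: ffffffff. That suggests the "NULL pointer dereference" at 00000007 is really an all-ones pointer plus a small offset, wrapped around at 32 bits. A minimal sketch of that arithmetic follows; the helper name `fault_address_32` is made up for illustration, and the constants are copied from the trace.

```python
# Sketch: reproduce the fault address from the 32-bit oops register dump.
# fault_address_32 is a hypothetical helper, not a kernel or tool API.

def fault_address_32(base: int, displacement: int) -> int:
    """Effective address of a 32-bit load, wrapped to 32 bits."""
    return (base + displacement) & 0xFFFFFFFF

ECX = 0xFFFFFFFF   # from the register dump in the oops
DISP = 0x08        # from "8b 71 08", i.e. a load from 8(%ecx)
CR2 = 0x00000007   # faulting address reported by the kernel

addr = fault_address_32(ECX, DISP)
print(f"computed {addr:#010x}, CR2 {CR2:#010x}")
```

If the two values agree, the pointer in ECX was probably never a valid object address in the first place, which is a different failure mode than a genuinely NULL pointer.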

Some data about the hardware:

 # inxi -bz
System:
  Kernel: 6.7.0-rc1-P3 arch: i686 bits: 32 Console: pty pts/0 Distro: Gentoo
    Base System release 2.14
Machine:
  Type: Laptop System: LENOVO product: 2007F2G v: ThinkPad T60
    serial: <filter>
  Mobo: LENOVO model: 2007F2G serial: <filter> BIOS: LENOVO
    v: 79ETE7WW (2.27 ) date: 03/21/2011
Battery:
  ID-1: BAT0 charge: 35.7 Wh (99.7%) condition: 35.8/56.2 Wh (63.7%)
CPU:
  Info: dual core Intel T2400 [MCP] speed (MHz): avg: 1000 min/max: 1000/1833
Graphics:
  Device-1: AMD RV515/M52 [Mobility Radeon X1300] driver: radeon v: kernel
  Display: x11 server: X.org v: 1.21.1.9 driver: X: loaded: radeon
    unloaded: fbdev,modesetting dri: r300 gpu: radeon
    resolution: <missing: xdpyinfo/xrandr> resolution: 1024x768
  API: OpenGL v: 4.5 Mesa 23.3.0-rc3 (git-65109bc8ac) renderer: llvmpipe
    (LLVM 16.0.6 128 bits)
Network:
  Device-1: Intel 82573L Gigabit Ethernet driver: e1000e
  Device-2: Intel PRO/Wireless 3945ABG [Golan] Network driver: iwl3945
Drives:
  Local Storage: total: 465.76 GiB used: 10.89 GiB (2.3%)
Info:
  Processes: 221 Uptime: 44m Memory: available: 2.95 GiB
  used: 472.7 MiB (15.6%) Shell: Bash inxi: 3.3.27

 # lspci 
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03)
00:01.0 PCI bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express PCI Express Root Port (rev 03)
00:1b.0 Audio device: Intel Corporation NM10/ICH7 Family High Definition Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 1 (rev 02)
00:1c.1 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 2 (rev 02)
00:1c.2 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 3 (rev 02)
00:1c.3 PCI bridge: Intel Corporation NM10/ICH7 Family PCI Express Port 4 (rev 02)
00:1d.0 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #1 (rev 02)
00:1d.1 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #2 (rev 02)
00:1d.2 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #3 (rev 02)
00:1d.3 USB controller: Intel Corporation NM10/ICH7 Family USB UHCI Controller #4 (rev 02)
00:1d.7 USB controller: Intel Corporation NM10/ICH7 Family USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 82801GBM/GHM (ICH7-M Family) SATA Controller [AHCI mode] (rev 02)
00:1f.3 SMBus: Intel Corporation NM10/ICH7 Family SMBus Controller (rev 02)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV515/M52 [Mobility Radeon X1300]
02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
03:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02)
15:00.0 CardBus bridge: Texas Instruments PCI1510 PC card Cardbus Controller

Kernel .config and both dmesg outputs attached. config_67-rc1_p3.txt clang16_dmesg_67-rc1_p3.txt gcc13_dmesg_67-rc1_p3.txt

nathanchance commented 11 months ago

If this doesn't happen with 6.6, can you bisect Linux to see what change introduced this in 6.7-rc1?

ernsteiswuerfel commented 11 months ago

The bisect was straightforward so it didn't take me long:

 # git bisect bad
e86828e5446d95676835679837d995dec188d2be is the first bad commit
commit e86828e5446d95676835679837d995dec188d2be
Author: Roman Gushchin <roman.gushchin@linux.dev>
Date:   Thu Oct 19 15:53:44 2023 -0700

    mm: kmem: scoped objcg protection

    Switch to a scope-based protection of the objcg pointer on slab/kmem
    allocation paths.  Instead of using the get_() semantics in the
    pre-allocation hook and put the reference afterwards, let's rely on the
    fact that objcg is pinned by the scope.

    It's possible because:
    1) if the objcg is received from the current task struct, the task is
       keeping a reference to the objcg.
    2) if the objcg is received from an active memcg (remote charging),
       the memcg is pinned by the scope and has a reference to the
       corresponding objcg.

    Link: https://lkml.kernel.org/r/20231019225346.1822282-5-roman.gushchin@linux.dev
    Signed-off-by: Roman Gushchin (Cruise) <roman.gushchin@linux.dev>
    Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
    Acked-by: Shakeel Butt <shakeelb@google.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Dennis Zhou <dennis@kernel.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

 include/linux/memcontrol.h |  9 +++++++++
 include/linux/sched/mm.h   |  4 ++++
 mm/memcontrol.c            | 47 ++++++++++++++++++++++++++++++++++++++++++++--
 mm/slab.h                  | 15 ++++++++-------

Simply reverting this rather big patch failed, however, so I was not able to test v6.7-rc1 with the change reverted for now.

bisect.log

nathanchance commented 11 months ago

That change is a part of a larger series. It reverts cleanly for me on 6.7-rc1 with the following git command:

git diff 7d0715d0d6b28a831b6fdfefb29c5a7a4929fa49^..e56808fef8f71a192b2740c0b6ea8be7ab865d54 | git apply -3 -R

Unfortunately, none of the sanitizers like KASAN, KCSAN, or KMSAN support 32-bit x86, so those won't really help us here :/ It may be worth reporting this upstream directly to see if the developer of that change could spot anything obviously wrong with it in this context. They could just be getting lucky that they are not hitting this issue with GCC.

ernsteiswuerfel commented 11 months ago

Your git command reverted the series cleanly. Thanks! With the series reverted, the kernel boots OK.

As suggested, I reported it upstream. The linux-mm@kvack.org mailing list seemed like the proper place to me: https://lore.kernel.org/linux-mm/20231115011506.0edd8870@yea/

ernsteiswuerfel commented 11 months ago

"Unfortunately, none of the sanitizers like KASAN, KCSAN, or KMSAN support 32-bit x86, so those won't really help us here :/ [...]"

Turns out the issue also happens on x86_64! At least I strongly suspect it's the same issue, as reverting the bisected patchset also 'fixes' the clang-16 built kernel.

The trace looks a bit different though. I also enabled KASAN:

general protection fault, probably for non-canonical address 0xf555515555555557: 0000 [#1] SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0xaaaaaaaaaaaaaab8-0xaaaaaaaaaaaaaabf]
CPU: 26 PID: 1 Comm: systemd Not tainted 6.7.0-rc1-Zen3 #1
Hardware name: To Be Filled By O.E.M. B450M Steel Legend/B450M Steel Legend, BIOS P8.01 03/14/2023
RIP: 0010:obj_cgroup_charge_pages+0x27/0x2d5
Code: 90 90 90 55 41 57 41 56 41 55 41 54 53 89 d5 41 89 f6 49 89 ff 48 b8 00 00 00 00 00 fc ff df 49 83 c7 10 4d 89 fd 49 c1 ed 03 <41> 80 7c 05 00 00 74 08 4c 89 ff e8 5e 3a fd ff 49 8b 1f 4c 8d 63
RSP: 0018:ffffc90000067a78 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: aaaaaaaaaaaaaaaa RCX: ffff8887df328b08
RDX: 000000000000000a RSI: 0000000000400cc0 RDI: aaaaaaaaaaaaaaaa
RBP: 000000000000000a R08: 3333333333333333 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887df328b18
R13: 1555555555555557 R14: 0000000000400cc0 R15: aaaaaaaaaaaaaaba
FS:  00007fd18c5cb8c0(0000) GS:ffff8887df300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005614629e5098 CR3: 0000000108066000 CR4: 0000000000b50ef0
Call Trace:
 <TASK>
 ? __die_body+0x16/0x75
 ? die_addr+0x4a/0x70
 ? exc_general_protection+0x1c9/0x2d0
 ? cgroup_mkdir+0x455/0x9fb
 ? __x64_sys_mkdir+0x69/0x80
 ? asm_exc_general_protection+0x26/0x30
 ? obj_cgroup_charge_pages+0x27/0x2d5
 obj_cgroup_charge+0x114/0x1ab
 pcpu_alloc+0x1a6/0xa65
 ? mem_cgroup_css_alloc+0x1eb/0x1140
 ? cgroup_apply_control_enable+0x26b/0x7c0
 mem_cgroup_css_alloc+0x23f/0x1140
 cgroup_apply_control_enable+0x26b/0x7c0
 ? cgroup_kn_set_ugid+0x2d/0x1a0
 ? srso_alias_return_thunk+0x5/0xfbef5
 cgroup_mkdir+0x455/0x9fb
 ? __cfi_cgroup_mkdir+0x10/0x10
 kernfs_iop_mkdir+0x130/0x170
 vfs_mkdir+0x405/0x530
 do_mkdirat+0x188/0x1f0
 __x64_sys_mkdir+0x69/0x80
 do_syscall_64+0x7d/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? syscall_exit_to_user_mode+0x23/0xc0
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? do_syscall_64+0x89/0x100
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fd18c7216e7
Code: 00 66 90 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 1b cc ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 b8 53 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 19 47 0d 00 f7 d8 64 89 02 b8
RSP: 002b:00007ffd5d347128 EFLAGS: 00000246 ORIG_RAX: 0000000000000053
RAX: ffffffffffffffda RBX: 00005614628edf30 RCX: 00007fd18c7216e7
RDX: 0000000000000000 RSI: 00000000000001ed RDI: 00005614628fbd80
RBP: 00007ffd5d347170 R08: 000000000000000e R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd18c8ce39a
R13: 00007ffd5d347140 R14: 00000000000000a0 R15: 00005614628c9560
 </TASK>
Modules linked in: efivarfs dmi_sysfs
---[ end trace 0000000000000000 ]---
RIP: 0010:obj_cgroup_charge_pages+0x27/0x2d5
Code: 90 90 90 55 41 57 41 56 41 55 41 54 53 89 d5 41 89 f6 49 89 ff 48 b8 00 00 00 00 00 fc ff df 49 83 c7 10 4d 89 fd 49 c1 ed 03 <41> 80 7c 05 00 00 74 08 4c 89 ff e8 5e 3a fd ff 49 8b 1f 4c 8d 63
RSP: 0018:ffffc90000067a78 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: aaaaaaaaaaaaaaaa RCX: ffff8887df328b08
RDX: 000000000000000a RSI: 0000000000400cc0 RDI: aaaaaaaaaaaaaaaa
RBP: 000000000000000a R08: 3333333333333333 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887df328b18
R13: 1555555555555557 R14: 0000000000400cc0 R15: aaaaaaaaaaaaaaba
FS:  00007fd18c5cb8c0(0000) GS:ffff8887df300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005614629e5098 CR3: 0000000108066000 CR4: 0000000000b50ef0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Kernel Offset: 0x37000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Rebooting in 40 seconds..

config_67-rc1_zen3.txt dmesg_67-rc1_zen3_02.log
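The addresses in the KASAN report above can be cross-checked by hand. The Code bytes before the faulting instruction load 0xdffffc0000000000 into RAX, add 0x10 to a copy of RDI (giving R15), and shift that right by 3 (giving R13), so the fault appears to be on the KASAN shadow byte rather than on the object itself. The sketch below redoes that calculation with values copied from the register dump; the function names are hypothetical, and the shadow offset is simply the RAX value from the trace.

```python
# Sketch: map the pattern-filled pointer to its KASAN shadow address and back.
# shadow_address and shadowed_range are hypothetical helper names.

MASK64 = (1 << 64) - 1
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # RAX in the trace
KASAN_SHADOW_SCALE_SHIFT = 3              # one shadow byte covers 8 bytes

def shadow_address(addr: int) -> int:
    """Shadow byte address that KASAN checks before touching addr."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64

def shadowed_range(shadow: int) -> tuple[int, int]:
    """First and last address covered by one shadow byte."""
    start = (((shadow - KASAN_SHADOW_OFFSET) & MASK64)
             << KASAN_SHADOW_SCALE_SHIFT) & MASK64
    return start, start + 7

RDI = 0xAAAAAAAAAAAAAAAA            # first argument, from the register dump
accessed = (RDI + 0x10) & MASK64    # matches R15
shadow = shadow_address(accessed)   # should match the GPF address
low, high = shadowed_range(shadow)  # should match the KASAN range

print(f"shadow {shadow:#x}, range [{low:#x}-{high:#x}]")
```

Both results line up with the report, which points at the 0xaaaa... value in RDI as the bad pointer rather than at anything KASAN-specific.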

nathanchance commented 11 months ago

I should have caught CONFIG_INIT_STACK_ALL_PATTERN=y...

Fix: https://lore.kernel.org/20231116025109.3775055-1-roman.gushchin@linux.dev/
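For readers unfamiliar with the option: CONFIG_INIT_STACK_ALL_PATTERN=y asks the compiler to fill otherwise-uninitialized stack variables with a repeating byte pattern, so a variable that is read before being assigned shows up as an obviously bogus value instead of leftover stack contents. The ffffffff registers in the 32-bit oops and the aaaaaaaaaaaaaaaa registers in the x86_64 one fit that picture. The following is only a heuristic sketch for spotting such values in a register dump; the set of fill bytes is an assumption based on the option's help text, the names are hypothetical, and an all-ones value can of course also be a legitimate -1.

```python
# Heuristic sketch: flag register values that look like a stack-fill pattern.
# KNOWN_FILL_BYTES and looks_pattern_filled are hypothetical names.

KNOWN_FILL_BYTES = {0xAA, 0xFF, 0xFE}  # assumption: bytes used for pattern init

def looks_pattern_filled(value: int, width_bytes: int) -> bool:
    """True if every byte of value is the same known fill byte."""
    raw = value.to_bytes(width_bytes, "little")
    return len(set(raw)) == 1 and raw[0] in KNOWN_FILL_BYTES

# A few registers copied from the two traces in this thread.
regs_32 = {"EAX": 0x00000001, "EBX": 0x00000000,
           "ECX": 0xFFFFFFFF, "ESI": 0xFFFFFFFF}
regs_64 = {"RBX": 0xAAAAAAAAAAAAAAAA, "RDI": 0xAAAAAAAAAAAAAAAA,
           "R08": 0x3333333333333333, "RCX": 0xFFFF8887DF328B08}

suspects_32 = sorted(n for n, v in regs_32.items() if looks_pattern_filled(v, 4))
suspects_64 = sorted(n for n, v in regs_64.items() if looks_pattern_filled(v, 8))

print("32-bit suspects:", suspects_32)
print("64-bit suspects:", suspects_64)
```

A hit from a check like this is only a hint to go looking for an uninitialized local in the faulting call path; it does not identify the variable.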

nathanchance commented 11 months ago

The patch is in Andrew Morton's tree now: https://git.kernel.org/akpm/mm/c/442ba5647c2edaa17cc6b766d81ea841c0eb89e8

It is on the mm-hotfixes-unstable branch currently, which means that SHA is not stable. I'll post the final hash when I close this issue due to the mainline merge.

nathanchance commented 11 months ago

https://git.kernel.org/linus/5f79489a73d77419d18952e0258efbd5ecb74770

ClangBuiltLinux/linux issue #1959