Open kees opened 2 months ago
Okay, reproduced this in a Debian i386 image. %esi for me is 00000008, fwiw:
[ 17.262774] BUG: kernel NULL pointer dereference, address: 0000000c
[ 17.263754] #PF: supervisor write access in kernel mode
[ 17.263754] #PF: error_code(0x0002) - not-present page
[ 17.263754] *pde = 00000000
[ 17.263754] Oops: 0002 [#1] PREEMPT SMP
[ 17.263754] CPU: 2 PID: 1805 Comm: cat Not tainted 6.9.0-rc2 #1
[ 17.263754] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 17.263754] EIP: __ubsan_handle_out_of_bounds+0x1b/0x150
[ 17.263754] Code: 5d c3 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 53 57 56 83 ec 2c 64 8b b
[ 17.263754] EAX: 00000008 EBX: c2b18000 ECX: 91b74c03 EDX: f57a9f88
[ 17.263754] ESI: 00000008 EDI: c1fb6d80 EBP: c3de5e58 ESP: c3de5e20
[ 17.263754] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246
[ 17.263754] CR0: 80050033 CR2: 0000000c CR3: 02c4c000 CR4: 00350ed0
And it's exactly where the other one is:
$ LLVM=1 scripts/faddr2line clang-17-32/vmlinux __ubsan_handle_out_of_bounds+0x1b/0x150
__ubsan_handle_out_of_bounds+0x1b/0x150:
arch_test_and_set_bit at lib/ubsan.c:0
(inlined by) test_and_set_bit at include/asm-generic/bitops/instrumented-atomic.h:72
(inlined by) was_reported at lib/ubsan.c:112
(inlined by) suppress_report at lib/ubsan.c:117
(inlined by) __ubsan_handle_out_of_bounds at lib/ubsan.c:407
Under Clang tip-of-tree, it crashes later ... ?!
[ 10.417811] UBSAN: array-index-out-of-bounds in (null):0:16705
[ 10.419513] BUG: kernel NULL pointer dereference, address: 00000000
[ 10.420487] #PF: supervisor read access in kernel mode
[ 10.420487] #PF: error_code(0x0000) - not-present page
[ 10.420487] *pde = 00000000
[ 10.420487] Oops: 0000 [#1] PREEMPT SMP
[ 10.420487] CPU: 3 PID: 1794 Comm: cat Not tainted 6.9.0-rc2 #1
[ 10.420487] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 10.420487] EIP: __ubsan_handle_out_of_bounds+0xab/0x190
[ 10.420487] Code: 9a 00 83 c4 04 b8 ff ff ff 7f 23 47 04 ff 77 08 50 ff 37 68 6b f7 75 d6 68 8d a
[ 10.420487] EAX: c1e76d60 EBX: f57d9f88 ECX: 00000000 EDX: f57d9f88
[ 10.420487] ESI: c2734c00 EDI: c1e76d60 EBP: c3eb1e5c ESP: c3eb1e24
[ 10.420487] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010286
[ 10.420487] CR0: 80050033 CR2: 00000000 CR3: 042e1000 CR4: 00350ed0
Crash is now in type_is_int()
static void val_to_string(char *str, size_t size, struct type_descriptor *type,
			  void *value)
{
	if (type_is_int(type)) {
...
Still crashes with Clang-HEAD+v6.1.87 and Clang-14+v6.9-rc4, so bisection seems unlikely to help.
Thanks for taking that up another level! I had not realized this was an issue inherent to Clang. I just assumed this was Clang's way of telling me there is a "UBSAN: array-index-out-of-bounds" situation. 🤔
If further testing is required, please let me know!
Ah, I think I figured it out. The handler calls aren't respecting -mregparm=3. Upstream bug opened: https://github.com/llvm/llvm-project/issues/89670
Clang is fixed and the kernel has a work-around for earlier versions with commit c5d49b4773aac19782a57b11f7860f92692b230c.
Many thanks!
Seen here: https://gitlab.freedesktop.org/drm/amd/-/issues/3323
Appeared under Clang 17 on ARCH=i386.
The Code disassembles to:
The NULL deref (actually offset NULL + 1028) happened during the bts above, which maps to the test_and_set_bit() below:
This should be impossible, though. struct out_of_bounds_data contains location as its first member:
So data->location.reported should be at offset data + sizeof(void *) (here, 4). This matches the assembly: DWORD PTR [esi+0x4], but %esi is 0x400.
Having a base address of 1024 seems like either a special value, a per-cpu variable, or some failed relocation?