lkrg-org / lkrg

Linux Kernel Runtime Guard
https://lkrg.org

Intermittent NULL pointer dereference in p_arch_static_call_transform_entry() on system shutdown #149

Open solardiz opened 2 years ago

solardiz commented 2 years ago

As reported by @vt-alt in https://github.com/lkrg-org/lkrg/issues/137#issuecomment-1016139265 and related to or the same as #85:

[  OK  ] Removed slice User Slice of UID 0.
         Stopping Permit User Sessions...
[  OK  ] Stopped Permit User Sessions.
[  OK  ] Stopped target Basic System.
[  OK  ] Stopped target Path Units.
[  OK  ] Stopped target Remote File Systems.
[    5.389751] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    5.391314] #PF: supervisor read access in kernel mode
[    5.391314] #PF: error_code(0x0000) - not-present page
[    5.391314] PGD 0 P4D 0 
[    5.391314] Oops: 0000 [#1] PREEMPT SMP NOPTI
[    5.391314] CPU: 0 PID: 422 Comm: systemd-udevd Tainted: G           OE     5.16.0-051600daily20220115-generic #202201142115
[    5.391314] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
[    5.391314] RIP: 0010:p_arch_static_call_transform_entry+0x146/0x1f0 [p_lkrg]
[    5.391314] Code: 00 00 8b 10 85 d2 0f 85 a6 00 00 00 8b 35 2a 39 02 00 85 f6 74 c5 48 8b 0d 27 39 02 00 eb 0b 83 c2 01 48 83 c1 60 39 d6 74 b1 <48> 3b 01 75 f0 89 15 9f 38 02 00 eb a4 48 8b 05 fe fc 01 00 4c 89
[    5.391314] RSP: 0018:ffffa281404fb9a8 EFLAGS: 00010206
[    5.391314] RAX: ffffffffc074ddc0 RBX: ffff9776742bea40 RCX: 0000000000000000
[    5.391314] RDX: 0000000000000000 RSI: 0000000000000028 RDI: ffffffffc06d66bf
[    5.391314] RBP: ffffa281404fb9b8 R08: 0000000000000000 R09: 0000000000000102
[    5.391314] R10: 0000000000000002 R11: 0000000000000000 R12: ffffffffc06d66bf
[    5.391314] R13: 0000000000000000 R14: ffffffffc0412a40 R15: ffff97767ec1ff40
[    5.391314] FS:  00007f5bc4de08c0(0000) GS:ffff97767ec00000(0000) knlGS:0000000000000000
[    5.391314] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.391314] CR2: 0000000000000000 CR3: 0000000033dc8001 CR4: 0000000000770ef0
[    5.391314] PKRU: 55555554
[    5.391314] Call Trace:
[    5.391314]  <TASK>
[    5.391314]  pre_handler_kretprobe+0x94/0x170
[    5.391314]  ? arch_static_call_transform+0x1/0xa0
[    5.391314]  kprobe_ftrace_handler+0x114/0x1c0
[    5.391314]  ? arch_static_call_transform+0x5/0xa0
[    5.391314]  ? kvm_emulate_as_nop+0xf/0x40 [kvm]
[    5.391314]  0xffffffffc04330e3
[    5.391314] RIP: 0010:arch_static_call_transform+0x1/0xa0
[    5.391314] Code: 00 5d c3 3c e9 74 fa 3c c3 74 f6 e9 e2 d6 be 00 0f b6 4a 04 38 4f 04 74 e8 81 3f 66 66 48 31 0f 85 cd d6 be 00 eb d0 66 90 e8 <4b> 1d 5f 1a 55 48 89 e5 41 56 49 89 fe 48 c7 c7 20 dd c6 a7 41 55
[    5.391314] RSP: 0018:ffffa281404fbb18 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[    5.391314] RAX: 0000000000000001 RBX: ffffffffc074db54 RCX: 0000000000000000
[    5.391314] RDX: ffffffffc084dbe0 RSI: 0000000000000000 RDI: ffffffffc06d66bf
[    5.391314] RBP: ffffa281404fbb80 R08: 0000000000000000 R09: 0000000000000102
[    5.391314] R10: 0000000000000002 R11: 0000000000000000 R12: ffffffffc074d194
[    5.391314] R13: ffffffffc06d66bf R14: ffffffffc0737500 R15: ffffffffc074d194
[    5.391314]  ? kvm_emulate_as_nop+0xf/0x40 [kvm]
[    5.391314]  ? handle_ept_misconfig+0x130/0x130 [kvm_intel]
[    5.391314]  ? kvm_emulate_as_nop+0xf/0x40 [kvm]
[    5.391314]  ? arch_static_call_transform+0x5/0xa0
[    5.391314]  ? __static_call_update+0x167/0x200
[    5.391314]  ? arch_static_call_transform+0x5/0xa0
[    5.391314]  ? __static_call_update+0x167/0x200
[    5.391314]  ? handle_ept_misconfig+0x130/0x130 [kvm_intel]
[    5.391314]  kvm_ops_static_call_update+0x378/0xb40 [kvm]
[    5.391314]  kvm_arch_hardware_setup+0x7d/0x1f0 [kvm]
[    5.391314]  kvm_init+0x9e/0x3d0 [kvm]
[    5.391314]  ? hardware_setup+0x571/0x571 [kvm_intel]
[    5.391314]  vmx_init+0xb6/0x14d [kvm_intel]
[    5.391314]  ? hardware_setup+0x571/0x571 [kvm_intel]
[    5.391314]  do_one_initcall+0x46/0x210
[    5.391314]  ? kmem_cache_alloc_trace+0x1a6/0x320
[    5.391314]  do_init_module+0x62/0x290
[    5.391314]  load_module+0xab7/0xb80
[    5.391314]  __do_sys_finit_module+0xbf/0x120
[    5.391314]  __x64_sys_finit_module+0x18/0x20
[    5.391314]  do_syscall_64+0x59/0xc0
[    5.391314]  ? do_syscall_64+0x69/0xc0
[    5.391314]  ? syscall_exit_to_user_mode+0x27/0x50
[    5.391314]  ? __x64_sys_newfstatat+0x1c/0x20
[    5.391314]  ? do_syscall_64+0x69/0xc0
[    5.391314]  ? ksys_lseek+0x81/0xc0
[    5.391314]  ? exit_to_user_mode_prepare+0x37/0xb0
[    5.391314]  ? syscall_exit_to_user_mode+0x27/0x50
[    5.391314]  ? __x64_sys_lseek+0x18/0x20
[    5.391314]  ? do_syscall_64+0x69/0xc0
[    5.391314]  ? do_syscall_64+0x69/0xc0
[    5.391314]  ? do_syscall_64+0x69/0xc0
[    5.391314]  ? irqentry_exit+0x33/0x40
[    5.391314]  ? exc_page_fault+0x89/0x180
[    5.391314]  ? asm_exc_page_fault+0x8/0x30
[    5.391314]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[    5.391314] RIP: 0033:0x7f5bc54da94d
[    5.391314] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b3 64 0f 00 f7 d8 64 89 01 48
[    5.391314] RSP: 002b:00007ffc54110c08 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    5.391314] RAX: ffffffffffffffda RBX: 0000560cbd69e110 RCX: 00007f5bc54da94d
[    5.391314] RDX: 0000000000000000 RSI: 00007f5bc566e441 RDI: 0000000000000011
[    5.391314] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
[    5.391314] R10: 0000000000000011 R11: 0000000000000246 R12: 00007f5bc566e441
[    5.391314] R13: 0000560cbd699150 R14: 0000560cbd69ac80 R15: 0000560cbd697580
[    5.391314]  </TASK>
[    5.391314] Modules linked in: virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper kvm_intel(+) cec joydev rc_core kvm fb_sys_fops syscopyarea rapl i2c_i801 input_leds sysfillrect psmouse sysimgblt lpc_ich i2c_smbus mac_hid sch_fq_codel drm ip_tables x_tables crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci serio_raw libahci virtio_scsi qemu_fw_cfg p_lkrg(OE) dm_mirror dm_region_hash dm_log virtio_rng autofs4
[    5.391314] CR2: 0000000000000000
[    5.391314] ---[ end trace 3a84b7f3f219db36 ]---
[    5.391314] RIP: 0010:p_arch_static_call_transform_entry+0x146/0x1f0 [p_lkrg]
[    5.391314] Code: 00 00 8b 10 85 d2 0f 85 a6 00 00 00 8b 35 2a 39 02 00 85 f6 74 c5 48 8b 0d 27 39 02 00 eb 0b 83 c2 01 48 83 c1 60 39 d6 74 b1 <48> 3b 01 75 f0 89 15 9f 38 02 00 eb a4 48 8b 05 fe fc 01 00 4c 89
[    5.391314] RSP: 0018:ffffa281404fb9a8 EFLAGS: 00010206
[    5.391314] RAX: ffffffffc074ddc0 RBX: ffff9776742bea40 RCX: 0000000000000000
[    5.391314] RDX: 0000000000000000 RSI: 0000000000000028 RDI: ffffffffc06d66bf
[    5.391314] RBP: ffffa281404fb9b8 R08: 0000000000000000 R09: 0000000000000102
[    5.391314] R10: 0000000000000002 R11: 0000000000000000 R12: ffffffffc06d66bf
[    5.391314] R13: 0000000000000000 R14: ffffffffc0412a40 R15: ffff97767ec1ff40
[    5.391314] FS:  00007f5bc4de08c0(0000) GS:ffff97767ec00000(0000) knlGS:0000000000000000
[    5.391314] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.391314] CR2: 0000000000000000 CR3: 0000000033dc8001 CR4: 0000000000770ef0
[    5.391314] PKRU: 55555554
[    5.391314] Kernel panic - not syncing: Fatal exception
[    5.391314] Kernel Offset: 0x24e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Adam-pi3 commented 2 years ago

@solardiz Can you reproduce it? Can we have more verbose logs? It looks like it's not LKRG being unloaded, but a completely different module.

solardiz commented 2 years ago

> Can you reproduce it? Can we have more verbose logs? It looks like it's not LKRG being unloaded, but a completely different module.

It'll probably take some stress-testing to reproduce this one. So far we've only seen it once, on CI. Maybe let's use debugging builds once we have a stress-testing setup separate from CI.
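
Until there is a reproducer, the oops above is the only data, so it may help to pull its key fields out mechanically when comparing against future reports. The following is a minimal sketch in Python, not an LKRG or kernel tool: the helper name `summarize_oops` and its regular expressions are illustrative assumptions, and the sample lines are copied from the log in this issue. For reference, the trapping bytes `48 3b 01` encode an x86-64 compare of RAX against the 8 bytes at the address in RCX, which is consistent with RCX and CR2 both being 0 in the dump.

```python
import re

# Lines copied verbatim from the oops in this issue.
SAMPLE = """\
[    5.391314] RIP: 0010:p_arch_static_call_transform_entry+0x146/0x1f0 [p_lkrg]
[    5.391314] Code: 00 00 8b 10 85 d2 0f 85 a6 00 00 00 8b 35 2a 39 02 00 85 f6 74 c5 48 8b 0d 27 39 02 00 eb 0b 83 c2 01 48 83 c1 60 39 d6 74 b1 <48> 3b 01 75 f0 89 15 9f 38 02 00 eb a4 48 8b 05 fe fc 01 00 4c 89
[    5.391314] RSP: 0018:ffffa281404fb9a8 EFLAGS: 00010206
[    5.391314] RAX: ffffffffc074ddc0 RBX: ffff9776742bea40 RCX: 0000000000000000
[    5.391314] RDX: 0000000000000000 RSI: 0000000000000028 RDI: ffffffffc06d66bf
[    5.391314] CR2: 0000000000000000 CR3: 0000000033dc8001 CR4: 0000000000770ef0
"""


def summarize_oops(text):
    """Extract a few triage fields from x86-64 oops text (hypothetical helper)."""
    summary = {}

    # First "RIP: <cs>:<symbol>+<offset>/<size> [<module>]" line.
    m = re.search(
        r"RIP: [0-9a-f]{4}:(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)(?: \[(\w+)\])?",
        text,
    )
    if m:
        summary["symbol"] = m.group(1)
        summary["offset"] = int(m.group(2), 16)
        summary["size"] = int(m.group(3), 16)
        summary["module"] = m.group(4)  # None for core kernel symbols

    # Faulting address as recorded in CR2.
    m = re.search(r"CR2: ([0-9a-f]{16})", text)
    if m:
        summary["cr2"] = int(m.group(1), 16)

    # In the first "Code:" line, the byte at the faulting RIP is wrapped in <>.
    m = re.search(r"Code: (.*)", text)
    if m:
        tokens = m.group(1).split()
        for i, tok in enumerate(tokens):
            if tok.startswith("<"):
                # Keep the marked byte and the two bytes that follow it.
                summary["trap_bytes"] = [t.strip("<>") for t in tokens[i:i + 3]]
                break

    # General-purpose registers; keep the first value seen for each name.
    regs = {}
    for name, value in re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})", text):
        regs.setdefault(name, int(value, 16))
    summary["regs"] = regs

    return summary


summary = summarize_oops(SAMPLE)
print(summary["symbol"], hex(summary["offset"]), summary["module"])
print("CR2 =", hex(summary["cr2"]), "RCX =", hex(summary["regs"]["RCX"]))
print("trapping bytes:", " ".join(summary["trap_bytes"]))
```

Running the same helper over logs from a debugging build or a stress-testing run would make it easy to check whether a new crash hits the same offset in `p_arch_static_call_transform_entry` with the same NULL register.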