milabs / khook

Linux Kernel hooking engine (x86)
GNU General Public License v2.0

Ubuntu 24.04 LTS AMD cpu crash #19

Closed geekjy closed 4 months ago

geekjy commented 5 months ago

I tested this: insmod works on an Intel CPU, but crashes on an AMD CPU.

geekjy commented 5 months ago

All tests used Linux kernel 6.8, on Fedora 40 and Ubuntu 24.04 LTS. On Intel CPUs it works; on AMD CPUs it always crashes. Other kernel versions work fine.

geekjy commented 5 months ago

[ 242.351280] [ T5535] khook_demo: module verification failed: signature and/or required key missing - tainting kernel
[ 242.364868] [ T18] general protection fault, maybe for address 0x80040033: 0000 [#1] PREEMPT SMP NOPTI
[ 242.364888] [ T18] CPU: 0 PID: 18 Comm: migration/0 Kdump: loaded Tainted: G OE 6.8.0-31-generic #31-Ubuntu
[ 242.364900] [ T18] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[ 242.364911] [ T18] Stopper: multi_cpu_stop+0x0/0x120 <- stop_machine_cpuslocked+0x13b/0x170
[ 242.364924] [ T18] RIP: 0010:khook_arch_sm_init_one+0x56/0x170 [khook_demo]
[ 242.364934] [ T18] Code: 76 06 eb 1c 48 8b 7b 10 48 01 c7 e8 e4 fe ff ff 48 98 48 03 43 30 48 89 43 30 48 83 f8 04 76 e4 fa 0f 20 c0 48 25 ff ff fe ff <0f> 22 c0 48 8b 7b 20 f6 43 28 01 0f 84 e9 00 00 00 48 c7 c2 39 1e
[ 242.364950] [ T18] RSP: 0018:ffffb7aa400bfe08 EFLAGS: 00010006
[ 242.364958] [ T18] RAX: 0000000080040033 RBX: ffffffffc0a830e0 RCX: 0000000000000000
[ 242.364967] [ T18] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 242.364975] [ T18] RBP: ffffb7aa400bfe10 R08: 0000000000000000 R09: 0000000000000000
[ 242.364983] [ T18] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb7aa4077bafc
[ 242.364992] [ T18] R13: 0000000000000002 R14: ffffffffa8e48660 R15: 0000000000000003
[ 242.365000] [ T18] FS: 0000000000000000(0000) GS:ffff89c437e00000(0000) knlGS:0000000000000000
[ 242.365010] [ T18] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 242.365017] [ T18] CR2: 000055654f694688 CR3: 0000000105234000 CR4: 0000000000f50ef0
[ 242.365033] [ T18] PKRU: 55555554
[ 242.365039] [ T18] Call Trace:
[ 242.365045] [ T18]
[ 242.365051] [ T18] ? show_regs+0x6d/0x80
[ 242.365059] [ T18] ? die_addr+0x37/0xa0
[ 242.365066] [ T18] ? exc_general_protection+0x1db/0x480
[ 242.365077] [ T18] ? asm_exc_general_protection+0x27/0x30
[ 242.365087] [ T18] ? khook_arch_sm_init_one+0x56/0x170 [khook_demo]
[ 242.365097] [ T18] ? khook_arch_sm_init_one+0x3c/0x170 [khook_demo]
[ 242.365106] [ T18] khook_sm_init_hooks+0x2b/0x50 [khook_demo]
[ 242.365115] [ T18] multi_cpu_stop+0x6e/0x120
[ 242.365122] [ T18] ? pfx_multi_cpu_stop+0x10/0x10
[ 242.365129] [ T18] cpu_stopper_thread+0x99/0x170
[ 242.365137] [ T18] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 242.365146] [ T18] smpboot_thread_fn+0xe0/0x1e0
[ 242.365153] [ T18] kthread+0xef/0x120
[ 242.365160] [ T18] ? pfx_kthread+0x10/0x10
[ 242.365167] [ T18] ret_from_fork+0x44/0x70
[ 242.365174] [ T18] ? __pfx_kthread+0x10/0x10
[ 242.365181] [ T18] ret_from_fork_asm+0x1b/0x30
[ 242.365190] [ T18]
[ 242.365195] [ T18] Modules linked in: khook_demo(OE+) qrtr intel_rapl_msr intel_rapl_common vmw_balloon vmwgfx drm_ttm_helper ttm i2c_piix4 vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci cfg80211 binfmt_misc joydev input_leds mac_hid serio_raw dm_multipath msr efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 mptspi mptscsih psmouse ahci libahci mptbase e1000 scsi_transport_spi pata_acpi aesni_intel crypto_simd cryptd

milabs commented 5 months ago

I guess it's a virtual machine issue...

geekjy commented 5 months ago

(attachment: IMG_2071) It also crashes on a physical machine with an AMD 7950X.

milabs commented 5 months ago

OK, this is the code that failed:

[ 242.364934] [ T18] Code: 76 06 eb 1c 48 8b 7b 10 48 01 c7 e8 e4 fe ff ff 48 98 48 03 43 30 48 89 43 30 48 83 f8 04 76 e4 fa 0f 20 c0 48 25 ff ff fe ff <0f> 22 c0 48 8b 7b 20 f6 43 28 01 0f 84 e9 00 00 00 48 c7 c2 39 1e

The following instruction failed:

0f 22 c0    mov    %rax,%cr0

I will try to understand why it fails that way on AMD but not on Intel CPUs...
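For readers following along: in an oops `Code:` line, the byte wrapped in `<...>` is the first byte of the instruction at the faulting RIP. Here the bytes just before it decode to `fa` (`cli`), `0f 20 c0` (`mov %cr0,%rax`) and `48 25 ff ff fe ff` (`and $0xfffffffffffeffff,%rax`), so CR0 is read and bit 16 (WP) is cleared right before the failing write. Below is a minimal user-space sketch for pulling the marked byte out of such a line. The helper name `parse_oops_code` is made up for illustration and is not part of khook or the kernel.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the hex bytes of an oops "Code:" line into out[].
 * Stores the number of parsed bytes in *n and returns the index of the byte
 * that was wrapped in <...> (the first byte at the faulting RIP), or -1 if
 * no byte was marked.
 */
static long parse_oops_code(const char *code, unsigned char *out,
                            size_t cap, size_t *n)
{
    long marked = -1;

    assert(code != NULL && out != NULL && n != NULL);
    *n = 0;
    while (*code != '\0' && *n < cap) {
        if (*code == '<') {
            marked = (long)*n;      /* the next byte is the faulting one */
            code++;
        } else if (isxdigit((unsigned char)*code)) {
            char *end;

            out[(*n)++] = (unsigned char)strtoul(code, &end, 16);
            code = end;             /* continue after the parsed byte */
        } else {
            code++;                 /* skip spaces and the closing '>' */
        }
    }
    return marked;
}
```

Fed the tail of the line above (`fa 0f 20 c0 48 25 ff ff fe ff <0f> 22 c0`), it reports index 10, and the three bytes starting there are `0f 22 c0`, the `mov %rax,%cr0` that faulted.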

geekjy commented 5 months ago

ok

geekjy commented 5 months ago

unsigned long cr0;

static inline void write_cr0_forced(unsigned long val)
{
    unsigned long __force_order;

    asm volatile(
        "mov %0, %%cr0"
        : "+r"(val), "+m"(__force_order));
}

static int __init rootkit_init(void)
{
    cr0 = read_cr0();
    printk(KERN_INFO "cr0 is %lx\n", cr0);
    write_cr0_forced(cr0 & ~0x00010000);  /* clear CR0.WP (bit 16) */
    write_cr0_forced(cr0);                /* restore the original CR0 */
    return 0;
}

I wrote this very simple CR0 operation, and it crashed as well. However, other virtual machines on the same computer with kernels older than 6.8 do not crash.

geekjy commented 5 months ago

[ 367.805404] [ T2002] cr0 is 80050033
[ 367.805512] [ T2002] RIP: 0010:rootkit_init+0x82/0xff0 [main]
[ 367.805521] [ T2002] Code: b4 c0 48 89 c6 48 89 05 4c c5 ff ff e8 07 a0 c6 e6 48 8b 05 40 c5 ff ff 48 c7 45 e8 00 00 00 00 48 89 c2 48 81 e2 ff ff fe ff <0f> 22 c2 48 c7 45 e8 00 00 00 00 0f 22 c0 48 c7 c7 83 20 b4 c0 e8
[ 367.805537] [ T2002] RSP: 0018:ffffaaf6c092fac8 EFLAGS: 00010206
[ 367.805545] [ T2002] RAX: 0000000080050033 RBX: ffffffffa7828b50 RCX: 0000000000000000
[ 367.805553] [ T2002] RDX: 0000000080040033 RSI: 0000000000000000 RDI: 0000000000000000
[ 367.805561] [ T2002] RBP: ffffaaf6c092fae0 R08: 0000000000000000 R09: 0000000000000000
[ 367.805569] [ T2002] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 367.805577] [ T2002] R13: ffff978bc0d8f260 R14: ffffaaf6c092faf0 R15: 0000000000000000
[ 367.805585] [ T2002] FS: 000071f8a73b8080(0000) GS:ffff978bf7e00000(0000) knlGS:0000000000000000
[ 367.805594] [ T2002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 367.805602] [ T2002] CR2: 0000629172bda688 CR3: 000000010534e000 CR4: 0000000000f50ef0
[ 367.805617] [ T2002] PKRU: 55555554
[ 367.805622] [ T2002] Call Trace:
[ 367.805628] [ T2002]
[ 367.805634] [ T2002] ? show_regs+0x6d/0x80
[ 367.805641] [ T2002] ? die_addr+0x37/0xa0
[ 367.805648] [ T2002] ? exc_general_protection+0x1db/0x480
[ 367.805658] [ T2002] ? asm_exc_general_protection+0x27/0x30
[ 367.805666] [ T2002] ? pfx_kallsyms_lookup_name+0x10/0x10
[ 367.805675] [ T2002] ? rootkit_init+0x82/0xff0 [main]
[ 367.805684] [ T2002] ? rootkit_init+0x69/0xff0 [main]
[ 367.805692] [ T2002] ? __pfx_rootkit_init+0x10/0x10 [main]
[ 367.805700] [ T2002] do_one_initcall+0x5b/0x340
[ 367.805709] [ T2002] do_init_module+0xc0/0x2c0
[ 367.805718] [ T2002] load_module+0xba1/0xcf0
[ 367.805725] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.805733] [ T2002] ? security_kernel_post_read_file+0x75/0x90
[ 367.805743] [ T2002] init_module_from_file+0x96/0x100
[ 367.805750] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.805757] [ T2002] ? init_module_from_file+0x96/0x100
[ 367.805766] [ T2002] idempotent_init_module+0x11c/0x2b0
[ 367.805774] [ T2002] x64_sys_finit_module+0x64/0xd0
[ 367.805781] [ T2002] x64_sys_call+0x1d6e/0x25c0
[ 367.805787] [ T2002] do_syscall_64+0x7f/0x180
[ 367.805794] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.805801] [ T2002] ? do_syscall_64+0x8c/0x180
[ 367.805808] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.805815] [ T2002] ? ksys_read+0x73/0x100
[ 367.805824] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.805831] [ T2002] ? syscall_exit_to_user_mode+0x86/0x260
[ 367.805839] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.805846] [ T2002] ? do_syscall_64+0x8c/0x180
[ 367.805853] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.806027] [ T2002] ? do_syscall_64+0x8c/0x180
[ 367.806180] [ T2002] ? do_syscall_64+0x8c/0x180
[ 367.806331] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.806492] [ T2002] ? irqentry_exit+0x43/0x50
[ 367.806635] [ T2002] ? srso_alias_return_thunk+0x5/0xfbef5
[ 367.806776] [ T2002] ? exc_page_fault+0x94/0x1b0
[ 367.806922] [ T2002] entry_SYSCALL_64_after_hwframe+0x73/0x7b
[ 367.807056] [ T2002] RIP: 0033:0x71f8a6b2725d
[ 367.807188] [ T2002] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48
[ 367.807467] [ T2002] RSP: 002b:00007ffd98606208 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 367.807609] [ T2002] RAX: ffffffffffffffda RBX: 0000629172bd6700 RCX: 000071f8a6b2725d
[ 367.807750] [ T2002] RDX: 0000000000000000 RSI: 0000629172356e52 RDI: 0000000000000003
[ 367.807892] [ T2002] RBP: 00007ffd986062c0 R08: 0000000000000040 R09: 0000000000000000
[ 367.808043] [ T2002] R10: 000071f8a6c03b20 R11: 0000000000000246 R12: 0000629172356e52
[ 367.808184] [ T2002] R13: 0000000000000000 R14: 0000629172bd63a0 R15: 0000000000000000
[ 367.808326] [ T2002]
[ 367.808476] [ T2002] Modules linked in: main(OE+) intel_rapl_msr intel_rapl_common vmw_balloon qrtr vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock cfg80211 binfmt_misc vmwgfx i2c_piix4 drm_ttm_helper vmw_vmci ttm joydev input_leds mac_hid serio_raw dm_multipath msr efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 psmouse mptspi mptscsih ahci e1000 mptbase libahci scsi_transport_spi pata_acpi aesni_intel crypto_simd cryptd [last unloaded: main(OE)]

milabs commented 5 months ago

@geekjy CET might be the reason. Could you make a change along the lines of https://lore.kernel.org/all/20211126123446.32324-59-andrew.cooper3@citrix.com/ and check whether it helps?
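Some background on why CET would matter here: as far as I know, the x86 architecture manuals specify that CR0.WP cannot be cleared while CR4.CET is set (and CR4.CET cannot be set while CR0.WP is clear); either attempt raises #GP. The register dumps above show CR4 = 0xf50ef0, which has bit 23 (CET) set, and the faulting write tried to load 0x80040033 into CR0, that is, with bit 16 (WP) cleared. Below is a small user-space sketch of just that one rule. The function name is made up, and all other CR0/CR4 consistency checks are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR0_WP  (UINT64_C(1) << 16)   /* CR0 write-protect bit */
#define CR4_CET (UINT64_C(1) << 23)   /* CR4 control-flow enforcement bit */

/*
 * Hypothetical model of one architectural rule only:
 * loading a CR0 value with WP clear while CR4.CET is set raises #GP.
 * Returns true when the write would fault under that rule.
 */
static bool cr0_write_would_gp(uint64_t new_cr0, uint64_t cr4)
{
    bool wp_cleared = (new_cr0 & CR0_WP) == 0;
    bool cet_enabled = (cr4 & CR4_CET) != 0;

    assert((new_cr0 >> 32) == 0);     /* upper CR0 bits are reserved */
    return wp_cleared && cet_enabled;
}
```

With the values from the logs, `cr0_write_would_gp(0x80040033, 0xf50ef0)` is true, while the same CR0 value against a CR4 with bit 23 clear (for example 0x750ef0) is not. That would be consistent with the crash appearing only where the kernel has turned CET on.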

geekjy commented 4 months ago

Very good, I tested it and it loaded successfully

geekjy commented 4 months ago

(attachments: cr4, cr44)

milabs commented 4 months ago

Would you mind sharing the fix?

geekjy commented 4 months ago

#define kernel_write_enter() asm volatile ( \
    "cli\n\t"                               \
    "mov %%cr4, %%rbx\n\t"                  \
    "and $~(1 << 23), %%rbx\n\t"            \
    "mov %%rbx, %%cr4\n\t"                  \
    "mov %%cr0, %%rax\n\t"                  \
    "and $0xfffffffffffeffff, %%rax\n\t"    \
    "mov %%rax, %%cr0\n\t"                  \
    ::: "%rax", "%rbx" )

#define kernel_write_leave() asm volatile ( \
    "mov %%cr0, %%rax\n\t"                  \
    "or $0x0000000000010000, %%rax\n\t"     \
    "mov %%rax, %%cr0\n\t"                  \
    "mov %%cr4, %%rbx\n\t"                  \
    "or $1 << 23, %%rbx\n\t"                \
    "mov %%rbx, %%cr4\n\t"                  \
    "sti\n\t"                               \
    ::: "%rax", "%rbx" )

milabs commented 4 months ago

@geekjy please check the proposed fix

geekjy commented 4 months ago

make -C /lib/modules/6.8.0-31-generic/build M=$PWD modules
make[1]: Entering directory '/usr/src/linux-headers-6.8.0-31-generic'
warning: the compiler differs from the one used to build the kernel
The kernel was built by: x86_64-linux-gnu-gcc-13 (Ubuntu 13.2.0-23ubuntu4) 13.2.0
You are using: gcc-13 (Ubuntu 13.2.0-23ubuntu4) 13.2.0
CC [M] /root/khook/khook_demo/main.o
CC [M] /root/khook/khook_demo/../khook/engine.o
CC [M] /root/khook/khook_demo/../khook/x86/hook.o
/root/khook/khook_demo/../khook/x86/hook.c: In function ‘khook_arch_write_kernel’:
/root/khook/khook_demo/../khook/x86/hook.c:148:15: error: implicit declaration of function ‘read_cr4’; did you mean ‘read_cr2’? [-Werror=implicit-function-declaration]
148 | cr4 = read_cr4();
| ^~~~
| read_cr2
/root/khook/khook_demo/../khook/x86/hook.c:151:17: error: implicit declaration of function ‘write_cr4’; did you mean ‘write_cr3’? [-Werror=implicit-function-declaration]
151 | write_cr4(cr4 & ~X86_CR4_CET);
| ^~~~~
| write_cr3
cc1: some warnings being treated as errors
make[3]: [scripts/Makefile.build:243: /root/khook/khook_demo/../khook/x86/hook.o] Error 1
make[2]: [/usr/src/linux-headers-6.8.0-31-generic/Makefile:1926: /root/khook/khook_demo] Error 2
make[1]: [Makefile:240: __sub-make] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-6.8.0-31-generic'
make: [Makefile:4: all] Error 2

geekjy commented 4 months ago

I tested the following code and it works:

#ifdef i686

#define kernel_write_enter() asm volatile ( \
    "cli\n\t"                                  \
    "mov %%cr4, %%ebx\n\t"                     \
    "test $0x800000, %%ebx\n\t"                \
    "jz 1f\n\t"                                \
    "and $~0x800000, %%ebx\n\t"                \
    "mov %%ebx, %%cr4\n\t"                     \
    "1:\n\t"                                   \
    "mov %%cr0, %%eax\n\t"                     \
    "and $0xfffeffff, %%eax\n\t"               \
    "mov %%eax, %%cr0\n\t"                     \
    ::: "%eax", "%ebx")

#define kernel_write_leave() asm volatile ( \
    "mov %%cr0, %%eax\n\t"                     \
    "or $0x00010000, %%eax\n\t"                \
    "mov %%eax, %%cr0\n\t"                     \
    "mov %%cr4, %%ebx\n\t"                     \
    "test $0x800000, %%ebx\n\t"                \
    "jz 1f\n\t"                                \
    "or $0x800000, %%ebx\n\t"                  \
    "mov %%ebx, %%cr4\n\t"                     \
    "1:\n\t"                                   \
    "sti\n\t"                                  \
    ::: "%eax", "%ebx")

#else

#define kernel_write_enter() asm volatile ( \
    "cli\n\t"                                  \
    "mov %%cr4, %%rbx\n\t"                     \
    "test $0x800000, %%rbx\n\t"                \
    "jz 1f\n\t"                                \
    "and $~0x800000, %%rbx\n\t"                \
    "mov %%rbx, %%cr4\n\t"                     \
    "1:\n\t"                                   \
    "mov %%cr0, %%rax\n\t"                     \
    "and $0xfffffffffffeffff, %%rax\n\t"       \
    "mov %%rax, %%cr0\n\t"                     \
    ::: "%rax", "%rbx")

#define kernel_write_leave() asm volatile ( \
    "mov %%cr0, %%rax\n\t"                     \
    "or $0x0000000000010000, %%rax\n\t"        \
    "mov %%rax, %%cr0\n\t"                     \
    "mov %%cr4, %%rbx\n\t"                     \
    "test $0x800000, %%rbx\n\t"                \
    "jz 1f\n\t"                                \
    "or $0x800000, %%rbx\n\t"                  \
    "mov %%rbx, %%cr4\n\t"                     \
    "1:\n\t"                                   \
    "sti\n\t"                                  \
    ::: "%rax", "%rbx")

#endif

milabs commented 4 months ago

@geekjy thanks for sharing your code. I've used a slightly different approach to simplify maintenance, though; please check the update one more time.

geekjy commented 4 months ago

I tested khook-cet-fix and it works, thanks!

milabs commented 4 months ago

@geekjy merged to master