memflow / memflow-kvm

Linux kernel module for memflow's KVM connector
MIT License

Failing to load kmod on NixOS (kernel 5.10) #5

Open weewoo22 opened 2 years ago

weewoo22 commented 2 years ago

After compiling and inserting the kernel module for my system, it fails to initialize with:

[  681.160492] do_init_module: 'memflow'->init suspiciously returned 9, it should follow 0/-E convention
               do_init_module: loading module anyway...
[  681.160495] CPU: 3 PID: 1031 Comm: systemd-modules Tainted: G           O      5.10.70 #1-NixOS

[  681.160496] Call Trace:
[  681.160502]  dump_stack+0x6b/0x83
[  681.160506]  do_init_module.cold+0x21/0x26
[  681.160508]  __do_sys_finit_module+0xb1/0x110
[  681.160511]  do_syscall_64+0x33/0x40
[  681.160513]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  681.160514] RIP: 0033:0x7f32261302a9
[  681.160516] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 3b 0c 00 f7 d8 64 89 01 48
[  681.160517] RSP: 002b:00007ffe7801c0e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  681.160519] RAX: ffffffffffffffda RBX: 00005610a4122590 RCX: 00007f32261302a9
[  681.160519] RDX: 0000000000000000 RSI: 00007f322620e9bd RDI: 0000000000000006
[  681.160520] RBP: 0000000000020000 R08: 0000000000000000 R09: 00005610a41226a0
[  681.160520] R10: 0000000000000006 R11: 0000000000000246 R12: 00007f322620e9bd
[  681.160521] R13: 0000000000000000 R14: 00005610a4122230 R15: 00005610a4122590
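
For context on the warning above: the module loader aborts the load when a module's init function returns a negative value, while a positive value only triggers this warning and the module stays loaded. The following is a minimal Python model of that decision, not kernel code; the function name `classify_init_return` is made up for illustration.

```python
def classify_init_return(ret: int) -> str:
    """Model how the module loader reacts to an init function's return value.

    Illustrative only: the real check lives in the kernel's do_init_module().
    """
    if ret < 0:
        # Negative errno (the "-E" half of the convention): load is aborted.
        return "load aborted"
    if ret > 0:
        # Positive value: outside the convention, warned about, loaded anyway.
        return "loaded with warning"
    # Zero: success.
    return "loaded"


for value in (0, -22, 9):
    print(value, "->", classify_init_return(value))
```

The value 9 from the log falls into the "loaded with warning" branch, which matches the "loading module anyway..." line. Since the module stays loaded, a later removal still runs its exit function.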

Forcefully removing the kernel module with modprobe -rf memflow after load failure results in:

[ 1488.717423] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 1488.717427] #PF: supervisor write access in kernel mode
[ 1488.717427] #PF: error_code(0x0002) - not-present page
[ 1488.717428] PGD 50d548067 P4D 50d548067 PUD 60e1bb067 PMD 0
[ 1488.717431] Oops: 0002 [#1] SMP NOPTI
[ 1488.717433] CPU: 0 PID: 6763 Comm: modprobe Tainted: G           O      5.10.70 #1-NixOS

[ 1488.717438] RIP: 0010:misc_deregister+0x39/0xa0
[ 1488.717439] Code: 53 48 8b 57 18 2b 2f 48 39 c2 74 75 48 89 fb 48 c7 c7 a0 dc 3b bb e8 76 88 29 00 48 8b 43 20 48 8b 53 18 48 8b 3d 67 d6 40 01 <48> 89 42 08 48 89 10 8b 33 48 b8 00 01 00 00 00 00 ad de 48 89 43
[ 1488.717441] RSP: 0018:ffffaef3c1f77ec8 EFLAGS: 00010246
[ 1488.717442] RAX: 0000000000000000 RBX: ffffffffc0a95000 RCX: 0000000000000000
[ 1488.717443] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffff937ec0d96c00
[ 1488.717444] RBP: 00000000ffffffd2 R08: ffffaef3c1f77ee8 R09: 8080808080808080
[ 1488.717444] R10: 0000000000000037 R11: ffffaef3c1f77ee8 R12: 0000000000000000
[ 1488.717445] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1488.717447] FS:  00007fc00ac9a740(0000) GS:ffff9385efa00000(0000) knlGS:0000000000000000
[ 1488.717448] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1488.717448] CR2: 0000000000000008 CR3: 00000005e1f22006 CR4: 00000000001726f0
[ 1488.717449] Call Trace:
[ 1488.717455]  memflow_exit+0x11/0x20 [memflow]
[ 1488.717458]  __do_sys_delete_module+0x19d/0x270
[ 1488.717461]  do_syscall_64+0x33/0x40
[ 1488.717464]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1488.717465] RIP: 0033:0x7fc00ad98eb7
[ 1488.717467] Code: 73 01 c3 48 8b 0d b9 df 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 df 0b 00 f7 d8 64 89 01 48
[ 1488.717468] RSP: 002b:00007ffc4404bd88 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1488.717469] RAX: ffffffffffffffda RBX: 0000000000fedc00 RCX: 00007fc00ad98eb7
[ 1488.717470] RDX: 0000000000000001 RSI: 0000000000000a00 RDI: 0000000000fedc68
[ 1488.717471] RBP: 0000000000fedc00 R08: 0000000000000000 R09: 00007fc00ae08ae0
[ 1488.717471] R10: 00007fc00ae093e0 R11: 0000000000000206 R12: 0000000000fedc68
[ 1488.717472] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000fedc00

[ 1488.717561] CR2: 0000000000000008
[ 1488.717563] ---[ end trace 67f72b5fdad80535 ]---
[ 1488.737637] RIP: 0010:misc_deregister+0x39/0xa0
[ 1488.737645] Code: 53 48 8b 57 18 2b 2f 48 39 c2 74 75 48 89 fb 48 c7 c7 a0 dc 3b bb e8 76 88 29 00 48 8b 43 20 48 8b 53 18 48 8b 3d 67 d6 40 01 <48> 89 42 08 48 89 10 8b 33 48 b8 00 01 00 00 00 00 ad de 48 89 43
[ 1488.737647] RSP: 0018:ffffaef3c1f77ec8 EFLAGS: 00010246
[ 1488.737651] RAX: 0000000000000000 RBX: ffffffffc0a95000 RCX: 0000000000000000
[ 1488.737651] RDX: 0000000000000000 RSI: 0000000000000007 RDI: ffff937ec0d96c00
[ 1488.737652] RBP: 00000000ffffffd2 R08: ffffaef3c1f77ee8 R09: 8080808080808080
[ 1488.737654] R10: 0000000000000037 R11: ffffaef3c1f77ee8 R12: 0000000000000000
[ 1488.737654] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1488.737656] FS:  00007fc00ac9a740(0000) GS:ffff9385efa00000(0000) knlGS:0000000000000000
[ 1488.737657] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1488.737658] CR2: 0000000000000008 CR3: 00000005e1f22006 CR4: 00000000001726f0

I'm on kernel 5.10.70, as you can see in dmesg.

h33p commented 2 years ago

Never mind what I posted earlier, I found the issue. I will push an update soon, but the module will most likely not work for you. Could you check whether CONFIG_KALLSYMS is enabled on your kernel?

weewoo22 commented 2 years ago

$ zgrep CONFIG_KALLSYMS /proc/config.gz
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y

h33p commented 2 years ago

Okay, kallsyms are definitely on. I've pushed an update to the kallsyms-mod submodule; it should now return correctly and print which symbol it failed to look up. It could be failing either because the symbols for kvm_lock and vm_list are not exported for some reason, or because the kvm module is not loaded into the kernel.

On the former, I've noticed that my kernel has CONFIG_KALLSYMS_ALL enabled (I thought it was off), so that could be it. If this is a requirement, then that's an oversight on my end, as I thought it wasn't needed, but a kernel with this option enabled would fix it.
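
Both possibilities mentioned here can be checked from userspace. Below is a rough Python sketch, assuming the usual `/proc/kallsyms` and `/proc/modules` text layouts; the helper names are hypothetical and not part of memflow. It reports whether `kvm_lock` and `vm_list` show up in kallsyms and whether a module named `kvm` is loaded.

```python
from pathlib import Path

WANTED = ("kvm_lock", "vm_list")


def visible_symbols(kallsyms_text: str, wanted=WANTED) -> dict:
    """Return {name: True/False} for each wanted symbol name.

    Each /proc/kallsyms line looks like: "<address> <type> <name> [module]".
    """
    seen = set()
    for line in kallsyms_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            seen.add(fields[2])
    return {name: name in seen for name in wanted}


def module_loaded(modules_text: str, name: str = "kvm") -> bool:
    """The first field of each /proc/modules line is the module name."""
    return any(
        line.split()[0] == name
        for line in modules_text.splitlines()
        if line.strip()
    )


def read_or_none(path: str):
    """Read a text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    kallsyms = read_or_none("/proc/kallsyms")
    modules = read_or_none("/proc/modules")
    if kallsyms is None or modules is None:
        print("/proc/kallsyms or /proc/modules is not readable here")
    else:
        print("kvm loaded:", module_loaded(modules))
        for sym, found in visible_symbols(kallsyms).items():
            print(f"{sym}: {'found' if found else 'missing'}")
```

If `kvm` is loaded but both symbols are reported missing, that would point toward the kallsyms configuration rather than the module not being present.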

weewoo22 commented 2 years ago

After restarting and inserting the updated kernel module, it still fails to load in the same way as before:

[  134.646119] do_init_module: 'memflow'->init suspiciously returned 9, it should follow 0/-E convention
               do_init_module: loading module anyway...
[  134.646123] CPU: 2 PID: 3368 Comm: insmod Tainted: G           O      5.10.70 #1-NixOS

[  134.646124] Call Trace:
[  134.646130]  dump_stack+0x6b/0x83
[  134.646134]  do_init_module.cold+0x21/0x26
[  134.646136]  __do_sys_finit_module+0xb1/0x110
[  134.646139]  do_syscall_64+0x33/0x40
[  134.646140]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  134.646142] RIP: 0033:0x7fc4a31802a9
[  134.646144] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 97 3b 0c 00 f7 d8 64 89 01 48
[  134.646144] RSP: 002b:00007ffd872e53e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  134.646146] RAX: ffffffffffffffda RBX: 000000000188e7f0 RCX: 00007fc4a31802a9
[  134.646146] RDX: 0000000000000000 RSI: 000000000041d288 RDI: 0000000000000003
[  134.646147] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fc4a3248060
[  134.646148] R10: 0000000000000003 R11: 0000000000000246 R12: 000000000041d288
[  134.646148] R13: 0000000000000000 R14: 000000000188e750 R15: 0000000000000000

Setting CONFIG_KALLSYMS_ALL=y, as you have for the kernel on your system, allows the module to load successfully (memflow: initialized).
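
For anyone wanting to check this option before building the module, here is a small Python sketch that reads the running kernel's config, assuming it is exposed at `/proc/config.gz` as it is on this system. The function names are illustrative, not part of any existing tool.

```python
import gzip


def config_value(config_text: str, option: str):
    """Return the option's value ("y", "m", ...), or None if it is not set."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == f"# {option} is not set":
            return None
    return None


def read_running_config(path: str = "/proc/config.gz"):
    """Read the running kernel's config, or None if it is unavailable."""
    try:
        with gzip.open(path, "rt") as fh:
            return fh.read()
    except OSError:
        return None


if __name__ == "__main__":
    text = read_running_config()
    if text is None:
        print("/proc/config.gz is not available")
    else:
        print("CONFIG_KALLSYMS_ALL =", config_value(text, "CONFIG_KALLSYMS_ALL"))
```

Matching on `option + "="` keeps `CONFIG_KALLSYMS` from being confused with `CONFIG_KALLSYMS_ALL`, which a plain substring search would not distinguish.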

h33p commented 2 years ago

It could be that the git submodules were not updated. Could you check that?

h33p commented 2 years ago

And it's good to know that CONFIG_KALLSYMS_ALL is required. I will leave a note for now, but will look for a way around it in a future update.