Closed Polluktus closed 2 years ago
Let's keep everything archived on GitHub, rather than use pastebin, so here's a copy of dmesg taken from the pastebin above: dmesg.txt
Excerpts:
[ 0.000000] Linux version 5.16.2-polluktus-lkrg-test (root@Warsaw) (clang version 13.0.0, LLD 13.0.0) #1 SMP PREEMPT Fri Jan 21 22:45:45 CET 2022
[...]
[ 101.148106] [p_lkrg] Loading LKRG...
[ 101.164499] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 101.165709] OOM killer disabled.
[ 104.888290] [p_lkrg] [kretprobe] register_kretprobe() for <ovl_create_or_link> failed! [err=-2]
[ 104.888295] [p_lkrg] Can't hook 'ovl_create_or_link' function. This is expected if you are not using OverlayFS.
[ 105.840787] [p_lkrg] LKRG initialized successfully!
[ 105.840789] OOM killer enabled.
[ 105.840790] Restarting tasks ...
[ 105.841107] [p_lkrg] <Exploit Detection> process[4236 | IndexedDB #6] has corrupted 'off' flag!
[...]
[ 105.841162] [p_lkrg] <Exploit Detection> process[4236 | IndexedDB #6] has corrupted 'off' flag!
[ 105.841207] done.
[ 105.841376] [p_lkrg] <Exploit Detection> process[6694 | dwmblocks] has corrupted 'off' flag!
[ 105.841384] [p_lkrg] <Exploit Detection> process[6694 | sh] has corrupted 'off' flag!
[...]
[ 106.425733] [p_lkrg] <Exploit Detection> process[3897 | dbus-daemon] has corrupted 'off' flag!
[ 106.427659] [p_lkrg] <Exploit Detection> process[6709 | playerctl] has corrupted 'off' flag!
[ 106.429774] [p_lkrg] <Exploit Detection> process[6710 | awk] has corrupted 'off' flag!
[ 106.429787] [p_lkrg] <Exploit Detection> process[6710 | awk] has corrupted 'off' flag!
[ 106.429794] [p_lkrg] <Exploit Detection> process[6710 | awk] has corrupted 'off' flag!
[ 106.861689] [p_lkrg] <Exploit Detection> Detected pointer swapping attack!process[6715 | doas] has different 'cred' pointer
[ 106.861699] [p_lkrg] <Exploit Detection> Detected pointer swapping attack!process[6715 | doas] has different 'real_cred' pointer
[ 106.861702] [p_lkrg] <Exploit Detection> process[6715 | doas] has different EUID! 1000 vs 0
[ 106.861707] [p_lkrg] <Exploit Detection> process[6715 | doas] has different SUID! 1000 vs 0
[ 106.861710] [p_lkrg] <Exploit Detection> process[6715 | doas] has different EUID! 1000 vs 0
[ 106.861712] [p_lkrg] <Exploit Detection> process[6715 | doas] has different SUID! 1000 vs 0
[ 106.861715] [p_lkrg] <Exploit Detection> process[6715 | doas] has different FSUID! 1000 vs 0
[ 106.861729] [p_lkrg] <Exploit Detection> process[6715 | doas] has corrupted 'off' flag!
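To see at a glance which processes are affected, the detection lines in a log like the one above can be tallied per process. The following is a minimal sketch, not part of LKRG; the function name `tally_detections` and the regular expression are illustrative assumptions based only on the line format shown in these excerpts.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts above, e.g.
# "[ 106.429774] [p_lkrg] <Exploit Detection> process[6710 | awk] has corrupted 'off' flag!"
# The lazy ".*?" also covers lines where text precedes "process[".
DETECTION = re.compile(
    r"\[p_lkrg\] <Exploit Detection>.*?process\[(\d+) \| ([^\]]+)\]"
)

def tally_detections(dmesg_text):
    """Count LKRG detection lines per (pid, process name) pair."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = DETECTION.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts
```

It can be run over a saved log, for example `tally_detections(open("dmesg.txt").read()).most_common()`.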
Quoting Adam's request in #106, which is also applicable here:
we have introduced a P_LKRG_TASK_OFF_DEBUG compilation option which helps to debug issues like that. Can you enable such option, recompile LKRG and re-run your tests invoking the described problem and share the logs? This option can be enabled in the src/modules/print_log/p_lkrg_log_level_shared.h file (un-comment line 31)
My guess, though, is this time it's an effect of LTO combined with our attempted hooking of kernel-internal functions. Perhaps LTO made it so that our hooks are not in all the right places. Perhaps building the kernel without LTO would help.
Also quoting @Polluktus in #135:
I've built the kernel locally on Gentoo with my own custom config, custom patches, some changes yanked from 5.17, and Clang LTO with O3 and march=native.
I labeled this "question" for now, although I feel this is also mid-way between "bug" and "portability". It isn't exactly a code bug that we hook kernel-internal functions, but it is indeed not ideal that we (have to) do that.
Yes, you were right: with CONFIG_LTO_NONE=y the problem is fixed.
[ 87.798343] p_lkrg: loading out-of-tree module taints kernel.
[ 87.865313] [p_lkrg] Loading LKRG...
[ 87.880516] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 87.881736] OOM killer disabled.
[ 91.209411] [p_lkrg] [kretprobe] register_kretprobe() for <ovl_create_or_link> failed! [err=-2]
[ 91.209415] [p_lkrg] Can't hook 'ovl_create_or_link' function. This is expected if you are not using OverlayFS.
[ 92.059147] [p_lkrg] LKRG initialized successfully!
[ 92.059149] OOM killer enabled.
[ 92.059149] Restarting tasks ... done.
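Before loading LKRG on a custom kernel, it may help to confirm which LTO mode the kernel was built with. The sketch below parses the text of a kernel config and reports it. The helper name `lto_mode` is hypothetical. Only CONFIG_LTO_NONE appears in this thread; CONFIG_LTO_CLANG_FULL and CONFIG_LTO_CLANG_THIN are the mainline option names as I understand them, so treat them as assumptions for your kernel version.

```python
def lto_mode(config_text):
    """Return 'none', 'full', 'thin', or 'unknown' for a kernel config body."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_LTO_NONE is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value == "y":
            enabled.add(key)
    # Option names below other than CONFIG_LTO_NONE are assumed, see lead-in.
    if "CONFIG_LTO_CLANG_FULL" in enabled:
        return "full"
    if "CONFIG_LTO_CLANG_THIN" in enabled:
        return "thin"
    if "CONFIG_LTO_NONE" in enabled:
        return "none"
    return "unknown"
```

The config text would typically come from the `.config` file in the kernel build tree; pass its contents as a string, for example `lto_mode(open(".config").read())`.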
Later I will paste the output with LTO + P_LKRG_TASK_OFF_DEBUG.
Hm... Looks like some calls to the override_creds function have been inlined, so the hook never fires and we get a false positive (FP). I'm not sure we should spend more time on this, at least for now. Looks like LTO will be problematic...
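The failure mode described here can be illustrated with a toy model. This is not LKRG code and does not reflect its internals; every name below (`Task`, `shadow`, `track`, `check`, and the two callers) is hypothetical. It only shows why a monitor that keeps its own copy of credentials, refreshed by a hook on a function's entry, reports a mismatch when the compiler inlines that function into a caller.

```python
class Task:
    """Stand-in for a process with a single credential field."""
    def __init__(self, pid, euid):
        self.pid = pid
        self.euid = euid

shadow = {}  # monitor's private copy: pid -> expected euid

def track(task):
    """Start monitoring a task by recording its current credential."""
    shadow[task.pid] = task.euid

def override_creds(task, new_euid):
    """Out-of-line function; the first line models the hook on its entry."""
    shadow[task.pid] = new_euid   # hook fires: shadow copy is refreshed
    task.euid = new_euid          # the actual credential change

def caller_out_of_line(task):
    # Calls through the hooked entry point, so the shadow copy stays in sync.
    override_creds(task, 0)

def caller_inlined(task):
    # Models an inlined call site: only the function body was copied here,
    # so the hook never runs and the shadow copy goes stale.
    task.euid = 0

def check(task):
    """Integrity check: does the live credential match the shadow copy?"""
    return shadow[task.pid] == task.euid

a = Task(1, 1000); track(a); caller_out_of_line(a)
b = Task(2, 1000); track(b); caller_inlined(b)
print(check(a), check(b))
```

In this model the second task fails the check even though nothing malicious happened, which corresponds to the "has different EUID! 1000 vs 0" messages in the log.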
Closing this issue for now... @solardiz any objections?
I think in the future this should be revisited at least. While I do not use LTO at the moment, I will likely end up building my kernels with Clang's CFI as well once it becomes available for x86_64. It would be nice to be able to continue using LKRG once that happens.
On that note, I wonder if compiling LKRG as a builtin would work around the LTO issue?
Closing this issue for now... @solardiz any objections?
No objections, I think it's a WONTFIX for now. I'll close.
I wonder if compiling LKRG as a builtin would workaround the LTO issue?
Currently, no, because we use the same hooking mechanism even when LKRG is linked in. We could avoid that, but then we'd need to implement and maintain two kinds of hooking.
Following @solardiz's advice, I created a new issue after the discussion in #135; it seems similar to #30 and #106.
Kernel: custom 5.16.2
LKRG: 0.9.2
My steps:
dmesg: https://pastebin.com/meTDESak
I know that troubleshooting a custom kernel may be hard and almost impossible to reproduce, so if you decide to close this issue, I will understand.