lkrg-org / lkrg

Linux Kernel Runtime Guard
https://lkrg.org

Built-in LKRG on 6.1 Prevents Automatic Loading of RootFS Module at Boot #258

Closed sempervictus closed 1 year ago

sempervictus commented 1 year ago

When building LKRG into the kernel to permit trimming unused ksyms, booting Arch Linux fails at the FS mount with:

:: running early hook [udev]
Starting version 251.7-1-arch
:: running hook [udev]
:: Triggering uevents...
:: performing fsck on '/dev/vda1'
/dev/vda1: clean, 80060/2621440 files, 761980/10485499 blocks
:: mounting '/dev/vda1' on real root
[    3.030935][  T257] LKRG: ALERT: BLOCK: UMH: Executing program name /usr/bin/modprobe
mount: /new_root: unknown filesystem type 'ext4'.
       dmesg(1) may have more information after failed mount system call.
You are now being dropped into an emergency shell.
sh: can't access tty; job control turned off
[rootfs ]# 

despite use of lkrg.profile_enforce=0 and lkrg.profile_validate=0 on the kernel command line. Issuing a mount for the root FS results in:

[rootfs ]# mount /dev/vda1 /root/
[  135.459467][  T264] LKRG: ALERT: BLOCK: UMH: Executing program name /usr/bin/modprobe
mount: /root: unknown filesystem type 'ext4'.
       dmesg(1) may have more information after failed mount system call.

Manually running modprobe ext4 works from the init shell, which then permits me to mount /dev/vda1 on /root.

LKRG was pulled in from upstream yesterday evening, so this is the current revision. The 6.1 kernel is built w/ the linux-hardened patchset on GCC 12.2.

Adam-pi3 commented 1 year ago

Looks like /usr/bin/modprobe is not in the UMH allow list. Can you try setting lkrg.umh_enforce=0 and lkrg.umh_validate=0 and verify if it boots? If yes, we can add /usr/bin/modprobe to the allow list.
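For readers following along, here is a minimal sketch of why a path such as /usr/bin/modprobe can be blocked while /sbin/modprobe is accepted. It assumes the allow list is checked by exact string comparison against absolute paths; the array `umh_allow_example` and the function `umh_path_allowed` are hypothetical names for illustration and are not LKRG's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allow list for illustration only; not LKRG's real table. */
static const char *const umh_allow_example[] = {
    "/sbin/hotplug",
    "/sbin/modprobe",
    "/usr/sbin/modprobe",
};

/*
 * Return 1 if `path` exactly matches an allow-list entry, 0 otherwise.
 * Assumption: matching is an exact comparison, so the same binary reached
 * through a different absolute path is treated as a different program.
 */
int umh_path_allowed(const char *path)
{
    size_t n = sizeof(umh_allow_example) / sizeof(umh_allow_example[0]);

    assert(path != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(path, umh_allow_example[i]) == 0)
            return 1;
    }
    return 0;
}
```

With this sketch, `umh_path_allowed("/sbin/modprobe")` returns 1 and `umh_path_allowed("/usr/bin/modprobe")` returns 0, which mirrors the symptom in the boot log above.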

sempervictus commented 1 year ago

Roger, wilco - the test environment w/ that built-in kernel is toast, but I'll get another built-in one set up once I wrap up some other tasks (including LLVM 16's kCFI+LTO validation).

sempervictus commented 1 year ago

@Adam-pi3 - in #259 I'm seeing squashfs not being loaded while loop seems to be fine: [image] ^^ is after LKRG died during init though, and since it can't be unloaded, it might be getting stuck there.

Any chance the kernel command line flags are ignored when built-in (or when crashing on init)? Also noticed that Debian's init environment has that modprobe binary @ /usr/sbin/modprobe vs Arch's /usr/bin/modprobe
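Since the helper path differs between distributions, it can help to check which path a given system is configured to use. On Linux the kernel's module helper path is exposed through the `kernel.modprobe` sysctl at /proc/sys/kernel/modprobe. The sketch below reads such a file; `read_helper_path` is a hypothetical helper name, and the file path is passed in as a parameter so the function can also be pointed at a test file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysctl-style file into buf and strip the
 * trailing newline. Returns 0 on success, -1 if the file cannot be
 * opened or is empty. Hypothetical helper for illustration.
 */
int read_helper_path(const char *sysctl_file, char *buf, size_t buflen)
{
    FILE *f;

    assert(sysctl_file != NULL && buf != NULL && buflen > 0);

    f = fopen(sysctl_file, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)buflen, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Cut the string at the first newline, if any. */
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Calling `read_helper_path("/proc/sys/kernel/modprobe", buf, sizeof buf)` from the init shell's environment would show the path that needs to be on the allow list for that system.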

sempervictus commented 1 year ago

I think that's all related to the kCFI vs kprobes thing, will verify that

diff --git a/security/lkrg/modules/exploit_detection/syscalls/p_call_usermodehelper/p_call_usermodehelper.c b/security/lkrg/modules/exploit_detection/syscalls/p_call_usermodehelper/p_call_usermodehelper.c
index 51536d070de9..cb26c0703c09 100644
--- a/security/lkrg/modules/exploit_detection/syscalls/p_call_usermodehelper/p_call_usermodehelper.c
+++ b/security/lkrg/modules/exploit_detection/syscalls/p_call_usermodehelper/p_call_usermodehelper.c
@@ -50,6 +50,8 @@ static const char * const p_umh_global[] = {
    "/sbin/drbdadm",
    "/sbin/hotplug",
    "/sbin/modprobe",
+   "/usr/bin/modprobe",
+   "/usr/sbin/modprobe",
    "/sbin/nfs_cache_getent",
    "/sbin/nfsd-recall-failed",
    "/sbin/nfsdcltrack",

addresses the UMH block next time I build things w/ GCC

Adam-pi3 commented 1 year ago

You are hitting multiple problems and they should be isolated and addressed individually. Yes, kCFI is one of them, LTO is another, UMH is another, etc. Btw. /usr/sbin/modprobe is already on the allow list: https://github.com/lkrg-org/lkrg/blob/main/src/modules/exploit_detection/syscalls/p_call_usermodehelper/p_call_usermodehelper.c#L67

sempervictus commented 1 year ago

To quote Homer - "Doh!" I noticed the first entry and put mine after it, will remove the redundant sbin one. Thanks. I see, so the kCFI and LTO problems are separate - neat. Once I get another GCC build run (sans LTO and kCFI), I will be able to verify the fix.

GCC can't be far behind in offering LTO and CFI for the kernel (even if they snagged the last public RAP, they'd do a lot better than this) - any thoughts on how to move LKRG forward in the over-optimizing future toward which we're all going?