lkrg-org / lkrg

Linux Kernel Runtime Guard
https://lkrg.org

Module KOBJ list hash changed unexpectedly on mkosi-boot (bionic) #212

Closed: solardiz closed this issue 1 year ago

solardiz commented 1 year ago

This just happened in my fork of the repo, even though the same test had succeeded here, as well as in my fork on other occasions:

https://github.com/solardiz/lkrg/runs/7434862217?check_suite_focus=true

Ubuntu 18.04 LTS localhost -
localhost login: root (automatic login)
Last login: Wed Jul 20 17:46:39 UTC 2022 on tty1
[   40.494112] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
Welcome to Ubuntu 18.04 LTS (GNU/Linux 4.15.0-189-generic x86_64)
 * Documentation:  https://help.ubuntu.com/
 * Management:     https://landscape.canonical.com/
 * Support:        https://ubuntu.com/advantage
[   43.252924] LKRG: ALERT: DETECT: Kernel: Module KOBJ list hash changed unexpectedly
[   43.253372] LKRG: ALERT: DETECT: Kernel: 1 checksums changed unexpectedly
ABORT
[   43.253658] LKRG: ALERT: BLOCK: Kernel: 1 checksums changed unexpectedly
[   43.254063] Kernel panic - not syncing: Kernel: 1 checksums changed unexpectedly
[   43.254637] CPU: 0 PID: 30 Comm: kworker/u2:1 Tainted: G           OE    4.15.0-189-generic #200-Ubuntu
[   43.255047] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[   43.255800] Workqueue: events_unbound p_check_integrity [lkrg]
[   43.256191] Call Trace:
[   43.257371]  dump_stack+0x6d/0x8b
[   43.257602]  panic+0xe4/0x247
[   43.257774]  p_check_integrity+0x1830/0x18f0 [lkrg]
[   43.258088]  process_one_work+0x1de/0x420
[   43.258347]  worker_thread+0x32/0x410
[   43.258587]  kthread+0x121/0x140
[   43.258818]  ? process_one_work+0x420/0x420
[   43.259091]  ? kthread_create_worker_on_cpu+0x70/0x70
[   43.259414]  ret_from_fork+0x35/0x40
solardiz commented 1 year ago

This may be a side effect of a shortcut I introduced:

   if (P_CTRL(p_log_level) < P_LOG_WATCH)
      goto skip_db_checks;

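The rationale was that the now-skipped checks would only produce WATCH severity messages (except for one FAULT in P_PRINT_FOUND_MORE, which I considered unimportant). However, those code paths can also do a p_mod_bad_nr++; and we later use if (!p_mod_bad_nr) as one of the conditions for deciding whether to report a discrepancy like the one seen here. With the checks skipped at lower log levels, p_mod_bad_nr stays at zero even when a module change would have been accounted for, so the discrepancy can be reported when it otherwise would not have been.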

So I'll drop the skipping.
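
To make the interaction concrete, here is a minimal standalone sketch of the logic described above. It is not LKRG's actual code: the enum values, struct check_result, run_check() and its parameters are all hypothetical stand-ins, and it assumes the report happens only when the counter is still zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the log levels; the values are illustrative only. */
enum log_level { LOG_ALERT = 1, LOG_WATCH = 4 };

struct check_result {
    unsigned int mod_bad_nr;  /* discrepancies counted by the per-module checks */
    bool report_hash_change;  /* whether the list hash change gets reported */
};

/*
 * hash_changed:      the module list hash differs from the stored one
 * explained_changes: discrepancies the per-module checks would find
 * skip_when_quiet:   model the "goto skip_db_checks" shortcut
 */
static struct check_result run_check(int log_level, bool hash_changed,
                                     unsigned int explained_changes,
                                     bool skip_when_quiet)
{
    struct check_result r = { 0, false };

    assert(log_level >= LOG_ALERT);

    if (!(skip_when_quiet && log_level < LOG_WATCH)) {
        /* These checks only log at WATCH severity, but they also count. */
        r.mod_bad_nr += explained_changes;
    }

    /* Later decision: report only if no per-module check accounted for it. */
    if (hash_changed && !r.mod_bad_nr)
        r.report_hash_change = true;

    return r;
}
```

In this model, run_check(LOG_ALERT, true, 1, true) reports the hash change because the counting was skipped along with the logging, while the same call with skip_when_quiet set to false does not. That is the side effect suspected above, and it is why dropping the skip restores the earlier behavior.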