linux-application-whitelisting / fapolicyd

File Access Policy Daemon
GNU General Public License v3.0
192 stars 55 forks source link

Protect system from deadly signals received by fapolicyd #273

Closed rmetrich closed 9 months ago

rmetrich commented 10 months ago

If for some reason, fapolicyd raises or receives a deadly signal, such as SIGBUS, the kernel task spawning the coredump handler (e.g. systemd-coredump) will hang waiting for fapolicyd to acknowledge the execution, which cannot happen since fapolicyd itself is dying.

This patch fixes the issue through unregistering from fanotify.

Example of deadlock without this patch:

[root@vm-fapolicy9 ~]# kill -SEGV 2331

[  757.531448] sysrq: Show Blocked State

[...]

[  757.544570] task:fapolicyd       state:D stack:    0 pid: 2331 ppid:     1 flags:0x00004002
[  757.545389] Call Trace:
[  757.545845]  <TASK>
[  757.546260]  __schedule+0x248/0x620
[  757.546760]  ? select_idle_sibling+0x28/0x520
[  757.547333]  schedule+0x2d/0x60
[  757.547810]  schedule_timeout+0x11d/0x160
[  757.548353]  ? try_to_wake_up+0x1e5/0x530
[  757.548870]  ? cn_esc_printf+0x5c/0x110
[  757.549597]  __wait_for_common+0x90/0x1d0
[  757.550123]  ? usleep_range_state+0x90/0x90
[  757.550627]  call_usermodehelper_exec+0x148/0x180
[  757.551212]  do_coredump+0x240/0x8a0
[  757.551701]  get_signal+0x95f/0xa00
[  757.552174]  arch_do_signal_or_restart+0x25/0x100
[  757.552742]  ? restore_altstack+0x4c/0x80
[  757.553269]  exit_to_user_mode_loop+0x9c/0x130
[  757.553816]  exit_to_user_mode_prepare+0xb6/0x100
[  757.554384]  syscall_exit_to_user_mode+0x12/0x30
[  757.554933]  do_syscall_64+0x69/0x90
[  757.555419]  ? syscall_exit_work+0x11a/0x150
[  757.555946]  ? syscall_exit_to_user_mode+0x12/0x30
[  757.556511]  ? do_syscall_64+0x69/0x90
[  757.557005]  ? syscall_exit_to_user_mode+0x12/0x30
[  757.557576]  ? do_syscall_64+0x69/0x90
[  757.558121]  ? do_syscall_64+0x69/0x90
[  757.558630]  ? do_syscall_64+0x69/0x90
[  757.559228]  ? do_syscall_64+0x69/0x90
[  757.559699]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  757.560277] RIP: 0033:0x7f7197f426ef
[  757.560822] RSP: 002b:00007ffe4ff0ca60 EFLAGS: 00000293
[  757.561393] RAX: fffffffffffffffc RBX: 0000000000000001 RCX: 00007f7197f426ef
[  757.562105] RDX: 00000000ffffffff RSI: 0000000000000002 RDI: 00007ffe4ff0cb70
[  757.562811] RBP: 00005643d519cb03 R08: 0000000000000000 R09: 0000000000000000
[  757.563528] R10: 0000000000000008 R11: 0000000000000293 R12: 00005643d519cb1f
[  757.564279] R13: 00007ffe4ff0cb70 R14: 00007ffe4ff0cbc0 R15: 00007ffe4ff0cb00
[  757.565000]  </TASK>

[...]

[  757.611802] task:kworker/u4:4    state:D stack:    0 pid: 2387 ppid:     2 flags:0x00004000
[  757.612559] Call Trace:
[  757.612969]  <TASK>
[  757.613348]  __schedule+0x248/0x620
[  757.613827]  schedule+0x2d/0x60
[  757.614292]  fanotify_get_response.constprop.0+0xe2/0x1d0
[  757.614867]  ? cpuacct_percpu_seq_show+0x10/0x10
[  757.615426]  fanotify_handle_event+0x338/0x380
[  757.615964]  send_to_group+0xe1/0x250
[  757.616454]  ? post_alloc_hook+0xb6/0xd0
[  757.616949]  fsnotify+0x273/0x3a0
[  757.617414]  __fsnotify_parent+0xff/0x300
[  757.617912]  ? fsnotify_perm.part.0+0xd2/0x150
[  757.618441]  fsnotify_perm.part.0+0xd2/0x150
[  757.618954]  do_dentry_open+0xea/0x370
[  757.619431]  do_open+0x21a/0x3d0
[  757.619873]  path_openat+0x10f/0x2b0
[  757.620359]  do_filp_open+0xb2/0x160
[  757.620827]  ? __get_user_pages_remote+0xda/0x340
[  757.621365]  ? __kmalloc+0x191/0x360
[  757.621823]  do_open_execat+0x64/0x1b0
[  757.622283]  bprm_execve.part.0+0xe9/0x220
[  757.622774]  kernel_execve+0x16c/0x1c0
[  757.623259]  call_usermodehelper_exec_async+0xd1/0x140
[  757.623806]  ? call_usermodehelper_exec_work+0xb0/0xb0
[  757.624394]  ret_from_fork+0x1f/0x30
[  757.624889]  </TASK>
stevegrubb commented 9 months ago

Thanks for the patch. Renaming things that don't need to be introduces more code churn. (does signum really need renaming, etc?) But I have run across this problem developing the code and yes, if it can be fixed should make people happier. But this also goes to an older discussion of should fapolicyd fail open or closed?

rmetrich commented 9 months ago

Thanks for the patch. Renaming things that don't need to be introduces more code churn. (does signum really need renaming, etc?) But I have run across this problem developing the code and yes, if it can be fixed should make people happier. But this also goes to an older discussion of should fapolicyd fail open or closed?

I can remove this renaming of signum, that was just be to aligned with the other handlers actually

stevegrubb commented 9 months ago

OK, I'll merge it. It's best to separate bug fix and cosmetic fixes for bisecting purposes.