SELinuxProject / selinux-kernel

GitHub mirror of the SELinux kernel repository
https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git
Other
148 stars 56 forks source link

BUG: selinuxfs removes its directory entries in an unsafe way, causing soft lockups when tasks read selinuxfs directories #42

Closed WOnder93 closed 4 years ago

WOnder93 commented 5 years ago

Letting the following set of commands run long enough on a multi-core machine causes soft lockups in the kernel:

(cd /sys/fs/selinux/; while true; do find >/dev/null 2>&1; done) &
(cd /sys/fs/selinux/; while true; do find >/dev/null 2>&1; done) &
(cd /sys/fs/selinux/; while true; do find >/dev/null 2>&1; done) &

while true; do load_policy; echo -n .; sleep 0.1; done

This happens on the upstream kernel, as well as on selinux/next.

The problem appears to be that sel_remove_entries calls d_genocide to remove the whole contents of certain subdirectories in /sys/fs/selinux/. This function is apparently only intended for removing filesystem entries that are no longer accessible to userspace, because it doesn't follow the rule that any code removing entries from a directory must hold the lock on the directory's inode RW semaphore (formerly this was a mutex, see 9902af79c01a8e39bb99b922fa3eef6d4ea23d69).

Note that before commit ad52184b705c1048aa01225eccde119ef5c93000, SELinux used its own open-coded functions for removing entries, but these also had the same bug (they were not locking the directory inodes).

I think the best way to fix this will be to open code sel_remove_entries to remove the entries in a proper and robust way (similar to how it was done before ad52184b705c1048aa01225eccde119ef5c93000 but with locking of the parent inode before removal). Either way. it looks like a really bad idea to call d_genocide on a tree that is mounted in userspace (no parent inode locks, no fsnotify events, ...). Based on its usage it looks like an internal function that is not at all designed for this pupose: https://elixir.bootlin.com/linux/latest/ident/d_genocide

I have a patch prepared that seems to work and passes the above stress test. Let me polish it up and I'll post it to the list for review.

stephensmalley commented 5 years ago

Be sure to cc viro and perhaps linux-fsdevel too to seek input from the fs developers. I think the selinuxfs code related to removing entries was originally based on some code from proc, but possibly it was never fully safe.

WOnder93 commented 5 years ago

v1 patch posted: https://www.mail-archive.com/selinux@tycho.nsa.gov/msg07825.html

pcmoore commented 5 years ago

I believe we've resolved this now upstream, haven't we @WOnder93? I'm closing this out, but please reopen if you think this is still a problem.

stephensmalley commented 5 years ago

I don't think this is fixed yet. As I recall, we need to fundamentally restructure policy load logic, splitting it up into multiple phases, and then we can restructure selinuxfs code too. Ties in with the discussion of converting policy rwlock to RCU.

WOnder93 commented 5 years ago

Yep, this is not fixed yet (unless I missed something). I don't seem to have permissions to reopen, so @pcmoore, you'll have to do that yourself :)

pcmoore commented 5 years ago

Sorry guys, I thought we had addressed this via some of the other patches - my mistake. Reopened.

WOnder93 commented 5 years ago

Posted v2: https://lore.kernel.org/selinux/20190801140243.24080-1-omosnace@redhat.com/T/

pcmoore commented 5 years ago

@WOnder93 I'm assigning this to you (in the GitHub sense of the word) since you've been working on this so far.

WOnder93 commented 4 years ago

Turns out this bug has been fixed recently by Al Viro in commit torvalds/linux@d4f4de5e5ef8efde85febb6876cd3c8ab1631999. After applying it I can no longer reproduce the issue.