Closed WOnder93 closed 4 years ago
Be sure to cc viro and perhaps linux-fsdevel too to seek input from the fs developers. I think the selinuxfs code related to removing entries was originally based on some code from proc, but possibly it was never fully safe.
v1 patch posted: https://www.mail-archive.com/selinux@tycho.nsa.gov/msg07825.html
I believe we've resolved this now upstream, haven't we @WOnder93? I'm closing this out, but please reopen if you think this is still a problem.
I don't think this is fixed yet. As I recall, we need to fundamentally restructure policy load logic, splitting it up into multiple phases, and then we can restructure selinuxfs code too. Ties in with the discussion of converting policy rwlock to RCU.
Yep, this is not fixed yet (unless I missed something). I don't seem to have permissions to reopen, so @pcmoore, you'll have to do that yourself :)
Sorry guys, I thought we had addressed this via some of the other patches - my mistake. Reopened.
@WOnder93 I'm assigning this to you (in the GitHub sense of the word) since you've been working on this so far.
Turns out this bug has been fixed recently by Al Viro in commit torvalds/linux@d4f4de5e5ef8efde85febb6876cd3c8ab1631999. After applying it I can no longer reproduce the issue.
Letting the following set of commands run long enough on a multi-core machine causes soft lockups in the kernel:
This happens on the upstream kernel, as well as on selinux/next.
The problem appears to be that
sel_remove_entries
callsd_genocide
to remove the whole contents of certain subdirectories in/sys/fs/selinux/
. This function is apparently only intended for removing filesystem entries that are no longer accessible to userspace, because it doesn't follow the rule that any code removing entries from a directory must hold the lock on the directory's inode RW semaphore (formerly this was a mutex, see 9902af79c01a8e39bb99b922fa3eef6d4ea23d69).Note that before commit ad52184b705c1048aa01225eccde119ef5c93000, SELinux used its own open-coded functions for removing entries, but these also had the same bug (they were not locking the directory inodes).
I think the best way to fix this will be to open code
sel_remove_entries
to remove the entries in a proper and robust way (similar to how it was done before ad52184b705c1048aa01225eccde119ef5c93000 but with locking of the parent inode before removal). Either way. it looks like a really bad idea to calld_genocide
on a tree that is mounted in userspace (no parent inode locks, no fsnotify events, ...). Based on its usage it looks like an internal function that is not at all designed for this pupose: https://elixir.bootlin.com/linux/latest/ident/d_genocideI have a patch prepared that seems to work and passes the above stress test. Let me polish it up and I'll post it to the list for review.