netblue30 / firejail

Linux namespaces and seccomp-bpf sandbox
https://firejail.wordpress.com
GNU General Public License v2.0

RUNUSER should default to $XDG_RUNTIME_DIR #4535

Open crocket opened 3 years ago

crocket commented 3 years ago

Right now, RUNUSER is /run/user/user-id. But, pam_rundir.so sets XDG_RUNTIME_DIR to /run/users/user-id.

Thus, RUNUSER fails to capture XDG_RUNTIME_DIR.

RUNUSER should only be set to /run/user/user-id if there is no XDG_RUNTIME_DIR.
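
A minimal sketch of the requested fallback, written in POSIX shell for illustration only. firejail itself is written in C, and the function name `resolve_runuser` is hypothetical, not something that exists in firejail.

```sh
#!/bin/sh
# Hypothetical helper, not part of firejail: prints the directory that
# ${RUNUSER} would expand to under the behaviour requested in this issue.
resolve_runuser() {
    if [ -n "${XDG_RUNTIME_DIR:-}" ]; then
        # Prefer whatever the login stack (PAM module, login manager) exported.
        printf '%s\n' "$XDG_RUNTIME_DIR"
    else
        # Fall back to the path that is currently hardcoded.
        printf '/run/user/%s\n' "$(id -u)"
    fi
}
```

With pam_rundir configured as described above, XDG_RUNTIME_DIR would be /run/users/user-id and the function would print that path; with the variable unset it would print /run/user/user-id.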

kmk3 commented 3 years ago

@crocket commented on Sep 12:

Right now, RUNUSER is /run/user/user-id. But, pam_rundir.so sets XDG_RUNTIME_DIR to /run/users/user-id.

Thus, RUNUSER fails to capture XDG_RUNTIME_DIR.

RUNUSER should only be set to /run/user/user-id if there is no XDG_RUNTIME_DIR.

Using XDG_RUNTIME_DIR sounds good to me.

But on what distro does that happen?

I only found a reference to XDG_RUNTIME_DIR on pam-related man pages on Arch's pam_systemd(8), which claims to use the default path:

On login, this module — in conjunction with systemd-logind.service — ensures the following:

1. If it does not exist yet, the user runtime directory /run/user/$UID is either created or mounted as new "tmpfs" file system with quota applied, and its ownership changed to the user that is logging in.

crocket commented 3 years ago

I set up pam_rundir manually because I use openrc instead of systemd. My login manager doesn't set XDG_RUNTIME_DIR.

pam_systemd sets XDG_RUNTIME_DIR to /run/user/id, but other PAM modules may not set XDG_RUNTIME_DIR to /run/user/id.

https://wiki.archlinux.org/title/XDG_Base_Directory says

XDG_RUNTIME_DIR

Not required to have a default value; warnings should be issued if not set or equivalents provided.

I think people should stick to standards instead of hardcoding what systemd does into their programs.

rusty-snake commented 3 years ago

I think people should stick to standards instead of hardcoding

In general yes, but with firejail it is a bit special since it must not trust an environment variable set by the users. There must be some sanitization.
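
One way to picture the kind of sanitization meant here is the sketch below. It is an assumption about what such checks might look like, not firejail code: the function name `runtime_dir_is_sane` is hypothetical, it relies on GNU coreutils `stat -c`, and the ownership and mode test mirrors the XDG Base Directory Specification's requirement that the directory be owned by the user with mode 0700. A path-based check like this is itself open to races, so a setuid program would more likely have to work on file descriptors.

```sh
#!/bin/sh
# Hypothetical check, not firejail code. Assumes GNU coreutils `stat -c`.
# Returns 0 only if $1 looks like a valid XDG runtime directory for the caller.
runtime_dir_is_sane() {
    dir=$1
    case $dir in
        /*) ;;                      # must be an absolute path
        *) return 1 ;;              # empty or relative values are rejected
    esac
    [ ! -L "$dir" ] || return 1     # reject symlinks
    [ -d "$dir" ] || return 1       # must be an existing directory
    # Owner must be the calling user and the mode must be exactly 0700.
    [ "$(stat -c '%u %a' "$dir")" = "$(id -u) 700" ]
}
```

A caller could then fall back to the hardcoded /run/user/$UID whenever the check fails, instead of using the value from the environment.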

crocket commented 3 years ago

I can't make pam_rundir or pam_systemd change what they set XDG_RUNTIME_DIR to. Another PAM module may set XDG_RUNTIME_DIR to a different value.

I think it's best to let PAM modules or something else decide the value of XDG_RUNTIME_DIR and make firejail set RUNUSER to XDG_RUNTIME_DIR when XDG_RUNTIME_DIR is available.

We can assume that when XDG_RUNTIME_DIR isn't set, there is no xdg runtime directory.

crocket commented 3 years ago

I maintain the Gentoo Linux package for pam_rundir, and I discovered that I could compile pam_rundir with /run/user/uid as its XDG_RUNTIME_DIR.

For now, I worked around the problem by modifying my own package.

smitsohu commented 3 years ago

If we teach Firejail to consider XDG_RUNTIME_DIR, we probably need to update --whitelist as well, because it is hardcoding /run/user/$UID as a whitelist top level directory and is one of the main consumers of ${RUNUSER}.

ScoreUnder commented 2 years ago

In general yes, but with firejail it is a bit special since it must not trust an environment variable set by the users. There must be some sanitization.

Isn't this kind of a game-over scenario anyway? (e.g. able to control environment = LD_PRELOAD arbitrary code). If it's not tenable, is there a way we could override it with a specific argument to firejail?

kmk3 commented 2 years ago

@ScoreUnder commented on Dec 26:

In general yes, but with firejail it is a bit special since it must not trust an environment variable set by the users. There must be some sanitization.

Isn't this kind of a game-over scenario anyway? (e.g. able to control environment = LD_PRELOAD arbitrary code). If it's not tenable, is there a way we could override it with a specific argument to firejail?

One scenario to keep in mind is a compromised program running outside of firejail (e.g.: on a separate user) trying to run firejail to gain root privileges, by e.g.: messing with paths and environment variables.

Assuming that firejail itself is not vulnerable to LD_PRELOAD attacks (which sounds like low-hanging fruit anyway), then allowing outside programs to control where firejail stores its runtime files seems dangerous.

I think that there is a way that does not involve environment variables; I'll post it in my next reply.

kmk3 commented 2 years ago

(Note: Most of this was written months ago)

Rambling

@crocket commented on Sep 12:

I set up pam_rundir manually because I use openrc instead of systemd. My login manager doesn't set XDG_RUNTIME_DIR.

You use seatd, right? So far I've only come across that and another login manager (other than (e)logind):

Though only the latter appears to set XDG_RUNTIME_DIR:

I also found a somewhat related discussion:

Misc: On Artix Linux, XDG_RUNTIME_DIR is set to /run/user/id by (I suppose) elogind and I've been wanting to try a simpler login manager, but I never got around to it. The main issue on Artix is that xorg-server depends on elogind (and also on dbus for some reason), though that might be solved by just changing the PKGBUILD and editing some init scripts.

pam_systemd sets XDG_RUNTIME_DIR to /run/user/id, but other PAM modules may not set XDG_RUNTIME_DIR to /run/user/id.

https://wiki.archlinux.org/title/XDG_Base_Directory says

XDG_RUNTIME_DIR Not required to have a default value; warnings should be issued if not set or equivalents provided.

I think people should stick to standards instead of hardcoding what systemd does into their programs.

Sure, but to be fair I think that this is more due to few people being aware of how these components connect together in practice rather than e.g.: going out of your way to deviate from the standard. Especially since AFAIK the most common login managers either set XDG_RUNTIME_DIR to /run/user/id or do not set it at all. For example, I don't use systemd and I always assumed that /run/user/id was actually the default in the spec, similarly to how ~/.config is the default path for XDG_CONFIG_HOME.

Not too long ago I had no idea that the login manager was the thing responsible for setting XDG_RUNTIME_DIR (in my mind it could have been done on e.g.: xinitrc). IIRC I only found this out because of the Void wiki:

And before this exchange, I didn't know that PAM was yet another part of the puzzle.

Questions

What exactly is it that manages the mounting of /run and /run/user/id? I thought that it was all done by the login manager, but from what I've seen now, it appears that pam does that and the login manager only sets the XDG_RUNTIME_DIR variable. Is that correct?

Is there any straightforward explanation of how the login process is supposed to work on Linux? Such as: what are the components/types of programs involved (with an example of each), what each one is responsible for, and in which order they are executed. Simplified example for init, with "type (example)":

@crocket commented on Sep 12:

I can't make pam_rundir or pam_systemd change what they set XDG_RUNTIME_DIR to. Another PAM module may set XDG_RUNTIME_DIR to a different value.

How does this work exactly? Can it be expected that there should be a single PAM module responsible for managing the run dir on a given system? Is it possible for there to be multiple PAM modules, with each using a different run dir at once?

Proposal

@crocket commented on Sep 12:

I think it's best to let PAM modules or something else decide the value of XDG_RUNTIME_DIR and make firejail set RUNUSER to XDG_RUNTIME_DIR when XDG_RUNTIME_DIR is available.

We can assume that when XDG_RUNTIME_DIR isn't set, there is no xdg runtime directory.

@rusty-snake commented on Sep 12:

I think people should stick to standards instead of hardcoding

In general yes, but with firejail it is a bit special since it must not trust an environment variable set by the users. There must be some sanitization.

@crocket commented on Sep 12:

I maintain the Gentoo Linux package for pam_rundir

Nice.

and I discovered that I could compile pam_rundir with /run/user as its XDG_RUNTIME_DIR.

For now, I worked around the problem by modifying my own package.

Considering what @rusty-snake said, I assume that the constraint is that the run dir has to be known by firejail at compile time. In which case, would it suffice to have a configure option for it?

I know that Debian patches autoconf 2.69 (with the "runstatedir" patch) to have a --with-runstatedir= option (related: #4595), and that at least autoconf 2.69 defaults to using /var/run (which seems to be the default outside of Linux AFAICT).

And as of autoconf 2.70, that option has been upstreamed. From autoconf's NEWS:

* Noteworthy changes in release 2.70 (2020-12-08) [stable]

[...]

** New features

*** Configure scripts now support a ‘--runstatedir’ option.

  This defaults to ‘${localstatedir}/run’.  It can be used, for
  instance, to place per-process temporary runtime files (such as pid
  files) into ‘/run’ instead of ‘/var/run’.

So how about we read --with-runstatedir at configure time and use its value instead of hardcoding /run?

crocket commented 2 years ago

firejail can read the value of XDG_RUNTIME_DIR when it starts.

kmk3 commented 2 years ago

@crocket commented on Dec 27:

firejail can read the value of XDG_RUNTIME_DIR when it starts.

See https://github.com/netblue30/firejail/issues/4535#issuecomment-1001532423.

crocket commented 2 years ago

I don't understand it.

kmk3 commented 2 years ago

@crocket commented on Dec 27:

I don't understand it.

Here is one example of a vulnerability that could come from allowing unprivileged users to specify an arbitrary runstate directory:

Let's say you have a separate user ("user2") to run a program foo, which is malicious or compromised, and that foo either runs outside of firejail or manages to escape it. Barring exploits unrelated to firejail, it can only affect what is owned by user2.

Then foo runs:

XDG_RUNTIME_DIR=~/myrun firejail --noprofile --private-etc /bin/bash

And modifies ~/myrun while it is being set up by firejail. It manages to put its own modified shadow file with a known root password into the temporary etc directory in ~/myrun (or it moves ~/myrun to ~/myrun2 and puts a different "myrun" dir in its place or whatever). Firejail then bind mounts the modified etc directory into /etc inside the sandbox. The program can now log in as root inside the sandbox. Since private-bin was not used, the real /bin is visible inside the sandbox. And since foo is root, it then can modify anything in the real /bin.

See also CVE-2021-26910, which similarly used firejail to gain root privileges. It exploited TOCTOU race conditions by messing with paths used by firejail. Also related:

crocket commented 2 years ago

You can't log in as root in firejail. user2 can also manipulate /run/user/uid-of-user2 in the same way? There is nothing that stops malware from modifying /run/user/uid-of-user. XDG_RUNTIME_DIR=~/myrun is not necessary. firejail doesn't protect you from malware running outside firejail.

kmk3 commented 2 years ago

@crocket commented on Dec 27:

You can't log in as root in firejail.

$ firejail --quiet --noprofile sudo su -
# whoami
root

Note: /etc/sudoers could be modified besides /etc/shadow, or foo could log in through su directly (i.e.: without sudo).

See also noroot.

Though it seems that even though /usr/bin has the same inode inside the sandbox, it is read-only:

$ firejail --quiet --noprofile sudo su -
# whoami
root
# touch /usr/bin/foobar
touch: cannot touch '/usr/bin/foobar': Read-only file system

So alternatively, as root in the sandbox you could modify the home dir of any user if running with --allusers.

user2 can also manipulate /run/user/uid-of-user2 in the same way? There is nothing that stops malware from modifying /run/user/uid-of-user.

But without root it cannot manipulate /run/user/uid-of-user1 nor /home/user1.

XDG_RUNTIME_DIR=~/myrun is not necessary. firejail doesn't protect you from malware running outside firejail.

Without root, the malware is limited to user2 in this scenario.

crocket commented 2 years ago

How is ~/myrun any different from /run/user/uid? /run/user/uid is just a tmpfs mount.

If malware has user rights, it can modify either ~/myrun or /run/user/uid.

ScoreUnder commented 2 years ago

@kmk3 just to add my own experience, the XDG runtime dir is created by systemd on arch, and on gentoo without systemd it is not created at all. To help with this issue I am using a script with a sudo exemption to create it for me:

.xprofile excerpt

if test -z "$XDG_RUNTIME_DIR"; then
    export XDG_RUNTIME_DIR=$(sudo create-runuser)
fi

/usr/local/sbin/create-runuser

#!/bin/sh
dir="/var/run/user/${SUDO_UID:?}"
mkdir -m 700 -p "$dir"
chmod 755 /var/run/user
chown "$SUDO_UID:$SUDO_GID" "$dir"
printf %s\\n "$dir"

I was previously creating it manually in /tmp without going through sudo at all, as described in the gentoo wiki. It wasn't a problem for me until I needed sound from firejail.

What exactly is it that manages the mounting of /run and /run/user/id?

It seems like /run might be something built into systemd and openrc. It's mounted very early for me. /run/user/ is created automatically on login with systemd (because PAM notifies systemd of the login), but with openrc there is no automatic mechanism. Under systemd, XDG_RUNTIME_DIR is also set during login by pam_systemd(8).


It manages to put its own modified shadow file with a known root password into the temporary etc directory in ~/myrun

I don't get this, why does firejail read from $XDG_RUNTIME_DIR to find /etc/shadow?

Or... does it put the temporary /etc into /var/run/user/? Because that vulnerability will be exploitable right now if so, as that resource's contents are fully owned and controlled by the user. I am currently on the same page as crocket regarding this, I think.

In my view, the purpose of the XDG runtime dir is as a sort of secure, disambiguated /tmp for the current user to use without worrying about clashing with other users. It's mostly used for sockets in practice. It isn't incredibly special as directories go and most programs work fine even if one doesn't exist. Special mention to pipewire which does not.

crocket commented 2 years ago

My point is that even if firejail doesn't read the value of XDG_RUNTIME_DIR, malware outside firejail can launch a firejail sandbox with a customized version of /etc/passwd. This looks like a privilege escalation issue.

kmk3 commented 2 years ago

@crocket commented on Dec 28:

How is ~/myrun any different from /run/user/uid? /run/user/uid is just a tmpfs mount.

If malware has user rights, it can modify either ~/myrun or /run/user/uid.

@crocket commented on Dec 28:

My point is that even if firejail doesn't read the value of XDG_RUNTIME_DIR, malware outside firejail can launch a firejail sandbox with a customized version of /etc/passwd. This looks like a privilege escalation issue.

Sorry for the confusion, in yesterday's comments I was thinking that the root run directory differed as well (e.g.: /run vs /var/run). And so that part of the request was to allow setting it to an arbitrary path, which would mean that /run/firejail could thus be set to an arbitrary path, which would not be a good idea since that is where firejail stores its runtime files and so it is not intended to be user-writable. However, the core of the issue appears to be just about specifying the value of the ${RUNUSER} macro in firejail profiles, which is intended to be a user-writable path, so my example does not apply; please ignore it.

@ScoreUnder commented on Dec 28:

@kmk3 just to add my own experience, the XDG runtime dir is created by systemd on arch, and on gentoo without systemd it is not created at all.

Which login manager are you using?

To help with this issue I am using a script with a sudo exemption to create it for me:

.xprofile excerpt

if test -z "$XDG_RUNTIME_DIR"; then
    export XDG_RUNTIME_DIR=$(sudo create-runuser)
fi

/usr/local/sbin/create-runuser

#!/bin/sh
dir="/var/run/user/${SUDO_UID:?}"
mkdir -m 700 -p "$dir"
chmod 755 /var/run/user
chown "$SUDO_UID:$SUDO_GID" "$dir"
printf %s\\n "$dir"

Nice, I had thought about doing something like that in case I started using seatd.

I was previously creating it manually in /tmp without going through sudo at all, as described in the gentoo wiki. It wasn't a problem for me until I needed sound from firejail.

What exactly is it that manages the mounting of /run and /run/user/id?

It seems like /run might be something built into systemd and openrc. It's mounted very early for me.

I see, so at least /run is not noticeably different then.

/run/user/ is created automatically on login with systemd (because PAM notifies systemd of the login), but with openrc there is no automatic mechanism. Under systemd, XDG_RUNTIME_DIR is also set during login by pam_systemd(8).

Artix does not have systemd but /run/user/id is created and XDG_RUNTIME_DIR is set by elogind I think. There are some shims for systemd and I'm not sure what exactly does what:

$ pacman -Q artix-archlinux-support elogind
artix-archlinux-support 2-1
elogind 246.10-5
$ pacman -Qlq artix-archlinux-support
/etc/
/etc/arch-release
/usr/
/usr/bin/
/usr/bin/systemd-sysusers
/usr/bin/systemd-tmpfiles
/usr/share/
/usr/share/libalpm/
/usr/share/libalpm/hooks/
/usr/share/libalpm/hooks/arch-repos-install.hook
/usr/share/libalpm/scripts/
/usr/share/libalpm/scripts/arch-repos-hook
$ pacman -Qlq elogind | grep pam
/etc/pam.d/
/etc/pam.d/elogind-user
/usr/lib/security/pam_elogind.so
/usr/share/factory/etc/pam.d/
/usr/share/factory/etc/pam.d/other
/usr/share/factory/etc/pam.d/system-auth
/usr/share/man/man8/pam_elogind.8.gz

But apparently it's pam_elogind:

pam_elogind(8)

```
DESCRIPTION
       pam_elogind registers user sessions with the elogind login manager and
       hence the elogind control group hierarchy. The module also applies
       various resource management and runtime parameters to the new session,
       as configured in the JSON User Record[1] of the user, when one is
       defined.

       On login, this module — in conjunction with elogind-logind.service —
       ensures the following:

        1. If it does not exist yet, the user runtime directory /run/user/$UID
           is either created or mounted as new "tmpfs" file system with quota
           applied, and its ownership changed to the user that is logging in.

[...]

ENVIRONMENT
       The following environment variables are initialized by the module and
       available to the processes of the user's session:

[...]

       $XDG_RUNTIME_DIR
           Path to a user-private user-writable directory that is bound to the
           user login time on the machine. It is automatically created the
           first time a user logs in and removed on the user's final logout.
           If a user logs in twice at the same time, both sessions will see
           the same $XDG_RUNTIME_DIR and the same contents. If a user logs in
           once, then logs out again, and logs in again, the directory
           contents will have been lost in between, but applications should
           not rely on this behavior and must be able to deal with stale
           files. To store session-private data in this directory, the user
           should include the value of $XDG_SESSION_ID in the filename. This
           directory shall be used for runtime file system objects such as
           AF_UNIX sockets, FIFOs, PID files and similar. It is guaranteed
           that this directory is local and offers the greatest possible file
           system feature set the operating system provides. For further
           details, see the XDG Base Directory Specification[3].
           $XDG_RUNTIME_DIR is not set if the current user is not the original
           user of the session.
```

It manages to put its own modified shadow file with a known root password into the temporary etc directory in ~/myrun

I don't get this, why does firejail read from $XDG_RUNTIME_DIR to find /etc/shadow?

Or... does it put the temporary /etc into /var/run/user/? Because that vulnerability will be exploitable right now if so, as that resource's contents are fully owned and controlled by the user. I am currently on the same page as crocket regarding this, I think.

My mistake; see above.

In my view, the purpose of the XDG runtime dir is as a sort of secure, disambiguated /tmp for the current user to use without worrying about clashing with other users. It's mostly used for sockets in practice. It isn't incredibly special as directories go and most programs work fine even if one doesn't exist. Special mention to pipewire which does not.

Yes, I also much prefer using /run/user/id compared to using something like /tmp.


Going back to the original issue, I think that the main concern is that the profile macros used for paths are currently both rather "static" and are also taken from places that are harder to change compared to environment variables (so they are harder to be used in attacks). Examples:

Source:

Either way, can you think of a scenario where $XDG_RUNTIME_DIR/.. would be created on a different path on the same distro? Because if not, wouldn't being able to set ${RUNUSER} to e.g.: /run/users at configure-time solve the issue? That might be the most straightforward fix.
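
To make the configure-time idea concrete, here is a small sketch that compares the parent of a runtime directory against a prefix fixed at build time. `RUNUSER_PREFIX` and `runuser_prefix_matches` are hypothetical names used only for illustration; `RUNUSER_PREFIX` stands in for whatever value a configure option would bake in.

```sh
#!/bin/sh
# Hypothetical: RUNUSER_PREFIX stands in for a value chosen at configure time.
RUNUSER_PREFIX=${RUNUSER_PREFIX:-/run/user}

# Returns 0 if the parent directory of $1 equals the configured prefix.
runuser_prefix_matches() {
    [ "$(dirname "$1")" = "$RUNUSER_PREFIX" ]
}
```

With the default prefix, /run/user/1000 matches and the pam_rundir path /run/users/1000 from this thread does not. Building with the prefix set to /run/users would flip that, which is the case a configure option could cover.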

crocket commented 2 years ago

pam_elogind or pam_rundir can create XDG_RUNTIME_DIR. Any PAM module can create XDG_RUNTIME_DIR upon login.

A different PAM module can set XDG_RUNTIME_DIR to a different value. It would be cumbersome to set RUNUSER at compile time unless you use gentoo linux. Most people use binary distributions.

ScoreUnder commented 2 years ago

While it would solve the OP's issue where there is a different prefix, I think the approach mentioned in the gentoo wiki which causes nondeterministic directory names would still be left unaccounted for.

Which login manager are you using?

On arch, just systemd-logind. On gentoo, nothing in particular.

kmk3 commented 2 years ago

@crocket commented on Dec 28:

pam_elogind or pam_rundir can create XDG_RUNTIME_DIR. Any PAM module can create XDG_RUNTIME_DIR upon login.

A different PAM module can set XDG_RUNTIME_DIR to a different value. It would be cumbersome to set RUNUSER at compile time unless you use gentoo linux. Most people use binary distributions.

How does one end up with different PAM modules that set XDG_RUNTIME_DIR to different paths (and presumably also create them) on the same distro? Is that a valid/desirable outcome?

If not, to me that sounds like something that should be solved at the distro packaging level. Unless people download binary PAM modules from third-party sources like for kernel drivers?

@ScoreUnder commented on Dec 28:

While it would solve the OP's issue where there is a different prefix, I think the approach mentioned in the gentoo wiki which causes nondeterministic directory names would still be left unaccounted for.

Could you elaborate (and provide the source)? Doesn't that imply building from source and thus being able to set the runuser dir at configure time?

Which login manager are you using?

On arch, just systemd-logind. On gentoo, nothing in particular.

Interesting, I didn't know that this was possible.

Kind of related to that, I just found a PAM module that apparently just creates and sets XDG_RUNTIME_DIR:

pam_rundir

```console
$ pacman -Sii pam_rundir
Repository      : world
Name            : pam_rundir
Version         : 1.2.0-1
Description     : PAM module to provide $XDG_RUNTIME_DIR
Architecture    : x86_64
URL             : https://gitea.artixlinux.org/artix/pam_rundir
Licenses        : GPL2+
Groups          : None
Provides        : None
Depends On      : pam
Optional Deps   : None
Required By     : None
Optional For    : None
Conflicts With  : None
Replaces        : None
Download Size   : 15.12 KiB
Installed Size  : 34.21 KiB
Packager        : Artix Build Bot
Build Date      : Wed 22 Sep 2021 02:28:35 PM -03
MD5 Sum         : 5118ef9713d14e4d2d9373848dfa8621
SHA-256 Sum     : 8f665b987fd4be2e9ed1e18debe5a7999a9f462723e3dc079798ddd1b2540ec7
Signatures      : 1247D995F165BBAC
```

From [README.md](https://gitea.artixlinux.org/artix/pam_rundir/src/branch/master/README.md):

> pam_rundir is a PAM module that can be used to provide user runtime
> directory, as described in the XDG Base Directory Specification.
>
> The directory will be created on login (open session) and removed on logout
> (close session), and its full path made available in an environment variable,
> usually `$XDG_RUNTIME_DIR`.
>
> This fork contains some changes of the original implementation for Artix
> Linux.

ScoreUnder commented 2 years ago

Could you elaborate (and provide the source)? Doesn't that imply building from source and thus being able to set the runuser dir at configure time?

Potentially yes, though gentoo packages often do not expose direct configuration like that. (It's mostly a set of binary on/off switches per package. Some packages like busybox and dwm have an option to read configuration from an existing file though.)

The configuration I was referring to in the last comment was included in the wiki here, halfway down this section: https://wiki.gentoo.org/wiki/PipeWire#Login_without_session_management

The method of creating an XDG_RUNTIME_DIR described in that article (and indeed, any secure method which does not require root) creates one in a nondeterministic location, so if the path were a compile-time option, firejail would need to be recompiled and reinstalled each time someone logs in.
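
As an illustration of why such a path cannot be known at build time, here is a minimal sketch of a root-free setup. It is not copied from the wiki article: the function name `make_private_runtime_dir` and the directory name template are assumptions.

```sh
#!/bin/sh
# Hypothetical root-free setup; the directory name template is an assumption.
make_private_runtime_dir() {
    # mktemp -d creates the directory atomically with mode 0700 and a random
    # suffix, so the resulting path differs on every call.
    mktemp -d "${TMPDIR:-/tmp}/runtime-$(id -u).XXXXXX"
}

# Only create a directory when the login stack did not provide one.
if [ -z "${XDG_RUNTIME_DIR:-}" ]; then
    XDG_RUNTIME_DIR=$(make_private_runtime_dir) || exit 1
    export XDG_RUNTIME_DIR
fi
```

Because the suffix is random, the only way for another program to learn the location is to read XDG_RUNTIME_DIR at run time.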

crocket commented 2 years ago

People can replace elogind with seatd which doesn't come with any PAM module that sets XDG_RUNTIME_DIR upon login.

Some people may install a PAM module like pam_rundir that sets XDG_RUNTIME_DIR upon login.

rusty-snake commented 2 years ago

(I didn't read the full discussion until here, was too long)

Since some things got mixed up and confused, I have a few things to clarify.

  1. This issue isn't about firejail's own run-state files, which are hardcoded to /run/firejail (or whatever is set at ./configure).
  2. This issue is about the expansion of the ${RUNUSER} macro, which in turn can expand to whatever we want.
  3. The issue is that
    1. /run/user/$UID is hardcoded in multiple places (x=0; for i in $(git grep -c -h "/run/user/%d" src); do x=$(( x + i)); done; echo $x prints 9).
    2. anything other than /run/user/$UID will not work with whitelist.
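
To illustrate what "expansion of the ${RUNUSER} macro" means for a profile line, here is a rough shell sketch. firejail does this in C; the function `expand_runuser` is hypothetical and takes the already-resolved directory as its second argument rather than deciding where it comes from.

```sh
#!/bin/sh
# Hypothetical illustration, not firejail code: replaces every literal
# ${RUNUSER} in a profile line ($1) with the resolved directory ($2).
expand_runuser() {
    line=$1
    runuser=$2
    macro='${RUNUSER}'              # single quotes keep the macro literal
    out=
    while :; do
        case $line in
            *"$macro"*)
                # Keep the text before the first macro, append the directory,
                # then continue with the text after the macro.
                out=$out${line%%"$macro"*}$runuser
                line=${line#*"$macro"}
                ;;
            *)
                break
                ;;
        esac
    done
    printf '%s\n' "$out$line"
}
```

For example, `expand_runuser 'whitelist ${RUNUSER}/pulse/native' /run/users/1000` yields `whitelist /run/users/1000/pulse/native`. As noted in the list above, the whitelist code would still reject that path as long as it only accepts /run/user/$UID.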