systemd / systemd

The systemd System and Service Manager
https://systemd.io
GNU General Public License v2.0

stop leaking PIDs, command line arguments, UID, GID through systemctl, D-Bus interface / API - `hidepid` compatibility #29893

Closed adrelanos closed 1 year ago

adrelanos commented 1 year ago

Component

systemd

Is your feature request related to a problem? Please describe

hidepid is a security feature. However, even when hidepid is in use, an unprivileged user can still obtain the hidden information by running systemctl status unit-name or by querying systemd's D-Bus interface / API.

quote https://access.redhat.com/solutions/6704531

Last problem, that we would like to highlight is potential information leak and false sense of security that hidepid= provides. Information (PID numbers, command line arguments, UID and GID) about system services are tracked by systemd. By default this information is available to everyone to read via systemd's D-Bus interface. When hidepid= option is used systemd doesn't take it into consideration and still exposes all this information at the API level.

Describe the solution you'd like

Describe alternatives you've considered

None.

The systemd version you checked that didn't have the feature you are asking for

No response

bluca commented 1 year ago

Simply disable access to the bus by using InaccessiblePaths= or so when you also configure ProtectProc=. It makes no sense to avoid publishing this information by default, as it's needed for tracking and whatnot, and would be a compat break. In other words: don't break the whole system, isolate what you want to isolate instead.

DemiMarie commented 1 year ago

@bluca In some cases “what you want to isolate” is an entire user login session. The request is for systemd to honor hidepid=. If this needs to be opt-in it can be configured via systemd’s configuration files.

bluca commented 1 year ago

Then remove access to the system bus from that session. But running in that configuration is not supported at all, it's only supported for individual units, and it doesn't really make sense in any other way.

bluca commented 1 year ago

Or even better, use nspawn, as that's the right tool to completely isolate a workload from the host: https://www.freedesktop.org/software/systemd/man/latest/systemd-nspawn.html

DemiMarie commented 1 year ago

Or even better, use nspawn, as that's the right tool to completely isolate a workload from the host: https://www.freedesktop.org/software/systemd/man/latest/systemd-nspawn.html

Isolating workloads from the host is not the goal of hidepid. Isolating users from each other is.

grawity commented 1 year ago

Not exposing this information by default. Or a new option to disable exposing this information.

My local solution to this is blocking the respective API calls at D-Bus level:

/etc/dbus-1/system.d/systemd-restrict.conf

<?xml version="1.0"?>
<busconfig>
    <policy user="root">
        <allow send_destination="org.freedesktop.systemd1"
            send_interface="org.freedesktop.systemd1.Manager"
            send_member="GetUnitProcesses"/>
    </policy>

    <policy group="proc">
        <allow send_destination="org.freedesktop.systemd1"
            send_interface="org.freedesktop.systemd1.Manager"
            send_member="GetUnitProcesses"/>
    </policy>

    <policy context="default">
        <deny send_destination="org.freedesktop.systemd1"
            send_interface="org.freedesktop.systemd1.Manager"
            send_member="GetUnitProcesses"/>
    </policy>
</busconfig>

poettering commented 1 year ago

I am sorry, but no. global "hidepid" doesn't really work on a general purpose Linux system, because tons of tools need to check /proc/, for example pid1, journald, polkit and similar. We do not support it, full stop.

use per-service ProtectProc= and ProcSubset=, which use it per-service, but the global switch is simply not supported.

grawity commented 1 year ago

I am sorry, but no. global "hidepid" doesn't really work on a general purpose Linux system, because tons of tools need to check /proc/, for example pid1, journald, polkit and similar.

Those tools still can check /proc like before – pid1 runs as root and therefore has unlimited /proc access regardless; journald also runs as root; polkit and similar services which do not run as root can be easily added to the proc group and continue to have access as before. (On a server system, that's pretty much only polkitd, if at all. Very few things are simultaneously unprivileged and in need of full /proc access.)

So in practice, on systems that need it (which is mainly servers), it does work quite well, really.

(I have not tested it with a general-purpose desktop, nor do I see the need offhand, but aside from polkit – which is solved via groups – I don't see any particular reason why it wouldn't work either.)

use per-service ProtectProc= and ProcSubset=, which use it per-service, but the global switch is simply not supported.

Per-service knobs don't work when the goal is to hide /proc from user sessions, which are not services...

bluca commented 1 year ago

use per-service ProtectProc= and ProcSubset=, which use it per-service, but the global switch is simply not supported.

Per-service knobs don't work when the goal is to hide /proc from user sessions, which are not services...

Again, that's not supported. You can use nspawn or a VM for those user sessions. ProtectProc is supported for individual units, not system-wide.

DemiMarie commented 1 year ago

use per-service ProtectProc= and ProcSubset=, which use it per-service, but the global switch is simply not supported.

Per-service knobs don't work when the goal is to hide /proc from user sessions, which are not services...

Again, that's not supported. You can use nspawn or a VM for those user sessions. ProtectProc is supported for individual units, not system-wide.

The feature request is to add this support, preferably by detecting that hidepid is in use and acting accordingly.
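As a rough illustration of what "detecting that hidepid is in use" could mean in practice, the sketch below parses text in the format of /proc/mounts and returns the hidepid= option of the procfs mount, if any. This is not something systemd implements; the function name `proc_hidepid` and the sample mount line (including gid=26) are assumptions made up for the example.

```python
from typing import Optional


def proc_hidepid(mounts_text: str, mount_point: str = "/proc") -> Optional[str]:
    """Return the hidepid= value for the procfs mounted at mount_point, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point or fields[2] != "proc":
            continue
        found = None  # a later mount on the same path shadows earlier ones
        for option in fields[3].split(","):
            key, _, value = option.partition("=")
            if key == "hidepid":
                found = value
    return found


# Hypothetical mount line; the gid value is invented for illustration.
sample = "proc /proc proc rw,nosuid,nodev,noexec,relatime,hidepid=2,gid=26 0 0\n"
print(proc_hidepid(sample))
```

On a live system the input would be the contents of /proc/mounts. The value is returned as a raw string because the kernel may report it either numerically or by name, depending on the kernel version.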

So in practice, on systems that need it (which is mainly servers), it does work quite well, really.

Why do these servers need it @grawity?

adrelanos commented 1 year ago

Why does a normal human user (non-root), using a virtual terminal or graphical session and running systemctl status unit-name, need to see the PID and command line arguments of systemd units run under different user accounts or root?

Can't this be prevented somehow and wouldn't that be a worthwhile security enhancement?

bluca commented 1 year ago

Why should a normal user not do that? This is not privileged information. You can restrict access to the system bus as you see fit, using d-bus' own configuration and options.

adrelanos commented 1 year ago

Why should a normal user not do that?

It can be an information leak. A normal user compromised by malware shouldn't have this information as this might aid the further exploitation of the system, compromising other services or aiding privilege escalation to root.

In a server DoS context, this could be used to get instant feedback on how much more RAM a unit uses, simply by looking at its reported memory use.

Command line parameters might contain usernames, passwords or other information useful for further exploitation.

A compromised user could attempt to crash a systemd unit and get feedback about whether that succeeded. Sometimes exploitation needs multiple programmatic attempts and gets easier if feedback is available.

It improves the privacy of other unprivileged users in a multi-user context.

This isn't just my opinion. This is part of many security hardening instructions.

https://linux-audit.com/linux-system-hardening-adding-hidepid-to-proc/
https://www.elastic.co/guide/en/security/current/potential-hidden-process-via-mount-hidepid.html
https://wiki.archlinux.org/title/security#hidepid

Presumably this is why the kernel implemented hidepid support to begin with.

You can restrict access to the system bus as you see fit, using d-bus' own configuration and options.

But will that be stable, or will it open up a bunch of strange, hard-to-debug follow-up issues because this isn't considered a default (or at least common) setting by upstream systemd? I expect the latter.

That's why I opened this ticket. Because if this is supported for real, then things will go much smoother.

bluca commented 1 year ago

Sorry, but that makes no sense. If you have permissions to "crash a unit" then you are already root. And if you are adding secrets to command line parameters, then fix that instead, because that's just broken.

If you have untrusted workloads then run them in nspawn or a VM, and then they'll be isolated from the rest of userspace or from the rest of the entire system.

grawity commented 1 year ago

Why do these servers need it @grawity?

Privacy between users. That is, so that user A could not run ps axf (or systemctl status user-1002.slice) and see the cmdlines of programs run by user B, e.g. URLs when user B is doing a wget or file names when user B is in vim foo.txt.
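To make the exposure concrete: a process's arguments are published in /proc/<pid>/cmdline as NUL-separated bytes, and without hidepid any local user can read that file for any process. The sketch below only decodes such a buffer; the helper name `parse_cmdline` and the sample bytes are invented for illustration.

```python
from typing import List


def parse_cmdline(raw: bytes) -> List[str]:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    if not raw:
        return []  # kernel threads and zombies expose an empty cmdline
    # Arguments are NUL-terminated, so drop the single trailing terminator if present.
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return [arg.decode("utf-8", errors="replace") for arg in raw.split(b"\0")]


# Hypothetical bytes as user A might read them from one of user B's processes.
raw = b"wget\0https://example.org/private/report.pdf\0"
print(parse_cmdline(raw))
```

The same argument vector is what ps axf and systemctl status display, which is why hiding /proc alone does not help if another interface hands out the same data.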

I don't really care about services or system security, but hiding user processes from each other would be very much in line with e.g. systemd's home directory isolation.

DemiMarie commented 1 year ago

Sorry, but that makes no sense. If you have permissions to "crash a unit" then you are already root. And if you are adding secrets to command line parameters, then fix that instead, because that's just broken.

The goal is to make exploiting pre-existing vulnerabilities harder, and (as @grawity mentioned) to provide privacy in a multi-user environment.

monsieuremre commented 1 year ago

Sorry, but that makes no sense. If you have permissions to "crash a unit" then you are already root. And if you are adding secrets to command line parameters, then fix that instead, because that's just broken.

First and foremost, not hiding /proc by default vastly undermines a lot of the effort for privacy between users, both humans and services.

It exposes a lot of system and kernel information too. This on its own is risky, as finding out about hardware makes a whole group of targeted attacks possible. There have been numerous CVEs that could have been prevented even before they were discovered if the targeted system didn't expose proc information. And these vary from escalating to root to bypassing MAC policies even.

This is a massive reduction in attack surface and prevents 0-days. Protecting selected services manually is better than nothing but is not enough. It has to be disabled globally and opt in for services. A whitelist is the optimal solution, not a blacklist. This will break many things and is very difficult to integrate in a system of course. Any effort for trying to make it work tho should be appreciated nevertheless.

Winterhuman commented 1 year ago

It's worth mentioning that you can construct the "disabled globally and opt in for services" logic using a drop-in under /etc/systemd/system.conf.d/, which would be globally applied across all system services.

I understand this issue is about making that the case by default, just noting it here in case anyone wants to implement this themselves.

voidzero commented 9 months ago

Snarky comment, my apologies, I have deleted it.

Maryse47 commented 9 months ago

Why should a normal user not do that? This is not privileged information.

Under hidepid=2 it was privileged information, at least until systemd started leaking it, so it's some sort of regression.

More detailed response can be found here.

adrelanos commented 9 months ago

With (at least) 2 security experts [1] [2] in support of this being a security relevant issue that should be fixed, please kindly consider to acknowledge this issue and re-open.

If that isn't convincing enough, what would be needed? More security experts? Specific CVE's where fixing this issue would have mitigated exploitation?


[1] @DemiMarie (has a history of multiple CVE reports against various projects; Qubes OS core developer) speaking here in this ticket, and [2] @solardiz (wikipedia) speaking here. [3] (Sorry if I am unaware of other's credentials not listed here.)

bluca commented 9 months ago

If that isn't convincing enough, what would be needed?

Absolutely nothing, as this request makes no sense. As already explained, D-Bus provides already the necessary configuration knobs to block access to the system bus when needed, use that instead.

poettering commented 9 months ago

@adrelanos i am not convinced the existing hidepid= logic with a group as access control makes much sense. Binding access to the /proc tree to a group is just wrong, it's not how you do this, because group membership is sticky, you can persist it.

It might be fine to manage open() access to inodes via groups/users, i.e. file ownership. But I think it's fundamentally broken to do visibility checks via groups.

We have added support for ProtectProc= quite a while ago. It has this nice benefit that it's not a group-based concept, but a service-based one. Which is how it should be: you can select individually for each service whether they get access or not.

ProtectProc= is not ideal though, as it implies mount namespacing which makes it hard to use from unpriv environments (and also detaches mount tables, which makes it useless for many services). I wish there was a better way. There was a much better approach proposed in the past:

https://lists.openwall.net/linux-kernel/2016/11/03/471

Unfortunately it was derailed by people who see userns as the center of the world. I think the capability based proposal (by "capability" i mean the concept, not the bad incarnation of it that POSIX came up with) in that thread would be much nicer, because it means trivial use from unpriv code, and a very elegant security model.

I wished there was another attempt made to propose this again. I think it's fundamentally a better design. With something like that in place we could then even introduce DefaultProtectProc= or so, which would allow to take away the access to procfs for all services, and then require an opt-in for services that actually need access, i.e. the dbus, polkit and suchlike of this world.

But with the current hidepid= mess around groups? no, that's stupid.

monsieuremre commented 9 months ago

@poettering yes messing around groups is just a dirty workaround. Systemd's ProtectProc is, tho not ideal, a better solution, yet not as comprehensive. Number and type of services on a given system can vary by a lot, and setting this option for every service is tedious to say the very least. Systemd does not support ProtectProc for a system-wide default configuration. If that was supported, then one can just do that and all services are protected by default. Then the select daemons and services who need this can be enabled by overriding the option in their own drop-in config. This would still be not as comprehensive as the mount option but it would be much more than what systemd offers now. Is there a reason why systemd does not allow setting ProtectProc universally in system.conf or whatever?

bluca commented 9 months ago

Of course it's supported, there are top-level drop-ins that apply automatically with the lowest priority, just read the manpage: https://www.freedesktop.org/software/systemd/man/latest/systemd.unit.html

Of course, setting that as a default means a good chunk of the system will just break for no good reason, so you get to keep the pieces.
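For anyone who wants to try the top-level drop-in approach, here is a minimal sketch that renders such a drop-in and can write it to disk. The directory /etc/systemd/system/service.d follows the type-level drop-in convention described in systemd.unit; the helper names and the file name used in the note below are assumptions for illustration, not an official recipe.

```python
from pathlib import Path

# Assumed location of a type-level ("top-level") drop-in for all system services.
DROPIN_DIR = Path("/etc/systemd/system/service.d")


def render_dropin(protect_proc: str = "invisible") -> str:
    """Return the text of a drop-in that sets ProtectProc= for every service."""
    allowed = {"noaccess", "invisible", "ptraceable", "default"}
    if protect_proc not in allowed:
        raise ValueError(f"unsupported ProtectProc= value: {protect_proc}")
    return f"[Service]\nProtectProc={protect_proc}\n"


def write_dropin(directory: Path, name: str, text: str) -> Path:
    """Write the drop-in under directory and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / name
    path.write_text(text)
    return path


print(render_dropin())
```

Calling `write_dropin(DROPIN_DIR, "10-protect-proc.conf", render_dropin())` as root, followed by systemctl daemon-reload, would apply the setting to all system services. A service that needs /proc could be opted out again with its own drop-in containing `render_dropin("default")`. As noted in this thread, expect breakage, since ProtectProc= implies a separate mount namespace for each affected service.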

poettering commented 9 months ago

@poettering yes messing around groups is just a dirty workaround. Systemd's ProtectProc is, tho not ideal, a better solution, yet not as comprehensive. Number and type of services on a given system can vary by a lot, and setting this option for every service is tedious to say the very least. Systemd does not support ProtectProc for a system-wide default configuration. If that was supported, then one can just do that and all services are protected by default. Then the select daemons and services who need this can be enabled by overriding the option in their own drop-in config. This would still be not as comprehensive as the mount option but it would be much more than what systemd offers now. Is there a reason why systemd does not allow setting ProtectProc universally in system.conf or whatever?

As @bluca said you can enable it for all services via a drop-in. But this will not realistically work. I already mentioned: this stuff requires mount namespaces, and this means you have to disconnect your mount namespace from the host. But that means, mounts will not just propagate both ways. Thus you cannot realistically enable this globally and expect things to work. It will break everywhere, because suddenly a shitload of stuff loses access to the root mount namespace that shouldn't lose it...

the proposal in the link i provided is much less problematic: it does not require privs, does not imply mount namespaces. It just gives up the privs with no way to get it back. That's a great security concept.

DemiMarie commented 9 months ago

@poettering A major use-case for hidepid=2 is multi-user systems with interactive logins by untrusted users. These must be able to access the system bus.

poettering commented 9 months ago

@DemiMarie I think there'd be a lot of value if someone would propose the kernel patchset I linked again, maybe there's a chance to make it work this time. If so, we can really nicely do hidepid without bothering with group memberships, and without bothering with mount namespace. A small PAM module would simply fire the prctl() for relevant users. And systemd would call it for most services. And everything would be jolly. But until that happens I am very sure we shouldn't bother, the current group-based hidepid stuff is just not the right approach.

gettysburg commented 7 months ago

@poettering I agree, but hidepid should also be supported for public shell servers or servers with multiple users, without systemd or its tools leaking information.

I know it can be done using DBus config files, but it's really systemd who introduced the information leaks, so it's also systemd's duty to stop it from leaking information about services and other user sessions.

bluca commented 7 months ago

First of all, there is no "information leak", this is pure mystification and FUD. Secondly, access to D-Bus is regulated via D-Bus, it makes no sense whatsoever to reinvent the wheel. If you want to block access to D-Bus, just do so, you already can do that since it was first created.

Maryse47 commented 7 months ago

If you want to block access to D-Bus, just do so, you already can do that since it was first created.

I don't think anyone discussing here wants to block D-bus - it's merely a poor workaround for the issue.

First of all, there is no "information leak", this is pure mystification and FUD.

Information leak is the term used in the mentioned RHEL doc describing systemd behavior in relation to hidepid:

Last problem, that we would like to highlight is potential information leak and false sense of security that hidepid= provides. Information (PID numbers, command line arguments, UID and GID) about system services are tracked by systemd. By default this information is available to everyone to read via systemd's D-Bus interface. When hidepid= option is used systemd doesn't take it into consideration and still exposes all this information at the API level.

bluca commented 7 months ago

If you want to block access to D-Bus, just do so, you already can do that since it was first created.

I don't think anyone discussing here wants to block D-bus - it's merely a poor workaround for the issue.

It's the right workaround for a non-existing problem

First of all, there is no "information leak", this is pure mystification and FUD.

Information leak is the term used in the mentioned RHEL doc describing systemd behavior in relation to hidepid:

Doesn't change the fact that it's a nonsensical claim

DemiMarie commented 7 months ago

If you want to block access to D-Bus, just do so, you already can do that since it was first created.

I don't think anyone discussing here wants to block D-bus - it's merely a poor workaround for the issue.

It's the right workaround for a non-existing problem

If it wasn’t a problem there would not be people complaining about it.

grawity commented 7 months ago

I'll be honest, I have less and less respect for attempts to push this through via brute force as some kind of infosec™ inFORMatioN LeAk thing. That's not the issue and that's not helping. Just deploy the dbus .conf.

But at the same time, @bluca, I don't think you get to say "That's a nonsensical claim" at this point, when this entire thread is all based on Lennart's own nonsensical claims on why hidepid allegedly cannot work, even though the only real issue mentioned – polkitd – has a trivial workaround (and the rest such as pid1 are never affected by hidepid to begin with).

(And somehow half the thread is about ProtectProc= even though that has nothing to do with the issue at hand. Even supposing we set ProtectProc on all user sessions so that they cannot see each other, how does that help with the original problem of systemctl status going around it, anyway?)

Groups persisting is not an issue. This is not situational access like with /dev ACLs that's based on a frequently changing parameter, where groups persisting would indeed be an issue; this is almost always going to be a permanent kind of access where %wheel or sudoers can see things and that's it. It's literally like /var/log/journal being visible to the 'adm' group – the same kind of information, the same kind of access.

I'm only worried that systemd will at some point try to deliberately make it actually not work, out of spite. When you say "You can just block D-Bus if you want", yeah, that's what we do now and it works, but how long is that going to work in the future once the same APIs are on varlink? Will there be a <policy> equivalent for that?

bluca commented 7 months ago

I'm only worried that systemd will at some point try to deliberately make it actually not work, out of spite. When you say "You can just block D-Bus if you want", yeah, that's what we do now and it works, but how long is that going to work in the future once the same APIs are on varlink? Will there be a <policy> equivalent for that?

If there's a full varlink replacement (big if, lots of work, not great gains), then access control should be even easier - access to the socket file determines access to the API, so you have multiple ways of blocking it, starting from a simple chmod. Varlink also fully supports polkit now.
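A small sketch of the "simple chmod" idea: on Linux, connecting to a Unix socket requires write permission on the socket file, so tightening its mode restricts who can reach the API behind it. The example uses a throwaway socket in a temporary directory; the name example.sock and the helper `restrict_socket` are hypothetical, and no real systemd socket path is assumed.

```python
import os
import socket
import stat
import tempfile


def restrict_socket(path: str, mode: int = 0o600) -> int:
    """chmod a Unix socket file and return the resulting permission bits."""
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstrate on a throwaway socket rather than a real systemd one.
with tempfile.TemporaryDirectory() as tmp:
    sock_path = os.path.join(tmp, "example.sock")  # hypothetical socket name
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    print(oct(restrict_socket(sock_path)))
    server.close()
```

A mode such as 0o660 combined with a dedicated group would give the same kind of group-based allow list that the D-Bus policy earlier in this thread expresses with `<policy group="proc">`.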

poettering commented 7 months ago

I'll be honest, I have less and less respect for attempts to push this through via brute force as some kind of infosec™ inFORMatioN LeAk thing. That's not the issue and that's not helping. Just deploy the dbus .conf.

But at the same time, @bluca, I don't think you get to say "That's a nonsensical claim" at this point, when this entire thread is all based on Lennart's own nonsensical claims on why hidepid allegedly cannot work, even though the only real issue mentioned – polkitd – has a trivial workaround (and the rest such as pid1 are never affected by hidepid to begin with).

sd-bus has an entire API to expose to you the creds of the peer, part of which it reads from /proc/. we also read this stuff in journald, and various other things, to let people know what happens.

My issues with hidepid as it stands now is that it basically breaks current documented, defined behaviour system wide: something that generally used to work generally doesn't anymore. And also that to except yourself from it you need a group membership. Thus somebody who is supposed to be able to watch a system and gets the group membership to do this, will have it forever. That's just a really bad security model. Moreover various daemons use setresgid() and initgroups() to drop privs, and that will now drop API access in very unexpected ways.

I can only repeat: if you want something like hidepid= supported, then fix hidepid, i.e. make it a flag a process can have and can drop but never regain. A bit like a true capabilities system (not in the fucked up sense of POSIX caps), decouple it from group management. And we are happy to support it in systemd on every level, maybe even to the point where it can just be the default. But the current model is just broken.

And access control to D-Bus exists already. Just use that. It's terrible to use, but should allow you to do what you want already.

grawity commented 7 months ago

sd-bus has an entire API to expose to you the creds of the peer, part of which it reads from /proc/. we also read this stuff in journald, and various other things, to let people know what happens.

My issues with hidepid as it stands now is that it basically breaks current documented, defined behaviour system wide: something that generally used to work generally doesn't anymore.

That's true but I think that's acceptable for the intended audience. It's a feature a lot like AppArmor, where you can break things in all sorts of ways – including all of /proc – and it's the user's responsibility to write profiles to allow what needs to be allowed.

(And frankly, aside from journald, the "augment creds from /proc" feature in sd-bus has always been advertised as racy and potentially inaccurate, hasn't it?)

And also that to except yourself from it you need a group membership.

That's not really a problem in practice though. It's the same as using group membership to control access to system logs (like the 'systemd-journald' group we have), which so far has been considered acceptable. There's no requirement for it to be dynamic.

Thus somebody who is supposed to be able to watch a system and gets the group membership to do this, will have it forever.

I'm not sure where that 'forever' comes from. Yeah, I've used +setgid binaries to "keep" group memberships myself in the past, but nosuid /home is a thing, find -perm is a thing, etc. And other than setuid binaries, it only lasts as long as the user's processes keep running, doesn't it? (Similar to other operating systems, where removing admin access pretty much means killing all processes.) Am I missing something obvious there?

And access control to D-Bus exists already. Just use that. It's terrible to use, but should allow you to do what you want already.

Yes, it works well enough that I don't really think there's anything specific that systemd needs to "support" explicitly.

gettysburg commented 7 months ago

I think frankly the only argument you need is that the Linux kernel, including hidepid, existed way before systemd did or before systemd became used widely, so either go make it work (given that the systemd developers are not as lazy as they seem, or actually value privacy and anonymity on a multi user system), or people will switch to other projects like OpenRC.

gettysburg commented 7 months ago

Ignorance truly is bliss, it seems?

poettering commented 7 months ago

sd-bus has an entire API to expose to you the creds of the peer, part of which it reads from /proc/. we also read this stuff in journald, and various other things, to let people know what happens. My issues with hidepid as it stands now is that it basically breaks current documented, defined behaviour system wide: something that generally used to work generally doesn't anymore.

That's true but I think that's acceptable for the intended audience. It's a feature a lot like AppArmor, where you can break things in all sorts of ways – including all of /proc – and it's the user's responsibility to write profiles to allow what needs to be allowed.

AppArmor and other MACs surely are invasive, but they generally don't take over the group list for their purposes, and if done correctly do not actually change anything in the behaviour or what a service sees.

hidepid= as it is implemented via GID currently is quite different: it is hooked into a concept that a service generally owns on its own (the group list) – even if systemd initializes it, and thus alters systematically how services behave, and what they see. For example any service that calls setgroups(0, NULL) suddenly gets a lot more dropped than they might expect.
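The side effect described here can be shown with a deliberately simplified model of the gid= exemption. This is not the kernel's actual check (which also considers ptrace access and whether the target belongs to the same user); the function name and the gid value 26 are assumptions for illustration.

```python
from typing import FrozenSet


def can_see_foreign_pids(hidepid: int, exempt_gid: int, euid: int,
                         groups: FrozenSet[int]) -> bool:
    """Simplified model: may this process see other users' /proc/<pid> entries?

    groups is the process's effective gid plus its supplementary groups.
    """
    if hidepid == 0:
        return True  # no restriction configured
    if euid == 0:
        return True  # assumption: root keeps the capabilities it needs
    return exempt_gid in groups  # membership in the gid= group grants the exemption


PROC_GID = 26  # hypothetical gid passed as gid= on the /proc mount
before = frozenset({1000, PROC_GID})  # service started with the exempt group
after = frozenset({1000})             # after dropping all supplementary groups

print(can_see_foreign_pids(2, PROC_GID, 1000, before))
print(can_see_foreign_pids(2, PROC_GID, 1000, after))
```

The first call prints True and the second False: a daemon that clears its supplementary groups as routine privilege dropping also gives up its /proc visibility, without ever having asked for that.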

(And frankly, aside from journald, the "augment creds from /proc" feature in sd-bus has always been advertised as racy and potentially inaccurate, hasn't it?)

Well, it is what it is, but it exists.

And also that to except yourself from it you need a group membership.

That's not really a problem in practice though. It's the same as using group membership to control access to system logs (like the 'systemd-journald' group we have), which so far has been considered acceptable. There's no requirement for it to be dynamic.

If we own a resource we can define access mechanisms for it freely. Thus what journald does for its journal files is up to journald, and it did in fact just do what was basically done on most distros before: define a group that can access the journal files.

I think you assume it is not an issue to muck with the group membership. But that's just an assumption.

Thus somebody who is supposed to be able to watch a system and gets the group membership to do this, will have it forever.

I'm not sure where that 'forever' comes from. Yeah, I've used +setgid binaries to "keep" group memberships myself in the past, but nosuid /home is a thing, find -perm is a thing, etc. And other than setuid binaries, it only lasts as long as the user's processes keep running, doesn't it? (Similar to other operating systems, where removing admin access pretty much means killing all processes.) Am I missing something obvious there?

Yeah, chgrp + sgid gives you persistency, and it doesn't matter in /home or elsewhere, there's usually some place you can put that stuff.

And access control to D-Bus exists already. Just use that. It's terrible to use, but should allow you to do what you want already.

Yes, it works well enough that I don't really think there's anything specific that systemd needs to "support" explicitly.

Sure, so what is this bug report about then?

I wouldn't mind having something like hidepid in place that we can sensibly enable more widely than ProtectProc=, but I am pretty sure as long as this is bound to group membership it's not the time to do this.