util/sockopt: add fallback for SELinux denial on SO_PEERPIDFD

bluca commented 7 months ago

If SELinux is in enforcing and we get EACCES, return an error to skip using pidfds. This can happen with an old policy and a new kernel that uses the new pidfs filesystem to back pidfds, instead of anonymous inodes, as the existing policy denies access.

dvdhrm commented 7 months ago

Is this fixed in the selinux rules?

bluca commented 7 months ago

No, there are many policies around, I don't think any is fixed yet

dvdhrm commented 7 months ago

Can you elaborate why this needs to be fixed in the code-base, rather than in the policy? I mean, this will not hit distributions any earlier than a fix of the policies. But a policy-fix will actually enable the feature, while this workaround will just disable it until another component (i.e. the policy) has been updated.

Also, it is very common for package updates to conflict with existing policies. It is common procedure to update the policies before pushing package updates to release branches. Is this different here?

Lastly, what do you mean with there are many policies around? The reference policy carries the required bits for dbus-broker. I am not aware of any deployment that does not base its policies on it. If there are any, please let us know so we can track them when updating the dbus policies.

Just for the record, the dbus (and dbus-broker) related poclies can be found here: https://github.com/SELinuxProject/refpolicy/tree/main/policy/modules/services

bluca commented 7 months ago

Redhat have their own policy that is disjoint from the upstream one. In MSFT we have our own policy as well, that sometimes intersecates but not always. That's at least three. I think Android has its own too - not that it would matter for this, but still that's four that I know of. and I don't really follow selinux that closely.

In general, with new syscalls or adjacent, there's always going to be LSMs that reject them. It's not just selinux, there's also seccomp that usually is deny-by-default. This new kernel filesystem triggers unexpected denials that you wouldn't know unless you are a kernel developers - an 'open' on a getsockopt() call is not that obvious. So given there's a clear fallback, it's better to use it instead of trying to foresee what the kernel might or might not return, as these things change all the time, and having your system suddenly fail to boot because there was a new kernel version is not a nice experience for a user, and having to suddenly go in fire fighting mode is not a nice experience for developers.

dbus-daemon doesn't have this issue, it falls back if the getsockopt is rejected without any problem.

dvdhrm commented 7 months ago

Redhat have their own policy that is disjoint from the upstream one. In MSFT we have our own policy as well, that sometimes intersecates but not always. That's at least three. I think Android has its own too - not that it would matter for this, but still that's four that I know of. and I don't really follow selinux that closely.

Excellent! So lets start with selinux-refpolicy and fedora-selinux, and send a request to msft. With the policies fixed, we can switch the pidfd to use dentry_open() and everything will keep working.

In general, with new syscalls or adjacent, there's always going to be LSMs that reject them. It's not just selinux, there's also seccomp that usually is deny-by-default. This new kernel filesystem triggers unexpected denials that you wouldn't know unless you are a kernel developers - an 'open' on a getsockopt() call is not that obvious.

I will cheekily restrict this PR to a workaround for the referenced kernel changes. I am open to discuss support for systems that block specific syscalls via seccomp, yet I would prefer to discuss that separately (yes, a possible fix might involve checking for the same error-code, but their root cause is IMO very much different; but they very likely should not check for selinux enforcing mode).

So given there's a clear fallback, it's better to use it instead of trying to foresee what the kernel might or might not return, as these things change all the time[,...]

Imagine someone deploying a Fedora machine with linux-6.7 and dbus-broker-35. They use this as minimum required versions for their setup and deploy an application that relies on ProcessFD in GetConnectionCredentials(). They are very much allowed to do that, and they can rely on ProcessFD to be available and present. They control the setup, they are very much allowed to set high minimum requirements, and they can rely on released kernel features to not break. They can even put this into a device and sell it.

With the referenced kernel-update their setup will break, regardless whether the suggested PR is applied to dbus-broker or not. dbus-broker cannot report ProcessFD even with your PR applied. Hence, this kernel update is a breaking change. There is nothing dbus-broker can do to avoid this, and thus I very much disagree with your statement that there is a "clear fallback".

I see 2 options to avoid this breakage: 1) update SELinux policies and ensure they are synchronized with kernel updates (which very much relies on lenient update policies of the respective distributions) 2) ensure that no open()-LSM-hook is triggered by SO_PEERPIDFD.

Maybe there is no dbus client that relies on ProcessFD, maybe there is. But we as bus maintainers will protect the APIs we grant to our clients and ensure that they do not break. And that is why we do not have default-fallbacks to disable features at runtime. We believe this hides bugs. And this proposed kernel change could have broken D-Bus clients, if we hadn't refused to disable features at runtime.

[...] and having your system suddenly fail to boot because there was a new kernel version is not a nice experience for a user, and having to suddenly go in fire fighting mode is not a nice experience for developers.

I don't understand. The linux kernel famously has one important rule: "You do not break existing user-space." And history has shown that (almost) anything violating this rule will get rigorously reverted. And this isn't even a niche issue hard to trigger. No. You cannot boot the most recent Fedora version with this proposed kernel update, hence if this finds its way into a released kernel it would be a careless neglect of testing practices.

No-one has to go into "fire fighting mode" for this change other than kernel developers.

dbus-daemon doesn't have this issue, it falls back if the getsockopt is rejected without any problem.

dbus-daemon possibly breaks existing clients silently without telling anyone. Yes, if no such client existed, they would get away with it and thus make life easier for breaking kernel changes (which is a very valid thing to do, we just happen to choose a different route). However, if such a client existed, they would put all the burden of ensuring the kernel does not break existing code on those clients, which might receive much much less exposure and testing and not notice breakage until it is far too late.

brauner commented 7 months ago

On Mon, Feb 26, 2024 at 12:25:00AM -0800, David Rheinsberg wrote:

Redhat have their own policy that is disjoint from the upstream one. In MSFT we have our own policy as well, that sometimes intersecates but not always. That's at least three. I think Android has its own too - not that it would matter for this, but still that's four that I know of. and I don't really follow selinux that closely.

Excellent! So lets start with selinux-refpolicy and fedora-selinux, and send a request to msft. With the policies fixed, we can switch the pidfd to use dentry_open() and everything will keep working.

Which was already started before we came here.

I don't understand. The linux kernel famously has one important rule: "You do not break existing user-space." And history has shown that (almost) anything violating this rule will get rigorously reverted. And this isn't even a niche issue hard to trigger. No. You cannot boot the most recent Fedora version with this proposed kernel update, hence if this finds its way into a released kernel it would be a careless neglect of testing practices.

No-one has to go into "fire fighting mode" for this change other than kernel developers.

The only thing that you missed is that we've already fixed this right as we noticed.

And this change here is more about sane fallbacks on permission errors which are common in containers irrespective of kernel changes; I've tried to point that out elsewhere.

In any case, this is stealing everyone's time as clearly this is not a pragmatic but a philosophical issue to you which is fine but I don't think this will go anywhere.

dvdhrm commented 7 months ago

[...] this change here is more about sane fallbacks on permission errors which are common in containers irrespective of kernel changes;

Is it? This PR adds a single comment that reads:

/* Kernel 6.8 moves pidfds from anonymous inodes to a new
 * pidfs pseudo-filesystem, and the existing SELinux policy
 * denies access. If we are running in enforcing mode and
 * we get an EACCES, treat it as if pidfd support was not
 * available. */

If this is about "sane fallbacks on permission errors which are common in containers irrespective of kernel changes", then please write a comment and commit message that state this intent.

The PR in its current form contradicts your statement.

In any case, this is stealing everyone's time as clearly this is not a pragmatic but a philosophical issue to you which is fine but I don't think this will go anywhere.

I am listing example D-Bus scenarios that fail if I apply the PR, and explain the rationale behind our decision. If you are not interested in this, you are welcome to disengage. No need to call me unpragmatic and philosophical.

brauner commented 7 months ago

The PR in its current form contradicts your statement.

Last I checked I am neither the author of this commit nor am I in control of the actual author's code comments. I did however tell you about what I think would be useful on another medium.

In any case, this is stealing everyone's time as clearly this is not a pragmatic but a philosophical issue to you which is fine but I don't think this will go anywhere.

I am listing example D-Bus scenarios that fail if I apply the PR, and explain the rationale behind our decision. If you are not interested in this, you are welcome to disengage. No need to call me unpragmatic and philosophical.

I'm not sure why you took that as an insult? I have very principled ideas about things too. Here you bring forward principled arguments against handling error checking for permission denied errors. So I'm not sure why you refuse the obvious.

But ok so:

Imagine someone deploying a Fedora machine with linux-6.7 and dbus-broker-35. They use this as minimum required versions for their setup and deploy an application that relies on ProcessFD in GetConnectionCredentials(). They are very much allowed to do that, and they can rely on ProcessFD to be available and present

Imagine someone deploying a Fedora container and using dbus-broker with a standard seccomp profile forbidding ProcessFD in GetConnectionCredentials() or an LSM restricting access to it in the future. Happens today.

In your case dbus-broker will refuse to be usable on permission error and cause cascading failures for other services. I would rather use a fall back even on permission error right now.

And then, if you're really serious about not falling back on errors other than this thing not being available go all the way and simply remove the non-SO_PEERPIDFD code completely in a few years.

dvdhrm commented 7 months ago

Imagine someone deploying a Fedora container and using dbus-broker with a standard seccomp profile forbidding ProcessFD in GetConnectionCredentials() or an LSM restricting access to it in the future.

You mean blocking the required syscalls to implement ProcessFD, right? I am not aware of an LSM that allows blocking based on entries in GetConnectionCredentials(), but only based on transaction metadata.

Not that it changes the setup, just wondering.

Happens today. In your case dbus-broker will refuse to be usable on permission error and cause cascading failures for other services. I would rather use a fall back even on permission error right now. And then, if you're really serious about not falling back on errors other than this thing not being available go all the way and simply remove the non-SO_PEERPIDFD code completely in a few years.

Right. As said before, if such setups exist, tell us about them, explain their policy on rule-updates, and who deploys them, and we can talk about supporting it. So far, no-one has reported breakage in the past, and we have continuously made use of new socketops in a similar fashion. As far as I understand, your report is also theoretical and you are not aware of a system that broke, right?

(Note that I do not think the breakage that would have been caused by the proposed kernel changes applies here, as it would have broken D-Bus clients anyway.)

Additionally, I would strongly prefer to upstream any seccomp-filters to the service files provided by us, rather than applying them downstream. This allows keeping them in-sync with updates and avoiding conflicts. Such attempts are currently ongoing in Fedora, btw (https://fedoraproject.org/wiki/Changes/SystemdSecurityHardening).

Lastly, if downstream intends to apply system-wide seccomp-filters (or other restrictions) which cannot take into account service-individual requirements, then I would very much like to know why they insist on updating the software they run, but refuse to update the filters to the requirements of that software. The industry maintains a huge effort to provide stable releases in RHEL and friends for decades. Lets use it! If you run git-main, then ensure git-main requirements.

Finally, note that we have deviated from that (I grant you) rather philosophical argument many times in the past. But on good grounds based on actual deployments. In this case, in particular, I would want to know what the policy for seccomp-filter updates is, how we can track this, and most importantly how we can annotate this so it does not accidentally break. I want this tested in our CI, before I make any promises.

bus1 / dbus-broker

util/sockopt: add fallback for SELinux denial on SO_PEERPIDFD #343