systemd / systemd

The systemd System and Service Manager
https://systemd.io
GNU General Public License v2.0
13.2k stars 3.77k forks source link

when determining whether a unit file is enabled, ignore .wants/ symlinks pointing to invalid unit file paths, such as /dev/null #23341

Open twojstaryzdomu opened 2 years ago

twojstaryzdomu commented 2 years ago

systemd version the issue has been seen with

250.4 (Debian), 246 (openSUSE)

Used distribution

Debian, openSUSE

Linux kernel version used (uname -a)

5.15.38 (Debian) & 5.3.18 (openSUSE), issue seems to be independent of kernel release

CPU architecture issue was seen on

armv6l & x86_64, issue seems to be platform-agnostic

Expected behaviour you didn't see

systemctl status examining the link points to a file that does not contain any service definition

Unexpected behaviour you saw

systemd reports misleading service enabled with symlink pointing to /dev/null

Steps to reproduce the problem Create/remove a symlink pointing to /dev/null under /etc/systemd/system/multi-user.target.wants to black out an existing service. Observe its impact on systemctl status. Reload-daemon after removing/creating the symlink. The blacked-out service does not start as intended, this seems to be a service status reporting problem only.

Additional program output to the terminal or log subsystem illustrating the issue

# rm -rf /{etc,lib}/systemd/system/multi-user.target.wants/test.service /lib/systemd/system/test.service 
# systemctl status test
Unit test.service could not be found.
# cat > /lib/systemd/system/test.service << EOL
[Service]
ExecStart=true
[Install]
WantedBy=multi-user.target
EOL
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
# ln -sf /dev/null /etc/systemd/system/multi-user.target.wants/test.service
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; enabled; vendor preset: enabled)
     Active: inactive (dead)
poettering commented 2 years ago

the .wants/ symlinks are about adding deps, they do not affect the definition of the unit as a whole. Hence this works as intended. If you want to mask a unit, mask the unit file itself, i.e. in /etc/systemd/system/ instead of a .wants/ subdir below it.

Closing, since this works as intended really.

twojstaryzdomu commented 2 years ago

the .wants/ symlinks are about adding deps, they do not affect the definition of the unit as a whole. Hence this works as intended.

I beg to differ, so you propose to break ("mask the unit file itself") a service altogether. Isn't that throwing the baby out with the bathwater?

What if you need to use the service, just not under that specific target/.wants directory?

The use case is blackmasking a unit under one target and no other. There are plenty of distro provisioned dependencies under (/lib/systemd/...) that you may want to disable selectively per target basis.

/dev/null links are quite common to manage such instances. I'd expect systemd not to present misleading status.

Hence, this bug is still present.

mbiebl commented 2 years ago

Right. While it's not clear what your use case is (trying to add /dev/null symlink in a wants directory appears strange), systemctl should not report misleading information, i.e. showing it as enabled.

poettering commented 2 years ago

What if you need to use the service, just not under that specific target/.wants directory?

i cannot parse this. What is this supposed to mean?

You want to cancel individual deps? We have no mechanism for that. And its hard to do, since deps can be created in both directions and via various means.

mbiebl commented 2 years ago

Just to understand your use case better: You want a service that has been enabled statically via /lib/systemd/system/multi-user.targets.wants/foo.service disable via a /etc/systemd/system/multi-user.targets.wants/foo.service, to prevent it from being started "during boot"? But you don't want to mask it completely to be able to start the service manually?

poettering commented 2 years ago

4830 is the issue where we track the request to be able to mask indivudual deps. If this is what is requested here, we can close this one.

twojstaryzdomu commented 2 years ago

Just to better understand your use case better: You want a service that has been enabled statically via /lib/systemd/system/multi-user.targets.wants/foo.service disable via a /etc/systemd/system/multi-user.targets.wants/foo.service, to prevent it from being started "during boot"? But you don't want to mask it completely to be able to start the service manually?

Precisely. Suppose /lib/systemd is read-only and the only means to interact with its contents is via blackmasking under /etc/systemd. As systemd is building its fragments first from the contents of /lib/systemd, then the contents of /etc/systemd, systemd upon finding a service link pointing at /dev/null considers a service disabled under a target. It continues to operate under other targets, where the link is pointing at a service file. It's as simple as that.

mbiebl commented 2 years ago

I think there are two issues here:

twojstaryzdomu commented 2 years ago
  • Not sure what the current status of such a /dev/null symlink would mean, a misconfiguration probably. But certainly not enabled. disabled would probably be a better representation.

Indeed, considering the presence of a link as a sure fire sign a service is at all enabled, without looking at its contents, or where it points to, is dodgy at best.

mbiebl commented 2 years ago
  • Not sure what the current status of such a /dev/null symlink would mean, a misconfiguration probably. But certainly not enabled. disabled would probably be a better representation.

Indeed, considering the presence of a link as a sure fire sign a service is at all enabled, without looking at its contents, or where it points to, is dodgy at best.

If we consider such a /dev/null symlink in a .wants directory a misconfiguration, systemd should ignore it and log an error I think.

twojstaryzdomu commented 2 years ago

If we consider such a /dev/null symlink in a .wants directory a misconfiguration, systemd should ignore it and log an error I think.

Why error? It is an empty file after all. It may be a result of informed user activity. It should be no different from the case when an empty file (not a link) is encountered. Logging may be an option but there may be cases when it's intended (as in the case of the read-only /usr/systemd).

To consider as service fully enabled, systemd needs to do due dilligence, tantamount to the find statements:

# touch a
# ln -s a b
# find -H a -samefile b
a
# find -H b -samefile a
b
# ln -s /dev/null c
# find -H c -samefile a
# find -H c -samefile b

And cascade those checks to /etc, to ensure a user black out hasn't been put in place.

Isn't this a matter of an additional IS_LNK check on each fragment file, a readlink(), coupled with an strcmp() on the symlink and the target (but mind technically nothing stops you from naming a systemd service symlink to something else than its target AND the target doesn't have to be /dev/null, it may be /dev/zero)?

My point being without any due dilligence checks, a broken link (not necessarily one created with a informed purpose like in my use case) will falsely trigger a service enabled flag.

twojstaryzdomu commented 2 years ago

What if you need to use the service, just not under that specific target/.wants directory?

i cannot parse this. What is this supposed to mean?

You want to cancel individual deps? We have no mechanism for that. And its hard to do, since deps can be created in both directions and via various means.

Configuration in systemd is built up from fragments. Now, if fragments is permitted, there has to be some hierarchy, or an order,in which these get parsed. There has to be some handling for symlinks beyond e.g. a symlink for service X exists and we don't look at what it's pointing at at all.

For this issue specifically to be resolved, the premise is systemd needs to resolve the symlink and ensure it isn't a bad link /dev/null. An empty placeholder that has been put into place doesn't appear to be enabling a service. See below:

# cat > /lib/systemd/system/test.service << EOL
[Service]
ExecStart=true
[Install]
WantedBy=multi-user.target
EOL
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
# ln -sf /dev/null /etc/systemd/system/multi-user.target.wants/test.service
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

An empty file in place of a dependency link doesn't enable a service:

# rm /etc/systemd/system/multi-user.target.wants/test.service
# touch /etc/systemd/system/multi-user.target.wants/test.service
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; disabled; vendor preset: enabled)
     Active: inactive (dead)

A bad service file in place of a dependency link doesn't enabled a service:

# rm /etc/systemd/system/multi-user.target.wants/test.service
# echo test > /etc/systemd/system/multi-user.target.wants/test.service
# cat /etc/systemd/system/multi-user.target.wants/test.service
test
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; disabled; vendor preset: enabled)
     Active: inactive (dead)

Linking against a disabled service is triggering enabled status:

# rm /etc/systemd/system/multi-user.target.wants/test.service
# systemctl status ssh
○ ssh.service - OpenBSD Secure Shell server
     Loaded: loaded (/lib/systemd/system/ssh.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:sshd(8)
             man:sshd_config(5)
# ln -s /lib/systemd/system/ssh.service /etc/systemd/system/multi-user.target.wants/test.service
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

A non-empty symlink pointing at a file is triggering enabled status:

# rm /etc/systemd/system/multi-user.target.wants/test.service
# ln -s /etc/passwd /etc/systemd/system/multi-user.target.wants/test.service
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

Heck, even a symlink pointing at a directory is triggering enabled status:

# rm /etc/systemd/system/multi-user.target.wants/test.service
# ln -s /tmp /etc/systemd/system/multi-user.target.wants/test.service
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

The systemd version is 251.2.

All these lead me to conclude symlinks aren't being examined, leading to possible issues when bad links (or merely pointing at an empty file i.e. /dev/null) are encountered.

As for #4830 being a duplicate, this issue is tangential, limited to symlinks specifically and as such smaller in scope. I'd be reluctant to close it. Including the problem under #4830 would amount to sweeping it under the carpet as #4830 has been outstanding for almost 7 years.

poettering commented 2 years ago

Why error? It is an empty file after all. It may be a result of informed user activity. It should be no different from the case when an empty file (not a link) is encountered. Logging may be an option but there may be cases when it's intended (as in the case of the read-only /usr/systemd).

We do not load units through those symlinks, but we always load them from the main unit dirs, never the .wants/ subdirs. Or in other words, we use only the basename of the path stored in the symlinks themselves, and never open a unit file through the symlink. It's the only thing that makes sense really, because units typically have many deps that reference them, and if we loaded them through those symlinks then it would be undeterministic which definition we'd end up with...

poettering commented 2 years ago
  • The misleading status output. systemctl shows the service as enabled when actually it isn't Not sure what the current status of such a /dev/null symlink would mean, a misconfiguration probably. But certainly not enabled. disabled would probably be a better representation.

The man pages already say that symlinks in .wants/ have to point to validly named unit files. /dev/null does not qualify. It's unspecified what it means if they point to random nonesense, but I am not sure this is a problem, if you do stuff that is clearly against what is documented.

That said, I#d be finde with just ignoring symlinks entirely that point to invalid stuff when determining whether a unit is enabled or no.

poettering commented 2 years ago

/cc @keszybz

twojstaryzdomu commented 2 years ago

Why error? It is an empty file after all. It may be a result of informed user activity. It should be no different from the case when an empty file (not a link) is encountered. Logging may be an option but there may be cases when it's intended (as in the case of the read-only /usr/systemd).

We do not load units through those symlinks, but we always load them from the main unit dirs, never the .wants/ subdirs. Or in other words, we use only the basename of the path stored in the symlinks themselves, and never open a unit file through the symlink. It's the only thing that makes sense really, because units typically have many deps that reference them, and if we loaded them through those symlinks then it would be undeterministic which definition we'd end up with...

Right, I have figured out that much from the tests run above. But it seems it isn't only the name filename that is considered; it specifically has to be a symlink. The example with an empty file in place of a dependency link isn't enabling the service. So there is some logic in the code to handle symlinks, the problem is it isn't exhaustive enough to provide safeguards against symlinks that are: 1) Unintentionally broken. 2) Purposefully serving as service dependency black out (e.g. read-only /lib/systemd case) by pointing at an empty file (or /dev/null). 3) Purposefully pointing at another service.

I'd propose to forget about the symlink filename altogether and start looking at the symlink target to begin with. Would that cause a dependency hell? Not immediately. That would be a start to solving all 3 cases however.

The man pages already say that symlinks in .wants/ have to point to validly named unit files. /dev/null does not qualify

Indeed, updating the man page would be picking low hanging fruit, a way to disclaim responsibility for a lack of due diligence checks in the code in the event of a failure. It's an option but not very proactive.

That said, I#d be finde with just ignoring symlinks entirely that point to invalid stuff when determining whether a unit is enabled or no.

That is some strategy, but it also leaves open the above 3 problems as a side-effect.

poettering commented 2 years ago

Right, I have figured out that much from the tests run above. But it seems it isn't only the name filename that is considered; it specifically has to be a symlink.

Yes it has to be a symlink. And we'll read the string that the symlink stores and use that.

We use that because it allows you to instantiate stuff, ie. have a symlink from foo@quux.service to foo@.service.

poettering commented 2 years ago

2. Purposefully serving as service dependency black out (e.g. read-only /lib/systemd case) by pointing at an empty file (or /dev/null).

As mentioned this has been requested before, we do not support this right now. I'd be OK with adding a masking concept for deps, but it's not that easy to implement as you might think, because such masking should probably also override deps synthesized other ways then just .wants/ symlinks. Anyway, for that discussion → #4830

I'd propose to forget about the symlink filename altogether and start looking at the symlink target to begin with.

We look at symlink name and target. As I said.

twojstaryzdomu commented 2 years ago

We use that because it allows you to instantiate stuff, ie. have a symlink from foo@quux.service to foo@.service.

The above notwithstanding, systemd ought to ensure foo@instance.service is pointing at foo@.service, or report an error (misconfigured? blacked out?) and give up including that dependency.

We look at symlink name and target. As I said.

Right but looking at the symlink clearly isn't enough, otherwise why do I end up with a directory symlink enabling a service?

# rm /etc/systemd/system/multi-user.target.wants/test.service
# ln -s /tmp /etc/systemd/system/multi-user.target.wants/test.service
# systemctl daemon-reload
# systemctl status test
○ test.service
     Loaded: loaded (/lib/systemd/system/test.service; enabled; vendor preset: enabled)
     Active: inactive (dead)

Systemd needs to cover more ground than merely a readlink() for its own sake.

The problem is when the facility to control dependencies lives in the file system, there will always be users who maintain their files in ways a man page does not cover. It is a fact, users wish to employ the facilities like symlinks they have at their disposal. Admittedly, it is very handy to be controlling dependencies thus; I wouldn't unnecessarily take that from the users but merely add a few conditional statements around it and perhaps an additional misconfigured or blacked out status messages. If it requires a dependency logic overhaul, that is another matter.

In short, if any dependency link broken or misconfigured, I'd propose not include it at all under that target instance, erring on the side of caution.