systemd units in /usr/local/lib/systemd/system don't get loaded

dustymabe commented 4 years ago

Host system details

FCOS @31.20191031.dev.0

[root@coreos ~]# rpm -q rpm-ostree ostree 
rpm-ostree-2019.6-1.fc31.x86_64
ostree-2019.4-3.fc31.x86_64

Expected vs actual behavior

Units under /usr/local/lib/systemd/system don't seem to be working on ostree based systems.

[root@coreos ~]# mkdir -p /usr/local/lib/systemd/system/
[root@coreos ~]# cat <<EOF > /usr/local/lib/systemd/system/example.service
> [Service]
> Type=oneshot
> ExecStart=/usr/bin/false
> [Install]
> WantedBy=multi-user.target
> EOF
[root@coreos ~]# ls -lZ /usr/local/lib/systemd/system/example.service
-rw-r--r--. 1 root root unconfined_u:object_r:var_t:s0 85 Oct 31 17:17 /usr/local/lib/systemd/system/example.service
[root@coreos ~]# chcon -v system_u:object_r:lib_t:s0 /usr/local/lib/systemd/system/example.service
changing security context of '/usr/local/lib/systemd/system/example.service'
[root@coreos ~]# ls -lZ /usr/local/lib/systemd/system/example.service
-rw-r--r--. 1 root root system_u:object_r:lib_t:s0 85 Oct 31 17:17 /usr/local/lib/systemd/system/example.service
[root@coreos ~]# systemctl enable example.service
Created symlink /etc/systemd/system/multi-user.target.wants/example.service → /usr/local/lib/systemd/system/example.service.
[root@coreos ~]# systemctl status example.service
● example.service
   Loaded: loaded (/usr/local/lib/systemd/system/example.service; enabled; vendor preset: disabled)
   Active: inactive (dead)
[root@coreos ~]# systemctl start example.service
Job for example.service failed because the control process exited with error code.
See "systemctl status example.service" and "journalctl -xe" for details.
[root@coreos ~]# systemctl status example.service
● example.service
   Loaded: loaded (/usr/local/lib/systemd/system/example.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2019-10-31 17:18:22 UTC; 1s ago
  Process: 1266 ExecStart=/usr/bin/false (code=exited, status=1/FAILURE)
 Main PID: 1266 (code=exited, status=1/FAILURE)

Oct 31 17:18:22 coreos systemd[1]: Starting example.service...
Oct 31 17:18:22 coreos systemd[1]: example.service: Main process exited, code=exited, status=1/FAILURE
Oct 31 17:18:22 coreos systemd[1]: example.service: Failed with result 'exit-code'.
Oct 31 17:18:22 coreos systemd[1]: Failed to start example.service.

After reboot:

[root@coreos ~]# systemctl status example.service
Unit example.service could not be found.
[root@coreos ~]# systemctl daemon-reload
[root@coreos ~]# systemctl status example.service
● example.service
   Loaded: loaded (/usr/local/lib/systemd/system/example.service; enabled; vendor preset: disabled)
   Active: inactive (dead)

A few things:

The file doesn't get created with the correct SELinux label to begin with
After a reboot the running systemd is unaware of the example.service unit under /usr/local/lib

Steps to reproduce it

See the reproducer above.

Would you like to work on the issue?

Maybe if we can determine the root cause

jlebon commented 4 years ago

The file doesn't get created with the correct SELinux label to begin with

Right yeah, we need another equivalency rule for /var/usrlocal and /usr/local.

After a reboot the running systemd is unaware of the example.service unit under /usr/local/lib

This is likely because /var gets mounted later, and /usr/local is just /var/usrlocal so systemd doesn't see /usr/local at startup time when it's scanning for units. Try systemctl daemon-reload, it'll force it to rescan for units and find it. I guess we could adapt systemd for this. Though... is this something you're installing into? Aren't you layering it? Or if it's manual, why not /etc/systemd?
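A minimal sketch for checking this on a given host, assuming a POSIX shell and a `readlink` that supports `-f` (GNU coreutils does). The helper name `resolves_under` is made up for illustration; it only reports whether one path resolves to a location below another, which is the /usr/local versus /var situation described above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if PATH_TO_CHECK resolves (after following
# symlinks) to PARENT itself or to something below it.
resolves_under() {
    path_to_check=$1
    parent=$2
    resolved=$(readlink -f -- "$path_to_check") || return 2
    parent_resolved=$(readlink -f -- "$parent") || return 2
    case "$resolved/" in
        "$parent_resolved"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: on an ostree-based host, /usr/local is expected to live under /var.
if resolves_under /usr/local /var; then
    echo "/usr/local resolves under /var; units there depend on /var being mounted"
else
    echo "/usr/local does not resolve under /var"
fi
```

If the check succeeds, anything systemd reads from /usr/local is only visible once /var is mounted, which matches the need for a daemon-reload after boot.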

I guess this might become more relevant once we more easily support "rootfs layers" client-side to make it easier to test with locally built software, which could target /usr/local. Although, with layers, you could also just target prefix=/usr anyway since OSTree tracks it all for you.

jlebon commented 4 years ago

Opened https://src.fedoraproject.org/rpms/selinux-policy/pull-request/24 for the SELinux bit.

dustymabe commented 4 years ago

Try systemctl daemon-reload, it'll force it to rescan for units and find it. I guess we could adapt systemd for this.

Yep. Running daemon-reload once the system is up and booted works, but after a reboot the unit won't be started. Also note that if any unit (even one in /etc/systemd/system) uses a non-absolute path and the executable is in /usr/local/bin, then systemd will not enable the unit because it is detected as a bad unit:

/etc/systemd/system/foo@.service:11: Executable "foo" not found in path "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
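To find units that could hit this, a rough sketch like the following may help. It assumes a POSIX shell with `awk`; the function name `find_relative_exec` is hypothetical, and the parsing is deliberately simple (it does not handle backslash line continuations).

```shell
#!/bin/sh
# Hypothetical helper: print "Exec*=" lines in a unit file whose command
# does not start with "/". Leading special prefixes (-, @, :, +, !) are
# skipped before the check. An empty "ExecStart=" (a reset) is ignored.
find_relative_exec() {
    awk '
        /^Exec[A-Za-z]*=/ {
            cmd = $0
            sub(/^Exec[A-Za-z]*=[-@:+!]*/, "", cmd)
            if (cmd != "" && substr(cmd, 1, 1) != "/")
                print FILENAME ": " $0
        }
    ' "$1"
}
```

For example, `find_relative_exec /etc/systemd/system/foo@.service` would list the lines that rely on a path lookup; switching those to absolute paths avoids the load-time lookup described here.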

Though... is this something you're installing into? Aren't you layering it? Or if it's manual, why not /etc/systemd?

I'm using /etc/systemd as a workaround. I wanted to use /usr/local/lib because I'm placing other files there for this particular service and it made sense to put them all under the same structure.
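A sketch of that workaround, for anyone following along: the unit file lives in /etc/systemd/system while the payload stays under /usr/local/lib. The helper name `write_example_unit` and the payload path /usr/local/lib/example/run.sh are made up for illustration, and the sketch assumes that a service with default dependencies is started only after /var is mounted.

```shell
#!/bin/sh
# Hypothetical helper: write example.service into UNIT_DIR.
# On a real host UNIT_DIR would be /etc/systemd/system.
write_example_unit() {
    unit_dir=$1
    mkdir -p "$unit_dir" || return 1
    cat > "$unit_dir/example.service" <<'EOF'
[Unit]
Description=Example service with its payload under /usr/local/lib

[Service]
Type=oneshot
# Absolute path; /usr/local/lib/example/run.sh is a made-up payload location.
ExecStart=/usr/local/lib/example/run.sh

[Install]
WantedBy=multi-user.target
EOF
}
```

On a host this would be used as `write_example_unit /etc/systemd/system`, followed by `systemctl daemon-reload` and `systemctl enable example.service`.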

lucab commented 4 years ago

This is likely because /var gets mounted later

@jlebon was pointing me to the generator that takes care of creating the mount-unit to bind-mount /var. I think that this generator approach has a few corner cases like this (e.g. other generators won't see /usr/local either), which at the root could probably be fixed by having this mount-unit in the initramfs written to /run and scheduled before initrd-fs.target.

cgwalters commented 4 years ago

Bigger picture, we have two main ways to deliver software:

Shipped with the host or layered RPMs
Distinct containers

Installing static-ish binaries in /usr/local is of course supported too.

What you're trying to do here - yes, we should make it work - but I don't think it's a high priority.

If you're integrating with the host, it should be more like package layering. Is there a reason you're not using that?

dustymabe commented 4 years ago

If you're integrating with the host, it should be more like package layering. Is there a reason you're not using that?

I think that is the goal longer term. For right now I was trying to get everything working by delivering files via Ignition and sharing a recipe with others to do the same thing.

dustymabe commented 4 years ago

I've got workarounds for this for now, so it is not a high priority.

jlebon commented 4 years ago

My suggestion is:

1. Recommend against using `storage.files` to create systemd units. Just stick with `systemd.units`.
2. Discuss with upstream about doing the executable path resolution at unit start time, instead of unit load time. E.g. what about a unit that pulls down a binary at boot-time that a later unit needs?

I think that this generator approach has a few corner cases like this (e.g. generators won't see /usr/local as well), which at the root could probably be fixed by having this mount-unit in the initramfs written to /run and scheduled before initrd-fs.target.

Hmm, so we'd mount all the filesystems in the initrd on every boot instead of post-pivot? I'm not sure the benefit here is worth making a fundamental change like this. The interaction with Ignition + OSTree + systemd on first boot would get even more complex than it already is. :)

dustymabe commented 4 years ago

1. Recommend against using `storage.files` to create systemd units. Just stick with `systemd.units`.

Yeah, the reason I ended up here was because of https://github.com/coreos/ignition/issues/586.

lucab commented 4 years ago

Hmm, so we'd mount all the filesystems in the initrd on every boot instead of post-pivot? I'm not sure the benefit here is worth making a fundamental change like this. The interaction with Ignition + OSTree + systemd on first boot would get even more complex than it already is. :)

Not all of them, just the ones that explicitly require mounting before pivot_root (in this case, /var).

For reference, the idea is that for a few mount-points we'd need to mimic the x-initrd.mount semantics. To my knowledge, this is the systemd-way of dealing with the issue we are seeing (i.e. a mount-point with content that systemd/generators/etc needs early on).

I acknowledge it could be an invasive change and too bothersome in the short term, so I guess it's fine to punt.

Aegeontis commented 7 months ago

This is still an issue in Fedora CoreOS

M1cha commented 1 month ago

Yes it's still an issue. Also about:

If you're integrating with the host, it should be more like package layering. Is there a reason you're not using that?

Personally, I have zero interest in that. All of my modifications to the CoreOS system are tracked in a Git repository and rsynced into the host. Putting stuff into /usr/local allows using --delete and thus deleting files removed from Git, because the distro itself doesn't put anything there.

With layering I'd have to use a complex build system to build an rpm package, wait multiple minutes to commit/rebase and reboot the whole system. By simply copying files it takes seconds to apply and doesn't need a reboot.

M1cha commented 1 month ago

For my specific use-case I found a workaround: I can diff /usr/etc/systemd/system against /etc/systemd/system and delete old files using some (hopefully good enough) logic to prevent deleting important files. With that I can now use /etc/, too.
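The exact logic is not shown here, so the following is only a guess at its shape: list files that exist under /etc/systemd/system but are neither shipped in /usr/etc/systemd/system nor present in the Git checkout. The helper name `list_stale_files` and its third argument (the checkout directory) are assumptions. It only prints candidates and leaves deletion to the caller.

```shell
#!/bin/sh
# Hypothetical helper: list regular files that exist in ETC_DIR but are
# neither shipped in DEFAULT_DIR nor present in SOURCE_DIR (the Git checkout).
# Symlinks (e.g. enablement links in *.wants/) are skipped on purpose.
list_stale_files() {
    etc_dir=$1
    default_dir=$2
    source_dir=$3
    (cd "$etc_dir" && find . -type f) | sed 's|^\./||' | sort |
    while IFS= read -r rel; do
        [ -e "$default_dir/$rel" ] && continue   # shipped by the OS image
        [ -e "$source_dir/$rel" ] && continue    # still tracked in Git
        printf '%s\n' "$rel"
    done
}
```

A call such as `list_stale_files /etc/systemd/system /usr/etc/systemd/system ./etc/systemd/system` prints paths relative to the first directory; reviewing that list before removing anything is the "good enough" safety step.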

coreos / rpm-ostree

systemd units in /usr/local/lib/systemd/system don't get loaded #1936