coreos / rpm-ostree

⚛📦 Hybrid image/package system with atomic upgrades and package layering
https://coreos.github.io/rpm-ostree
Other
860 stars 195 forks source link

Update auto-tmpfiles code to skip already specified directories #26

Closed cgwalters closed 10 months ago

cgwalters commented 10 years ago

systemd started shipping tmpfiles.d snippets for a variety of directories; this causes conflict warnings at runtime.

rpm-ostree would need to scan the tmpfiles.d snippets at install time and skip synthesizing entries if they are already specified.

jlebon commented 3 years ago

This is still relevant; I think we've just gotten used to seeing the duplicate warnings heh.

HuijingHei commented 11 months ago

I think you mean the duplicated info in systemd-tmpfiles-setup.

$ journalctl -u systemd-tmpfiles-setup
...
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: /usr/lib/tmpfiles.d/tmp.conf:12: Duplicate line for path "/var/tmp", ignoring.
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: /usr/lib/tmpfiles.d/var.conf:14: Duplicate line for path "/var/log", ignoring.
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: /usr/lib/tmpfiles.d/var.conf:19: Duplicate line for path "/var/cache", ignoring.
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: /usr/lib/tmpfiles.d/var.conf:21: Duplicate line for path "/var/lib", ignoring.
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: /usr/lib/tmpfiles.d/var.conf:23: Duplicate line for path "/var/spool", ignoring.
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: "/home" already exists and is not a directory.
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: "/srv" already exists and is not a directory.
Oct 24 06:58:39 localhost systemd-tmpfiles[1481]: "/root" already exists and is not a directory.
chrisawi commented 11 months ago

This seems to be preventing cleanup of /var/tmp by systemd-tmpfiles-clean.service on F38/F39 Silverblue.

pkg-filesystem.conf contains

d /var/tmp 1777 root root - -

whereas systemd's tmp.conf is

q /var/tmp 1777 root root 30d
cgwalters commented 11 months ago

Thanks, that is a pretty important bug.

jlebon commented 11 months ago

rpm-ostree would need to scan the tmpfiles.d snippets at install time and skip synthesizing entries if they are already specified.

@HuijingHei brought up internally that this is not as simple because the auto-conversion we do at package import time may be duplicating a tmpfiles.d dropin that lives in a separate package. This is in fact the case for all the warnings we currently see in FCOS at least:

[root@cosa-devsh tmpfiles.d]# journalctl --grep 'Duplicate line'
Nov 16 19:58:12 localhost systemd-tmpfiles[1619]: /usr/lib/tmpfiles.d/tmp.conf:12: Duplicate line for path "/var/tmp", ignoring.
Nov 16 19:58:12 localhost systemd-tmpfiles[1619]: /usr/lib/tmpfiles.d/var.conf:14: Duplicate line for path "/var/log", ignoring.
Nov 16 19:58:12 localhost systemd-tmpfiles[1619]: /usr/lib/tmpfiles.d/var.conf:19: Duplicate line for path "/var/cache", ignoring.
Nov 16 19:58:12 localhost systemd-tmpfiles[1619]: /usr/lib/tmpfiles.d/var.conf:21: Duplicate line for path "/var/lib", ignoring.
Nov 16 19:58:12 localhost systemd-tmpfiles[1619]: /usr/lib/tmpfiles.d/var.conf:23: Duplicate line for path "/var/spool", ignoring.

Both tmp.conf and var.conf come from systemd, but the auto-generation we do comes from filesystem.

Hmm, one way to fix https://github.com/coreos/rpm-ostree/issues/26#issuecomment-1799538899 is to prefix our auto-generated dropins with e.g. zzzz- to ensure they're ordered last. That wouldn't fix the warnings, but at least the right entry would be given priority.

Fixing the warnings I think would require doing this in a postprocessing pass after all the packages are installed where we scan and collect all the non-generated entries, then hardlink-break & edit the auto-generated dropins to filter out duplicate entries.

jlebon commented 11 months ago

One interesting note while looking at this: systemd-tmpfiles warns only when there are incompatible duplicate lines. Otherwise, it doesn't mind them. (https://github.com/systemd/systemd/blob/905dd9d6e644fbcf12185cd45bbd05e08ac0e926/src/tmpfiles/tmpfiles.c#L3903-L3907 -> https://github.com/systemd/systemd/blob/905dd9d6e644fbcf12185cd45bbd05e08ac0e926/src/tmpfiles/tmpfiles.c#L3413-L3414).

We do have instances of generated tmpfiles.d dropins that duplicate dropins within the same package, but they're just silently ignored. E.g.:

[root@cosa-devsh tmpfiles.d]# cat mdadm.conf
d /run/mdadm 0710 root root -
[root@cosa-devsh tmpfiles.d]# cat pkg-mdadm.conf
d /run/mdadm 0710 root root - -
jlebon commented 11 months ago

Fixing the warnings I think would require doing this in a postprocessing pass after all the packages are installed where we scan and collect all the non-generated entries, then hardlink-break & edit the auto-generated dropins to filter out duplicate entries.

Or something like:

  1. at assembly time, skip checkout of auto-generated dropins and instead collect the entries in them
  2. once assembled, collect all the entries from the non-auto-generated dropins in the rootfs
  3. write a single e.g. rpm-ostree-autoconverted.conf with all the non-duplicate entries

Also awkward, but more elegant than post-editing checked out files I think.

HuijingHei commented 11 months ago

Fixing the warnings I think would require doing this in a postprocessing pass after all the packages are installed where we scan and collect all the non-generated entries, then hardlink-break & edit the auto-generated dropins to filter out duplicate entries.

Or something like:

  1. at assembly time, skip checkout of auto-generated dropins and instead collect the entries in them
  2. once assembled, collect all the entries from the non-auto-generated dropins in the rootfs
  3. write a single e.g. rpm-ostree-autoconverted.conf with all the non-duplicate entries

Also awkward, but more elegant than post-editing checked out files I think.

Thanks @jlebon for the pointer, will look at it.

jlebon commented 11 months ago

Something a bit more concrete/detailed:

  1. in the importer code, change the filename to our auto-generated tmpfiles.d entries to e.g. rpm-ostree-autovar-$pkg.conf
  2. in checkout_filter() in rpmostree-core.cxx, do something like:
    • if path is under /usr/lib/tmpfiles.d/:
      • read in all the entries in the config (we don't need a full parser here, just enough to pick up the path field)
      • if path has prefix 'rpm-ostree-autovar-`, add those entries to hash table A and skip checkout
      • otherwise, add the entries to hash table B and allow checkout
  3. in rpmostree_context_assemble(), after all the packages have been checked out, call a function that creates a new rpm-ostree-2-pkg-autovar.conf containing all the entries in hash table A that are not in B
HuijingHei commented 11 months ago

Thanks @jlebon for the guidance.

For the auto-generated tmpfiles.d entries, we are using pkg-$pkg.conf, maybe we can still use it?

The tricky thing here is checkout_filter() will be triggered if there is files_skip || files_remove_regex, else checkout_filter() will be skipped. For example, when checkout package filesystem, the files_skip is NULL, we can not trigger checkout_filter(), so can not skip /usr/lib/tmpfiles.d/pkg-filesystem.conf.

jlebon commented 11 months ago

Had a real-time chat with @HuijingHei about this. One thing we realized is that in the client-side case, the base checkout doesn't go through filtering. And adding a checkout filter to the base just for this feels like overhead. Now, we don't need to have this working client-side too, but if we can have it work both server and client-side for the same amount of work, that seems worth it.

So the suggestion I made is to do it all in step 3 essentially. I.e. from rpmostree_context_assemble(), we call a function like rpmostreecxx::deduplicate_tmpfiles_entries(tmprootfs_dfd), which scans the dropins, sorts them into two HashMaps and then unlink all the pkg-* files and create a single pkg-rpm-ostree-autovar.conf (re-using the same prefix ensures that this same logic works client-side).

For the auto-generated tmpfiles.d entries, we are using pkg-$pkg.conf, maybe we can still use it?

Yeah, I'm OK with that. The reason I suggested changing the prefix is so that it's more obviously rpm-ostree-owned. In practice, at least in FCOS right now, no package ships a dropin in the form pkg-....

HuijingHei commented 10 months ago

Thanks @jlebon for the pointer, this is really helpful to me.

If put rpmostreecxx::deduplicate_tmpfiles_entries in rpmostree_context_assemble(), there is one duplicated entry /var/lib both in rpm-ostree-0-integration.conf and rpm-ostree-1-autovar.conf that are added in postprocess_final after rpmostree_context_assemble, which we can not remove.

But indeed this can fix https://github.com/coreos/rpm-ostree/issues/26#issuecomment-1799538899.

$ grep "/var/lib " *
rpm-ostree-0-integration.conf:d /var/lib 0755 root root -
rpm-ostree-1-autovar.conf:d /var/lib 0755 root root - -
var.conf:d /var/lib 0755 - - -

$ journalctl --grep 'Duplicate line'
Nov 24 12:59:02 localhost systemd-tmpfiles[1591]: /usr/lib/tmpfiles.d/var.conf:21: Duplicate line for path "/var/lib", ignoring.

If want to also remove the duplicated entry /var/lib, should put rpmostreecxx::deduplicate_tmpfiles_entries to postprocess_final after adding rpm-ostree-1-autovar.conf and rpm-ostree-0-integration.conf, also need to add prefix pkg- to both the conf names. Finally we only get a single pkg-rpm-ostree-autovar.conf (without duplication) which contains all auto-generated tmpfiles.d entries, also rpm-ostree-1-autovar.conf and rpm-ostree-0-integration.conf.

WDYT?

cgwalters commented 10 months ago

Longer term, perhaps we should back away from the unified-core style auto-translating to tmpfiles.d snippets and just do a general postprocessing version.

The problem is we are going to need a general postprocessing version for container use cases, so I'm thinking it may be simpler in the end to just have one path.

And when we do that, it gets a bit simpler to avoid duplication.

jlebon commented 10 months ago
$ grep "/var/lib " *
rpm-ostree-0-integration.conf:d /var/lib 0755 root root -
rpm-ostree-1-autovar.conf:d /var/lib 0755 root root - -
var.conf:d /var/lib 0755 - - -

Yeah, I think we should just nuke the /var/lib entries from both of those files.

If want to also remove the duplicated entry /var/lib, should put rpmostreecxx::deduplicate_tmpfiles_entries to postprocess_final after adding rpm-ostree-1-autovar.conf and rpm-ostree-0-integration.conf, also need to add prefix pkg- to both the conf names.

rpm-ostree-0-integration.conf contains legitimate OSTree convention symlinks that should be owned by us. So it's not clear to me that we should also defer to other packages for that (which is what would happen if we prefix with pkg-). I think I'd rather leave it and let warnings happen so we notice if that happens. But yeah, as mentioned above, the /var/lib entry should be removed at this point so we rely on systemd's config.

For rpm-ostree-1-autovar.conf, I'm 80% sure we could remove that file in unified mode as per https://github.com/coreos/rpm-ostree/blob/0a573f5195ec24ea469ada77f2a91765ecffec18/rust/src/composepost.rs#L700-L709 but for now at least, it's safe to make it a pkg- prefix.

jlebon commented 10 months ago

Longer term, perhaps we should back away from the unified-core style auto-translating to tmpfiles.d snippets and just do a general postprocessing version.

The problem is we are going to need a general postprocessing version for container use cases, so I'm thinking it may be simpler in the end to just have one path.

Hmm, though in the container use case, we're not going through the importer at all, no? So will we have an issue there? Even so, I think https://github.com/coreos/rpm-ostree/pull/4697 is quite general; it just takes a rootfs dfd and does the conversion.

Edit: ahh, are you thinking about the case where you layer a package that ships a tmpfiles.d dropin that has entries that conflict with an auto-generated one from the base compose?

HuijingHei commented 10 months ago

What I should do is in postprocess_final, call deduplicate_tmpfiles_entries to get a single pkg-rpm-ostree-autovar.conf which contains all auto-generated tmpfiles.d entries, which also includes all entries in rpm-ostree-1-autovar.conf and rpm-ostree-0-integration.conf. Is this what you meant?

jlebon commented 10 months ago

For rpm-ostree-0-integration.conf I think we can just do

diff --git a/src/app/rpm-ostree-0-integration.conf b/src/app/rpm-ostree-0-integration.conf
index c5fd009a..1e9822ef 100644
--- a/src/app/rpm-ostree-0-integration.conf
+++ b/src/app/rpm-ostree-0-integration.conf
@@ -14,5 +14,4 @@ d /var/usrlocal/share 0755 root root -
 d /var/usrlocal/src 0755 root root -
 d /var/mnt 0755 root root -
 d /run/media 0755 root root -
-d /var/lib 0755 root root -
 L /var/lib/rpm - - - - ../../usr/share/rpm

For rpm-ostree-1-autovar.conf... OK right, looking at what remains in it now in unified core mode, we have:

L /var/lib/alternatives - - - - ../../usr/lib/alternatives
L /var/lib/rpm - - - - ../../usr/share/rpm
L /var/lib/vagrant - - - - ../../usr/lib/vagrant
d /var/lib 0755 root root - -

Almost all of those duplicate entries in other more authoritative tmpfiles dropins. The only one that remains is /var/lib/vagrant which... honestly, at this point we should just try to get the package fixed if it's still relevant so I'd vote to just stop carrying this ourselves.

So I guess I'm leaning more now towards just pulling the trigger on not calling convert_var_to_tmpfiles_d() or generating rpm-ostree-1-autovar.conf at all when in unified core mode.

And then once we do that, I think we can just leave the call to deduplicate_tmpfiles_entries() where you have it currently for now in the core.

HuijingHei commented 10 months ago

So I guess I'm leaning more now towards just pulling the trigger on not calling convert_var_to_tmpfiles_d() or generating rpm-ostree-1-autovar.conf at all when in unified core mode.

Thanks @jlebon , tried this but get error like:

   installroot: + ostree --repo=/var/tmp/test.JHW2Mm/repo ls fedora/x86_64/coreos/testing-devel /var/lib/rpm
   installroot: + echo found /var/lib/rpm in commit

   basic: Committing...done
   basic: error: Writing commit: While writing rootfs to mtree: Failed to look up SELinux label for '/var/lib/nfs/rpc_pipefs'

Add convert_var_to_tmpfiles_d() and append the files in delete some files list to https://github.com/coreos/rpm-ostree/blob/ae2002cf009d51cf4c551c4f469dfc0c7e6052bd/rust/src/composepost.rs#L729-L736

        // dup in pkg-rpm-ostree-autovar.conf
        "var/lib/alternatives",
        // dup in rpm-ostree-0-integration.conf
        "var/lib/rpm",
        "var/lib/vagrant",

These files would be deleted before converting path, so the entries will not be written in rpm-ostree-1-autovar.conf, it might be risky to remove var/lib directly.

But can still get duplicated /var/lib:

$ cat rpm-ostree-1-autovar.conf
d /var/lib 0755 root root - -
HuijingHei commented 10 months ago

Think more about this, one concern is how to skip trigger deduplicate_tmpfiles_entries() if there is nothing changes under /usr/lib/tmpfiles.d/, and it is difficult to remove the related tmpfiles entries if we run rpm-ostree override remove the package.

jlebon commented 10 months ago

Ahhh yes, good point. convert_var_to_tmpfiles_d() doesn't only create tmpfiles.d entries but also empties out /var as it goes. So I think what I'm saying is that in unified core mode, we should be able to just directly delete everything in /var. E.g. something like

diff --git a/src/libpriv/rpmostree-postprocess.cxx b/src/libpriv/rpmostree-postprocess.cxx
index 0b0b33d4..ed450ddb 100644
--- a/src/libpriv/rpmostree-postprocess.cxx
+++ b/src/libpriv/rpmostree-postprocess.cxx
@@ -425,7 +425,20 @@ postprocess_final (int rootfs_dfd, rpmostreecxx::Treefile &treefile, gboolean un

   ROSCXX_TRY (rootfs_prepare_links (rootfs_dfd), error);

-  ROSCXX_TRY (convert_var_to_tmpfiles_d (rootfs_dfd, *cancellable), error);
+  if (!unified_core_mode)
+    ROSCXX_TRY (convert_var_to_tmpfiles_d (rootfs_dfd, *cancellable), error);
+  else
+    {
+      /* In unified core mode, /var entries are converted to tmpfiles.d at
+       * import time and scriptlets are prevented from writing to /var. What
+       * remains is just the compat symlinks that we created ourselves, which we
+       * should stop writing since it duplicates other tmpfiles.d entries. */
+      if (!glnx_shutil_rm_rf_at (rootfs_dfd, "var", cancellable, error))
+        return FALSE;
+      /* but we still want the mount point as part of the OSTree commit */
+      if (!mkdirat (rootfs_dfd, "var", 0755))
+        return glnx_throw_errno_prefix (error, "mkdirat(var)");
+    }

   if (!rpmostree_rootfs_postprocess_common (rootfs_dfd, cancellable, error))
     return FALSE;

But let me think through it a bit more tomorrow and verify that this is safe to do.

Think more about this, one concern is how to skip trigger deduplicate_tmpfiles_entries() if there is nothing changes under /usr/lib/tmpfiles.d/, and it is difficult to remove the related tmpfiles entries if we run rpm-ostree override remove the package.

Yes, good insight. I think that means the "prefix our output with pkg- so it gets read client-side" trick isn't quite right. Probably the way we should look at this is that rpm-ostree-autovar.conf is a derived result and we need to make sure we can re-derive it from scratch client-side if e.g. the user removes or replaces a package.

So maybe we can tweak it like this:

HuijingHei commented 10 months ago

Thanks @jlebon for the pointer, will try this.

jlebon commented 10 months ago

But let me think through it a bit more tomorrow and verify that this is safe to do.

So yeah, I do think this is safe to do. For reference, I verified /var is what we'd expect at that point in the compose:

# ls -lR var
var:
total 0
drwxr-xr-x. 1 root root 44 Nov 30 11:22 lib

var/lib:
total 12
lrwxrwxrwx. 1 root root 26 Nov 30 11:12 alternatives -> ../../usr/lib/alternatives
lrwxrwxrwx. 1 root root 19 Nov 30 11:22 rpm -> ../../usr/share/rpm
lrwxrwxrwx. 1 root root 21 Nov 30 11:12 vagrant -> ../../usr/lib/vagrant

So nothing surprising there; it basically matches the tmpfiles.d entries we derived into rpm-ostree-1-autovar.conf.

Note though that we're still creating those symlinks at assembly time so that they're part of the scriptlet environments. We're just not converting them to yet another tmpfiles.d dropin.

Also, the importer currently has code to automatically generate an entry for the /var/lib/vagrant symlink when importing vagrant. (It's the same logic that generates an entry for /var/lib/alternatives when composing server-side.) I've verified layering vagrant client-side still works and the symlink shows up.

I think one thing we could do in the future is run systemd-tmpfiles to populate /var during assembly so the tmpfiles.d dropins are canonical even for scriptlet environments. That should allow us to drop rootfs_prepare_links() entirely.