coreos / rpm-ostree

⚛📦 Hybrid image/package system with atomic upgrades and package layering
https://coreos.github.io/rpm-ostree

unified core 🌐 migration #729

Open cgwalters opened 7 years ago

cgwalters commented 7 years ago

Update: https://github.com/projectatomic/rpm-ostree/pull/940 has been merged, but more work remains.


When rpm-ostree was first created, the "build process" was basically "yum install --installroot + ostree commit". This was very simplistic obviously, but worked as a minimum viable product.

Since then, the project has grown much, much more sophisticated. In particular, "client side" package layering when it first landed worked in a different way where packages are imported into OSTree commits. Basically since then, rpm-ostree has grown a significant re-implementation of major parts of librpm. For example, we isolate scripts in containers.

The default for rpm-ostree compose tree is still the "old" path, and we'd like to drop it.

Unified core does change some corner cases in the system's behavior, so out of conservatism it's still not enabled by default. We would like to eventually make it the default and drop the old path.

Porting

The simple answer is add --unified-core and it may just work.

However, if you try this in e.g. a bare mock root, note that selinux-policy is now a requirement, and so is /dev/fuse. You also really want to use --cachedir.

For best practices around this, the coreos-assembler sources are useful.
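As a rough sketch of the porting advice above, a small wrapper could check for /dev/fuse and assemble the command line before anything is run. The manifest name and the directory paths below are placeholders and do not come from this issue; the only flags used (--unified-core, --cachedir, --repo) are ones mentioned in this thread.

```shell
#!/bin/sh
# Sketch: assemble a unified-core compose command line.
# All paths and the manifest name are hypothetical placeholders.

# Print the compose command for a given cache dir, repo and manifest.
compose_cmd() {
    cachedir=$1
    repo=$2
    manifest=$3
    printf 'rpm-ostree compose tree --unified-core --cachedir=%s --repo=%s %s\n' \
        "$cachedir" "$repo" "$manifest"
}

# Return 0 if the given path exists (defaults to /dev/fuse).
have_fuse() {
    [ -e "${1:-/dev/fuse}" ]
}

if have_fuse; then
    # Print instead of executing, so the command can be reviewed first.
    compose_cmd /srv/compose/cache /srv/compose/repo manifest.yaml
else
    echo "warning: /dev/fuse is missing; a unified-core compose needs it" >&2
fi
```

The command is only printed here; whether selinux-policy is part of the tree still has to be checked in the manifest itself.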

See: https://github.com/coreos/rpm-ostree/pull/1793

cgwalters commented 6 years ago

Keeping this issue open for things to do to move --ex-unified-core to --unified-core, and then probably some time later flip the defaults so we have rpm-ostree compose tree --compat-libdnf.

jlebon commented 5 years ago

> and then probably some time later flip the defaults so we have rpm-ostree compose tree --compat-libdnf.

Yeah, I think we're almost ready for that, right? Definitely still some minor issues to figure out (e.g. the locale issue we're seeing in RHCOS), but overall it's been pretty solid so far. Need to figure out how this intersects with Silverblue/FAH/IoT as well (have you tried composing a Silverblue tree in unified mode yet?)

martinpitt commented 4 years ago

I noticed the deprecation message when building my own ostree, and tried to add --unified-core. However, this fails with

error: importing RPMs: Importing package 'librados2': Importing archive: Writing content object: min-free-space-percent '3%' would be exceeded, at least 10,9 MB requested

This doesn't happen with the default. This isn't mentioned here or in the linked #1793

cgwalters commented 4 years ago

What's the backing filesystem like for the repository? Is it pretty low on free space or not? Could it have been transiently full?

martinpitt commented 4 years ago

@cgwalters: I use a dynamic tmpfs, as (1) I'm not interested in permanently keeping it around, and (2) writing to disk is too slow:

tmpfs                    7,8G  4,0K  7,8G   1% /var/tmp/repo

My entire tree is < 2G, so far this has always worked fine.

I tried this again without a tmpfs on /var/home/repo, and that's bigger:

/dev/dm-2                189G  124G   56G  69% /var/home

but I get exactly the same problem, except on "kernel-core" instead of "librados2"; but I got it on different packages with tmpfs as well, there seems to be some randomness there.

cgwalters commented 4 years ago

Ah, I think I see the problem. With --unified-core, unless you specify --cachedir, we create one under /var/tmp. And that's probably on a limited space filesystem? (I personally strongly advocate using a mount point for /var btw and not just /var/home on ostree systems)

So does it work to specify --cachedir=/var/tmp/repo/cache or so?

BTW at a much higher level...rpm-ostree today is not at all optimized for this "compose repo locally" flow - but if you use unified core, it can be much better because the packages are cached in the ostree repo itself, so you only download changed packages.

The approach most in line with rpm-ostree today, though, is to use a separate server for building, or to switch to using a pre-shipped system such as FCOS/FSB as a base and layering on the things you want.

EDIT: I think we should warn when we're auto-creating a cachedir in --unified-core mode, will do a PR.
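Since a full cache filesystem is what produces the min-free-space-percent error discussed above, a quick pre-flight check of the directory that will hold the cache can help. This is a minimal sketch using POSIX df; the CACHEDIR variable and the 1 GiB threshold are arbitrary choices for illustration, not anything rpm-ostree defines.

```shell
#!/bin/sh
# Sketch: report free space on the filesystem backing a directory.

# Print available space in KiB for the filesystem containing $1.
avail_kb() {
    # -P forces the portable one-line-per-filesystem format,
    # -k reports sizes in 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 if at least $2 KiB are free under $1.
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}

# /var/tmp is where the auto-created cachedir lives, per the comment above.
# CACHEDIR is a hypothetical override for this sketch.
dir=${CACHEDIR:-/var/tmp}
if [ ! -d "$dir" ]; then
    echo "skip: $dir does not exist"
elif has_space "$dir" 1048576; then   # 1 GiB, an arbitrary example threshold
    echo "ok: $dir has $(avail_kb "$dir") KiB free"
else
    echo "warning: $dir is low on space" >&2
fi
```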

martinpitt commented 4 years ago

@cgwalters: My compose.sh has always used --cachedir=/var/cache/ostree (on the root fs), as indeed keeping a permanent cache is really useful. And argh, indeed that was full -- over time that cache aggregated over 11 GB of stuff (apparently there's no automatic cleaning). After cleaning it up, that error indeed doesn't happen any more with --unified-core, including with my usual "repo on tmpfs" approach. Brown paper bag; I'm sorry for the noise!

It now fails with

error: Running %post for dpkg: Executing bwrap(/bin/sh): Child process killed by signal 1

but I'd better file this as a separate bug, instead of littering this one. Thanks for your help!

Update: I took out dpkg-dev and python3-debian from my ostree compose, and it works fine now.

cgwalters commented 4 years ago

> And argh, indeed that was full -- over time that cache aggregated over 11 GB of stuff (apparently there's no automatic cleaning).

Yeah...one of the two hard problems in computer science :wink:

In coreos-assembler we have this issue related to this: https://github.com/coreos/coreos-assembler/issues/1495

> error: Running %post for dpkg: Executing bwrap(/bin/sh): Child process killed by signal 1

That indeed looks exactly like a unified core compat issue, and indeed all the stuff in the package post is not allowed. In rpm-ostree we keep the RPM db in /usr of course. This use case for dpkg is weird because it's not really the system package manager...could make more sense to have it only be created on demand instead.

Alternatively, a simple patch might be to just add this to the scriptlet: if ! test -w /var; then exit 0; fi
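To make the suggested guard concrete, here is a minimal sketch of a scriptlet body that exits early when its state directory is not writable. The function wrapper, its optional argument and the echo messages are additions for illustration only, so the logic can be tried outside a real package build; a real %post would simply test /var as in the one-liner above.

```shell
#!/bin/sh
# Sketch: a scriptlet body that bails out early on a read-only state dir.
# The body is wrapped in a subshell function "( ... )" so that "exit 0"
# behaves as it would at the top level of a scriptlet.

post_scriptlet() (
    # Hypothetical argument for experimenting; defaults to /var.
    state_root=${1:-/var}
    # During a unified-core compose /var is not writable, so skip quietly.
    if ! test -w "$state_root"; then
        echo "skipping: $state_root is not writable"
        exit 0
    fi
    echo "would initialize state under $state_root"
)
```

Calling post_scriptlet with a non-existent or read-only path takes the early-exit branch, and calling it with a writable directory reaches the final line.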

See also https://bugzilla.redhat.com/show_bug.cgi?id=1352154

cgwalters commented 4 years ago

> And argh, indeed that was full -- over time that cache aggregated over 11 GB of stuff (apparently there's no automatic cleaning).

To extend on this slightly though...rpm-ostree does correctly handle caching for layered packages in the client case - it's basically "walk over deployments, query their rpmdb, add those to referenced set, then prune unreferenced pkgs". But on the server side we'd need to do it based on commit objects or so, and then we still need to handle things like the repodata which we're not committing into ostree today (though it could make sense).
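The "collect referenced packages, then prune the rest" logic described above is essentially a set difference. The following toy model shows only that set logic on plain text files (one package name per line); the file layout and the function name are made up for this sketch, and it says nothing about how rpm-ostree actually stores or walks its data.

```shell
#!/bin/sh
# Toy model: which cached packages are referenced by no deployment?

# usage: unreferenced CACHED_LIST DEPLOYMENT_LIST...
# Prints cached package names that appear in none of the deployment lists.
unreferenced() {
    if [ "$#" -lt 2 ]; then
        echo "usage: unreferenced CACHED_LIST DEPLOYMENT_LIST..." >&2
        return 2
    fi
    cached=$1
    shift
    refs=$(mktemp)
    cache_sorted=$(mktemp)
    # Union of all per-deployment lists is the referenced set.
    LC_ALL=C sort -u "$@" > "$refs"
    LC_ALL=C sort -u "$cached" > "$cache_sorted"
    # comm -23 keeps lines that occur only in the first (cached) input.
    LC_ALL=C comm -23 "$cache_sorted" "$refs"
    rm -f "$refs" "$cache_sorted"
}
```

A server-side variant would need a different source for the referenced set (commit objects, as noted above), but the final difference step would look the same.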

This is just another variant of the "rpm-ostree is optimized for server side composes and client side layering, not client side composes yet" problem.

martinpitt commented 4 years ago

Thanks @cgwalters! Indeed this is https://bugzilla.redhat.com/show_bug.cgi?id=1817258 . It's not crucial for me, I can run autopkgtest in toolbox (or so I hope..). Working now \o/

henrywang commented 4 years ago

I see a lot of log output like this when I build my own Fedora 32 compose with --unified-core added:

systemd.prein: (2020-09-16  7:51:23:163230): [sss_cache] [confdb_init] (0x0010): Unable to open config database [/var/lib/sss/db/config.ldb]
systemd.prein: Could not open available domains
systemd.prein: groupadd: sss_cache exited with status 5
systemd.prein: groupadd: Failed to flush the sssd cache.

henrywang commented 4 years ago

I built my own Fedora 32 compose with rpm-ostree compose tree --cachedir=/var/cache/ostree --repo=/var/tmp/repo xiaofeng-desktop.yaml, and the resulting commit boots without issue. When I add the --unified-core option, the resulting commit can't boot, and the compose prints some strange log messages that don't appear without --unified-core:

Running post scripts... openssl-libs
util-linux.post: mkdir: cannot create directory ‘/var/log’: Read-only file system
util-linux.post: touch: cannot touch '/var/log/lastlog': No such file or directory
util-linux.post: chown: cannot access '/var/log/lastlog': No such file or directory
Running post scripts... systemd-udev
systemd-udev.post: Failed to create directory /var/lib/systemd: Read-only file system

And

Running post scripts... cups
cups.post: Created symlink /etc/systemd/system/multi-user.target.wants/cups.path → /usr/lib/systemd/system/cups.path.
cups.post: Created symlink /etc/systemd/system/sockets.target.wants/cups.socket → /usr/lib/systemd/system/cups.socket.
cups.post: /usr/bin/touch: cannot touch '/var/log/cups/error_log': No such file or directory
cups.post: /usr/bin/ls: cannot access '/var/log/cups/error_log': No such file or directory
cups.post: /usr/bin/tail: cannot open '/var/log/cups/error_log' for reading: No such file or directory
cups.post: /usr/cups.post: line 49: /var/log/cups/error_log: No such file or directory
cups.post: /usr/bin/touch: cannot touch '/var/log/cups/access_log': No such file or directory
cups.post: /usr/bin/ls: cannot access '/var/log/cups/access_log': No such file or directory
cups.post: /usr/bin/tail: cannot open '/var/log/cups/access_log' for reading: No such file or directory
cups.post: /usr/cups.post: line 49: /var/log/cups/access_log: No such file or directory
cups.post: /usr/bin/touch: cannot touch '/var/log/cups/page_log': No such file or directory
cups.post: /usr/bin/ls: cannot access '/var/log/cups/page_log': No such file or directory
cups.post: /usr/bin/tail: cannot open '/var/log/cups/page_log' for reading: No such file or directory
cups.post: /usr/cups.post: line 49: /var/log/cups/page_log: No such file or directory