coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

Ship dnf in FCOS and RHCOS #1687

Open jlebon opened 6 months ago

jlebon commented 6 months ago

As part of the bootable containers effort, we want to put emphasis on a consistent experience when deriving images. A big part of that is being able to type dnf install -y ... in one's Containerfile, just like one can when building an app container. On the client-side, dnf cannot do much right now. However, it does work if /usr is read-write (either using e.g. ostree admin unlock/bootc usr-overlay or if transient root is enabled). So basically, having dnf in FCOS/RHCOS would give us:

Note also dnf currently ships in the tier-1 c9s and Fedora ELN images for the same reasons.

Additionally, there are plans to move functionality from rpm-ostree to dnf for a more unified experience. So eventually, dnf will also allow more host management features.

The default dnf4 in Fedora is Python-based, which we're currently blocking from Fedora CoreOS. However, dnf5, currently planned to be the default for Fedora 41, no longer pulls in Python.

So concretely, the proposal is to:

That said, ideally doing a dnf operation on the client-side that requires write access would give a useful error that says to either unlock /usr or use rpm-ostree depending on what the user wants (transient vs permanent).
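As a rough illustration of the kind of hint described above, a wrapper could check whether /usr is writable before attempting a transaction. This is only a sketch and not how dnf behaves today: the function name `usr_write_hint` is hypothetical, and the check assumes `test -w` reports a read-only mount as not writable.

```shell
# Hypothetical helper: print guidance when the target directory is not writable.
# Defaults to /usr; the optional argument only exists to make the sketch easy to try.
usr_write_hint() {
    target=${1:-/usr}
    if [ -w "$target" ]; then
        # Writable (e.g. after unlocking, or with transient root): nothing to say.
        return 0
    fi
    echo "error: $target is not writable on this system." >&2
    echo "  transient change: run 'ostree admin unlock' or 'bootc usr-overlay' first" >&2
    echo "  permanent change: use 'rpm-ostree install <package>'" >&2
    return 1
}
```

Calling `usr_write_hint && dnf install -y strace` would then only reach dnf when the write can succeed.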

cgwalters commented 6 months ago

Sounds good to me.

travier commented 6 months ago

Does this work as expected compared to the current rpm-ostree commands?

Could we even redirect rpm-ostree commands to dnf when not run "live", i.e. in container builds?

jlebon commented 6 months ago

> Does this work as expected compared to the current rpm-ostree commands?
>
> Could we even redirect rpm-ostree commands to dnf when not run "live", i.e. in container builds?

Yeah, possibly... We should keep rpm-ostree install working to not break people, but ideally we try to redirect them to use dnf install instead directly. (Maybe add a delay? Though that's easy to miss if your builds are automated.)
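To make the "live" versus container-build distinction concrete, here is a minimal sketch of how a wrapper might decide which command to suggest. It assumes that `/run/ostree-booted` is present only on a booted ostree host and treats its absence as a container build; the function name `suggest_installer` is hypothetical, and this is not how rpm-ostree itself makes the decision.

```shell
# Assumption: /run/ostree-booted exists only on a booted ostree host,
# so its absence is treated as "running in a container build".
suggest_installer() {
    marker=${1:-/run/ostree-booted}
    if [ -e "$marker" ]; then
        # Live host: keep using rpm-ostree for persistent changes.
        echo "rpm-ostree install"
    else
        # Not a booted host: point users at dnf instead.
        echo "dnf install"
    fi
}
```

A compatibility shim could print the result as a notice while still letting `rpm-ostree install` succeed, so existing builds keep working.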

jbtrystram commented 6 months ago

The following was agreed today in the Fedora CoreOS community meeting:

AGREED: @jlebon will add dnf5 in the rawhide stream to test it out. We will add to the f41 branching checklist to reconsider this and evaluate at this point in time.

jlebon commented 6 months ago

As mentioned above, we'll rediscuss this at f41 branching time, at which point we should have more information around dnf, rpm-ostree, and the bootable containers effort. And it'll either have landed in Fedora as default or not yet. Also, hopefully by then we'll have a better error message from dnf install client-side (will try to see if we can get a dnf issue to link here).

jlebon commented 6 months ago

https://github.com/coreos/fedora-coreos-config/pull/2915 adds dnf5 to rawhide.

Verifying it works in unlocked mode:

root@cosa-devsh:~# ostree admin unlock
Development mode enabled.  A writable overlayfs is now mounted on /usr.
All changes there will be discarded on reboot.
root@cosa-devsh:~# dnf5 install -y strace
Updating and loading repositories:
 Fedora rawhide openh264 (From Cisco) - x86_64                                 100% |   5.1 KiB/s |   2.6 KiB |  00m01s
 Fedora - Rawhide - Developmental packages for the next Fedora release         100% |  24.6 MiB/s |  21.0 MiB |  00m01s
Repositories loaded.
Package                                Arch       Version                                Repository                Size
Installing:
 strace                                x86_64     6.7-1.fc40                             rawhide                2.4 MiB

Transaction Summary:
 Installing:        1 packages

Total size of inbound packages is 1 MiB. Need to download 1 MiB.
After this operation 2 MiB will be used (install 2 MiB, remove 0 B).
[1/1] strace-0:6.7-1.fc40.x86_64                                               100% |  27.6 MiB/s |   1.4 MiB |  00m00s
-----------------------------------------------------------------------------------------------------------------------
[1/1] Total                                                                    100% |   4.1 MiB/s |   1.4 MiB |  00m00s
Running transaction
Importing PGP key 0xE99D6AD1:
 Userid     : "Fedora (41) <fedora-41-primary@fedoraproject.org>"
 Fingerprint: 466CF2D8B60BC3057AA9453ED0622462E99D6AD1
 From       : file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-rawhide-x86_64
The key was successfully imported.
Importing PGP key 0x105EF944:
 Userid     : "Fedora (42) <fedora-42-primary@fedoraproject.org>"
 Fingerprint: B0F4950458F69E1150C6C5EDC8AC4916105EF944
 From       : file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-42-x86_64
The key was successfully imported.
[1/3] Verify package files                                                     100% | 200.0   B/s |   1.0   B |  00m00s
[2/3] Prepare transaction                                                      100% |  29.0   B/s |   1.0   B |  00m00s
[3/3] Installing strace-0:6.7-1.fc40.x86_64                                    100% |  32.1 MiB/s |   2.4 MiB |  00m00s
>>> Running trigger-install scriptlet: glibc-common-0:2.39.9000-9.fc41.x86_64
>>> Stop trigger-install scriptlet: glibc-common-0:2.39.9000-9.fc41.x86_64
root@cosa-devsh:~# strace -h
Usage: strace [-ACdffhikkqqrtttTvVwxxyyzZ] [-I N] [-b execve] [-e EXPR]...
...

Verifying it works in layering (using cosa buildextend-layered just to make it easier to test, but it's really just podman build && skopeo copy):

$ cat src/config/Containerfile.foobar
FROM overridden
RUN dnf5 install -y strace && ostree container commit
$ cosa buildextend-layered foobar
$ cosa run
[root@cosa-devsh ~]# rpm-ostree rebase ostree-unverified-image:oci-archive:$(ls /mnt/workdir/builds/latest/x86_64/*foobar*.ociarchive)
Pulling manifest: ostree-unverified-image:oci-archive:/mnt/workdir/builds/latest/x86_64/fedora-coreos-41.20240321.dev.0-layered-foobar.x86_64.ociarchive
Importing: ostree-unverified-image:oci-archive:/mnt/workdir/builds/latest/x86_64/fedora-coreos-41.20240321.dev.0-layered-foobar.x86_64.ociarchive (digest: sha256:5fd2a64cb39638448a22580201c50879f310e0146d52aa65da9e4d2960949dc3)
ostree chunk layers needed: 51 (757.1 MB)
custom layers needed: 1 (10.5 MB)
Staging deployment... done
Added:
  strace-6.7-1.fc40.x86_64
Changes queued for next boot. Run "systemctl reboot" to start a reboot
[root@cosa-devsh ~]# reboot
...
[root@cosa-devsh ~]# strace -h
Usage: strace [-ACdffhikkqqrtttTvVwxxyyzZ] [-I N] [-b execve] [-e EXPR]...
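
For readers without cosa, the "podman build && skopeo copy" equivalent mentioned above might look roughly like the following sketch. The function name, image tag, and archive path are hypothetical placeholders, and the `FROM` line in the Containerfile would need to name a real base image rather than `overridden`.

```shell
# Hypothetical helper wrapping the two steps; all names passed in are placeholders.
# Set DRY_RUN=1 to print the commands instead of running them.
build_layered() {
    containerfile=$1
    tag=$2
    archive=$3
    run=${DRY_RUN:+echo}
    # Build the derived image, then export it as an OCI archive for rpm-ostree rebase.
    $run podman build -t "$tag" -f "$containerfile" . &&
    $run skopeo copy "containers-storage:$tag" "oci-archive:$archive"
}
```

For example, `DRY_RUN=1 build_layered Containerfile.foobar localhost/fcos-foobar foobar.ociarchive` prints the two commands without executing them.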

dustymabe commented 6 months ago

hmm. I feel like requiring the user to type dnf5 doesn't really achieve what we want here. The reason I bring this up is I don't know how much value it brings to not go all the way to parity with what the user does today (which is just use dnf). Could we add a symlink?

On another topic it might be nice to analyze the package addition to at least have the questions answered like we do for any new package: https://github.com/coreos/fedora-coreos-tracker/blob/main/.github/ISSUE_TEMPLATE/new-package.yml

jlebon commented 6 months ago

> hmm. I feel like requiring the user to type dnf5 doesn't really achieve what we want here. The reason I bring this up is I don't know how much value it brings to not go all the way to parity with what the user does today (which is just use dnf). Could we add a symlink?

Hmm, my understanding is that it'll be dnf once it's the default in Fedora. I don't expect users to ever actually have to type dnf5. Since the rawhide bit is just for testing, it didn't seem as crucial to add a symlink. I can if wanted, but it seems useful also to just track its state as it is in Fedora.

> On another topic it might be nice to analyze the package addition to at least have the questions answered like we do for any new package: main/.github/ISSUE_TEMPLATE/new-package.yml

Good point, will do!

dustymabe commented 6 months ago

> Hmm, my understanding is that it'll be dnf once it's the default in Fedora. I don't expect users to ever actually have to type dnf5. Since the rawhide bit is just for testing, it didn't seem as crucial to add a symlink. I can if wanted, but it seems useful also to just track its state as it is in Fedora.

Yeah. That's what I'm wondering here too. Is the point that we want people to pick this up and try it (in which case we'd want them to use dnf and not have to understand or type dnf5) or is the point to just test it?

jlebon commented 6 months ago

Symlink added in https://github.com/coreos/fedora-coreos-config/pull/2915.
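
For illustration, the effect of such a symlink can be sketched as below. This is not the PR's actual mechanism, which is not shown in this thread; the function name `link_dnf_alias` is hypothetical, and the directory argument is only there so the sketch can be tried outside /usr/bin.

```shell
# Hypothetical helper: make `dnf` resolve to dnf5 unless a dnf already exists.
link_dnf_alias() {
    bindir=${1:-/usr/bin}
    # Leave an existing dnf (file or symlink) untouched.
    if [ -e "$bindir/dnf" ] || [ -L "$bindir/dnf" ]; then
        return 0
    fi
    # Relative link so it stays valid wherever the tree is mounted.
    ln -s dnf5 "$bindir/dnf"
}
```

With this in place, a Containerfile can use `dnf install -y ...` without the user needing to know that dnf5 is what runs.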

Diff:

Added:
  dnf-data-4.19.0-1.fc40.noarch
  dnf5-5.1.15-2.fc41.x86_64
  libdnf5-5.1.15-2.fc41.x86_64
  libdnf5-cli-5.1.15-2.fc41.x86_64

Sizes:

$ rpm -q dnf-data dnf5 libdnf5 libdnf5-cli --qf '%{name}: %{size}\n'
dnf-data: 39753
dnf5: 1544753
libdnf5: 2969648
libdnf5-cli: 563658

So ~5.1 MB (about 4.9 MiB) in total, which is larger than I expected.
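
To double-check the total, the `name: size` lines printed by the `rpm -q` query above can be piped through a small summing helper. This is a sketch; the function name `sum_sizes` is hypothetical, and it assumes every input line has the `name: bytes` shape shown above.

```shell
# Hypothetical helper: sum the byte counts from "name: size" lines on stdin.
sum_sizes() {
    awk -F': ' '{ total += $2 } END { printf "%d\n", total }'
}

# Usage on a host with the packages installed:
#   rpm -q dnf-data dnf5 libdnf5 libdnf5-cli --qf '%{name}: %{size}\n' | sum_sizes
```

For the four sizes listed above, this yields 5117812 bytes.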

travier commented 6 months ago

Maybe we should make a Fedora Change Request and land this for Fedora Atomic Desktops as well at the same time? (and IoT?)

travier commented 6 months ago

We've met with @dcantrell to discuss this. The code changes on the DNF side will initially land on the dnf4 branch, and ideally that is the version we would include before moving to dnf5 once the changes land there too.

As dnf4 depends on Python, and we really don't want to include python (https://github.com/coreos/fedora-coreos-tracker/issues/32), a temporary option for Fedora CoreOS would be to include microdnf instead and set it up as dnf. Other rpm-ostree variants that already include Python will include dnf4 directly.

We did this in the past in https://github.com/coreos/fedora-coreos-tracker/issues/1050.

@jmarrero is working on a change proposal to clarify that.

jlebon commented 6 months ago

I'm not sure I follow that. If dnf5 is set to be the default, why would we not ship that in FCOS to match the rest of Fedora?

Between getting better error messages sooner into FCOS and being consistent across Fedora variants, I'd rather the latter. Of course also there's the feature gap between microdnf vs dnf. For the longer-term stuff around folding rpm-ostree functionality into dnf5, I think it's understood that'll take some time and that's OK.

Doing any work on microdnf, especially if it's just for FCOS, seems like not the best use of time. Maybe we could help out with at least the error message improvements in dnf5?

(I'd prefer seeing other OSTree variants also ship dnf5 for consistency, but yes the Python question changes the calculus there a bit.)

yasminvalim commented 6 months ago

In today's FCOS meeting we decided not to discuss this topic. I will keep the meeting label in case people want to discuss it in one of the next meetings.

jlebon commented 5 months ago

RHCOS side of this: https://github.com/openshift/os/pull/1476

travier commented 1 month ago

We've discussed this topic in today's community meeting.

Unfortunately, dnf5 did not yet get better error messages when running on ostree/bootc systems and it's unlikely that it will get them in time for F41.

travier commented 1 week ago

The "error messages" work in dnf did not land in Fedora 41, thus users are confused: https://discussion.fedoraproject.org/t/f41-kinoite-native-container-dnf5-pgp-key-import-error/131946

jlebon commented 1 week ago

> The "error messages" work in dnf did not land in Fedora 41, thus users are confused: discussion.fedoraproject.org/t/f41-kinoite-native-container-dnf5-pgp-key-import-error/131946

I filed https://github.com/rpm-software-management/dnf5/issues/1727 to have this be tracked upstream.

travier commented 1 week ago

Ah, good catch, I forgot that those changes were for dnf4 and not dnf5.