coreos / rpm-ostree

⚛📦 Hybrid image/package system with atomic upgrades and package layering
https://coreos.github.io/rpm-ostree

Rebase to container image without resetting can lead to corrupted ostree commit #4791

Open karuboniru opened 10 months ago

karuboniru commented 10 months ago

Describe the bug

I am trying to switch a Silverblue system from package layering to a container image, and the rebase ends up producing a broken commit.

Reproduction steps

  1. Start with a system with some overrides:

    State: busy
    AutomaticUpdates: stage; rpm-ostreed-automatic.timer: no runs since boot
    Transaction: rebase ostree-unverified-registry:quay.io/karuboniru/ostree-container:master 
    Initiator: client(id:cli dbus:1.181 unit:run-r9b250ced442745ab9b05f736add55d3b.scope uid:1000)
    BootedDeployment:
    ● fedora:fedora/39/x86_64/testing/silverblue
                  Version: 39.20240120.0 (2024-01-20T00:37:56Z)
               BaseCommit: 0076da5b7ad2786c932a905346d5f76e20a1bda56135c6f0012cd1bb167c2dfd
             GPGSignature: Valid signature by E8F23996F23218640CB44CBE75CF5AC418B8E74C
      RemovedBasePackages: firefox firefox-langpacks 121.0.1-1.fc39
          LayeredPackages: fcitx5 fcitx5-autostart fcitx5-chinese-addons fcitx5-configtool fcitx5-gtk fcitx5-mozc fcitx5-qt zsh
  2. Rebase to a container image with a similar package list:

    rpm-ostree rebase ostree-unverified-registry:quay.io/karuboniru/ostree-container:master
    Pulling manifest: ostree-unverified-registry:quay.io/karuboniru/ostree-container:master
    Importing: ostree-unverified-registry:quay.io/karuboniru/ostree-container:master (digest: sha256:8cbe33d5beeedfe2581ab56864056cb6c471b63c4b7f17f55983efd4c71d6387)
    ostree chunk layers needed: 65 (2.0 GB)
    custom layers needed: 2 (185.4 MB)
    Checking out tree dde2395... done
    Inactive base removals:
    firefox
    firefox-langpacks
    Inactive requests:
    fcitx5-gtk (already provided by fcitx5-gtk-5.1.1-1.fc39.x86_64)
    fcitx5-mozc (already provided by fcitx5-mozc-2.17.2102.102.1-28.20230508git242b4f7.fc39.x86_64)
    fcitx5-chinese-addons (already provided by fcitx5-chinese-addons-5.1.3-4.fc39.x86_64)
    zsh (already provided by zsh-5.9-9.fc39.x86_64)
    fcitx5-qt (already provided by fcitx5-qt-5.1.4-4.fc39.x86_64)
    fcitx5-autostart (already provided by fcitx5-autostart-5.1.7-1.fc39.noarch)
    fcitx5-configtool (already provided by fcitx5-configtool-5.1.3-3.fc39.x86_64)
    fcitx5 (already provided by fcitx5-5.1.7-1.fc39.x86_64)
    Staging deployment... done
    Upgraded:
    ipp-usb 0.9.23-5.fc39 -> 0.9.24-1.fc39
    libnfsidmap 1:2.6.4-0.rc2.fc39 -> 1:2.6.4-0.rc3.fc39
    nfs-utils 1:2.6.4-0.rc2.fc39 -> 1:2.6.4-0.rc3.fc39
    sos 4.6.0-1.fc39 -> 4.6.1-1.fc39
    Downgraded:
    ImageMagick 1:7.1.1.26-2.fc39 -> 1:7.1.1.15-1.fc39
    ImageMagick-libs 1:7.1.1.26-2.fc39 -> 1:7.1.1.15-1.fc39
    alternatives 1.26-1.fc39 -> 1.25-1.fc39
    amd-gpu-firmware 20240115-2.fc39 -> 20231211-1.fc39
    amd-ucode-firmware 20240115-2.fc39 -> 20231211-1.fc39
    at-spi2-atk 2.50.1-1.fc39 -> 2.50.0-1.fc39
    at-spi2-core 2.50.1-1.fc39 -> 2.50.0-1.fc39
    atheros-firmware 20240115-2.fc39 -> 20231211-1.fc39
    atk 2.50.1-1.fc39 -> 2.50.0-1.fc39
    bind-libs 32:9.18.21-2.fc39 -> 32:9.18.20-1.fc39
    bind-license 32:9.18.21-2.fc39 -> 32:9.18.20-1.fc39
    bind-utils 32:9.18.21-2.fc39 -> 32:9.18.20-1.fc39
    brcmfmac-firmware 20240115-2.fc39 -> 20231211-1.fc39
    buildah 1.34.0-1.fc39 -> 1.33.2-1.fc39
    container-selinux 2:2.228.1-1.fc39 -> 2:2.227.0-1.fc39
    coreutils 9.3-5.fc39 -> 9.3-4.fc39
    coreutils-common 9.3-5.fc39 -> 9.3-4.fc39
    crun 1.13-1.fc39 -> 1.12-1.fc39
    epiphany-runtime 1:45.2-1.fc39 -> 1:45.1-1.fc39
    freerdp-libs 2:2.11.4-1.fc39 -> 2:2.11.2-3.fc39
    grub2-common 1:2.06-116.fc39 -> 1:2.06-110.fc39
    grub2-efi-ia32 1:2.06-116.fc39 -> 1:2.06-110.fc39
    grub2-efi-x64 1:2.06-116.fc39 -> 1:2.06-110.fc39
    grub2-pc 1:2.06-116.fc39 -> 1:2.06-110.fc39
    grub2-pc-modules 1:2.06-116.fc39 -> 1:2.06-110.fc39
    grub2-tools 1:2.06-116.fc39 -> 1:2.06-110.fc39
    grub2-tools-minimal 1:2.06-116.fc39 -> 1:2.06-110.fc39
    gtk-update-icon-cache 3.24.40-1.fc39 -> 3.24.39-1.fc39
    gtk3 3.24.40-1.fc39 -> 3.24.39-1.fc39
    hplip 3.23.12-2.fc39 -> 3.23.8-1.fc39
    hplip-common 3.23.12-2.fc39 -> 3.23.8-1.fc39
    hplip-libs 3.23.12-2.fc39 -> 3.23.8-1.fc39
    intel-gpu-firmware 20240115-2.fc39 -> 20231211-1.fc39
    iwlegacy-firmware 20240115-2.fc39 -> 20231211-1.fc39
    iwlwifi-dvm-firmware 20240115-2.fc39 -> 20231211-1.fc39
    iwlwifi-mvm-firmware 20240115-2.fc39 -> 20231211-1.fc39
    kernel 6.6.12-200.fc39 -> 6.6.11-200.fc39
    kernel-core 6.6.12-200.fc39 -> 6.6.11-200.fc39
    kernel-modules 6.6.12-200.fc39 -> 6.6.11-200.fc39
    kernel-modules-core 6.6.12-200.fc39 -> 6.6.11-200.fc39
    kernel-modules-extra 6.6.12-200.fc39 -> 6.6.11-200.fc39
    krb5-libs 1.21.2-3.fc39 -> 1.21.2-2.fc39
    libblkid 2.39.3-2.fc39 -> 2.39.3-1.fc39
    libblockdev 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-crypto 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-fs 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-loop 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-mdraid 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-nvme 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-part 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-swap 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libblockdev-utils 3.1.0-1.fc39 -> 3.0.4-1.fc39
    libertas-firmware 20240115-2.fc39 -> 20231211-1.fc39
    libfdisk 2.39.3-2.fc39 -> 2.39.3-1.fc39
    libinput 1.25.0-1.fc39 -> 1.24.0-1.fc39
    libipa_hbac 2.9.4-1.fc39 -> 2.9.3-1.fc39
    libmount 2.39.3-2.fc39 -> 2.39.3-1.fc39
    libphonenumber 8.13.28-3.fc39 -> 8.13.27-1.fc39
    libsane-hpaio 3.23.12-2.fc39 -> 3.23.8-1.fc39
    libsmartcols 2.39.3-2.fc39 -> 2.39.3-1.fc39
    libsmbclient 2:4.19.4-3.fc39 -> 2:4.19.4-2.fc39
    libsss_certmap 2.9.4-1.fc39 -> 2.9.3-1.fc39
    libsss_idmap 2.9.4-1.fc39 -> 2.9.3-1.fc39
    libsss_nss_idmap 2.9.4-1.fc39 -> 2.9.3-1.fc39
    libsss_sudo 2.9.4-1.fc39 -> 2.9.3-1.fc39
    libuuid 2.39.3-2.fc39 -> 2.39.3-1.fc39
    libva 2.20.0-2.fc39 -> 2.20.0-1.fc39
    libwbclient 2:4.19.4-3.fc39 -> 2:4.19.4-2.fc39
    libwinpr 2:2.11.4-1.fc39 -> 2:2.11.2-3.fc39
    linux-firmware 20240115-2.fc39 -> 20231211-1.fc39
    linux-firmware-whence 20240115-2.fc39 -> 20231211-1.fc39
    mt7xxx-firmware 20240115-2.fc39 -> 20231211-1.fc39
    nvidia-gpu-firmware 20240115-2.fc39 -> 20231211-1.fc39
    orca 45.2-1.fc39 -> 45.1-1.fc39
    pcsc-lite-ccid 1.5.5-1.fc39 -> 1.5.4-1.fc39
    python3-pyatspi 2.46.1-1.fc39 -> 2.46.0-6.fc39
    realtek-firmware 20240115-2.fc39 -> 20231211-1.fc39
    rpm-ostree 2024.1-4.fc39 -> 2023.11-1.fc39
    rpm-ostree-libs 2024.1-4.fc39 -> 2023.11-1.fc39
    rygel 0.42.5-1.fc39 -> 0.42.4-1.fc39
    samba-client 2:4.19.4-3.fc39 -> 2:4.19.4-2.fc39
    samba-client-libs 2:4.19.4-3.fc39 -> 2:4.19.4-2.fc39
    samba-common 2:4.19.4-3.fc39 -> 2:4.19.4-2.fc39
    samba-common-libs 2:4.19.4-3.fc39 -> 2:4.19.4-2.fc39
    skopeo 1:1.14.1-1.fc39 -> 1:1.14.0-1.fc39
    sssd 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-ad 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-client 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-common 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-common-pac 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-ipa 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-kcm 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-krb5 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-krb5-common 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-ldap 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-nfs-idmap 2.9.4-1.fc39 -> 2.9.3-1.fc39
    sssd-proxy 2.9.4-1.fc39 -> 2.9.3-1.fc39
    usbutils 017-1.fc39 -> 016-1.fc39
    util-linux 2.39.3-2.fc39 -> 2.39.3-1.fc39
    util-linux-core 2.39.3-2.fc39 -> 2.39.3-1.fc39
    xorg-x11-server-Xorg 1.20.14-30.fc39 -> 1.20.14-28.fc39
    xorg-x11-server-common 1.20.14-30.fc39 -> 1.20.14-28.fc39
    Removed:
    cirrus-audio-firmware-20240115-2.fc39.noarch
    intel-audio-firmware-20240115-2.fc39.noarch
    nxpwireless-firmware-20240115-2.fc39.noarch
    tiwilink-firmware-20240115-2.fc39.noarch
    Added:
    htop-3.3.0-1.fc39.x86_64
    hwloc-libs-2.10.0-1.fc39.x86_64
    Changes queued for next boot. Run "systemctl reboot" to start a reboot
  3. The result is a broken commit: ostree fsck reports a corrupted file object, while btrfs scrub does not report any corrupted files.

    sudo ostree fsck
    [sudo] password for yan:
    Validating refs...
    Validating refs in collections...
    Enumerating commits...
    Verifying content integrity of 102 commit objects...
    fsck objects (78426/106897) [=========    ]  73%
    error: In commits dde239538a36ad61250b3d8f067a87f4ea9e4f7f2790a30025e833ed30d7765a, a62ee896eeaed0be74b930fd4132d5d6260b7b0c77e802d6da4119ae989120bd: fsck content object 4b6e816eb9ba53ca8ec49bb1953654ad18660d867510ea84d1e58ca6b0c1a0a5: Corrupted file object; checksum expected='4b6e816eb9ba53ca8ec49bb1953654ad18660d867510ea84d1e58ca6b0c1a0a5' actual='bf13d8fe63612425b89f526a906f06359a1a71ac71902caef8a7160558d91187'
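The fsck error above names the damaged object together with the checksum it expected and the one it actually computed. As a minimal sketch, assuming the message keeps the format shown in this report, those three values can be pulled out of the output for later comparison. `parse_fsck_error` is a hypothetical helper name, not part of ostree.

```shell
# Hypothetical helper: read "ostree fsck" output on stdin and print
# "<object> <expected> <actual>" for every corrupted-object message.
# Assumes the message format shown in this report.
parse_fsck_error() {
    sed -n -E "s/.*fsck content object ([0-9a-f]+): .*expected='([0-9a-f]+)' actual='([0-9a-f]+)'.*/\1 \2 \3/p"
}
```

A possible invocation is `sudo ostree fsck 2>&1 | parse_fsck_error`; the `2>&1` is there because the error may be written to stderr.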

Expected behavior

It should not break the internal state of the ostree repository.

Actual behavior

A file object in the repository ends up corrupted.

System details

State: busy
AutomaticUpdates: stage; rpm-ostreed-automatic.timer: no runs since boot
Transaction: rebase ostree-unverified-registry:quay.io/karuboniru/ostree-container:master 
  Initiator: client(id:cli dbus:1.181 unit:run-r9b250ced442745ab9b05f736add55d3b.scope uid:1000)
BootedDeployment:
● fedora:fedora/39/x86_64/testing/silverblue
                  Version: 39.20240120.0 (2024-01-20T00:37:56Z)
               BaseCommit: 0076da5b7ad2786c932a905346d5f76e20a1bda56135c6f0012cd1bb167c2dfd
             GPGSignature: Valid signature by E8F23996F23218640CB44CBE75CF5AC418B8E74C
      RemovedBasePackages: firefox firefox-langpacks 121.0.1-1.fc39
          LayeredPackages: fcitx5 fcitx5-autostart fcitx5-chinese-addons fcitx5-configtool fcitx5-gtk fcitx5-mozc fcitx5-qt zsh

Additional information

Doing a reset before rebasing works around this issue. If rebasing from a dirty image is not supported, I believe a warning before the actual execution is needed to avoid confusion.
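The reset-then-rebase workaround could be scripted roughly as follows. This is only a sketch: `has_local_changes` and `rebase_clean` are hypothetical helper names, the check simply looks for the layering and override fields that `rpm-ostree status` prints, and the report does not say whether a reboot is needed between the reset and the rebase.

```shell
# Hypothetical helper: succeed if the status text on stdin lists any
# layered, local, removed, or replaced packages.
has_local_changes() {
    grep -E -q '^[[:space:]]*(LayeredPackages|LocalPackages|RemovedBasePackages|ReplacedBasePackages):'
}

# Hypothetical wrapper: drop layering and overrides first, then rebase
# to the image reference given as the first argument.
rebase_clean() {
    if rpm-ostree status --booted | has_local_changes; then
        echo "Layered packages or overrides found; resetting first." >&2
        rpm-ostree reset || return 1
    fi
    rpm-ostree rebase "$1"
}
```

For the system in this report, the call would be `rebase_clean ostree-unverified-registry:quay.io/karuboniru/ostree-container:master`.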


This is not a filesystem error, since btrfs scrub returns no errors.

The broken file is:

ostree ls 44eb1b1af8bad1129fbe70ccf0b092d54514867d01d4b17b4d5d27808113befb -CR|grep 944eec75e5b011eabe3643394c7bab36b5cb531d1e4ccab905a9e49f3ffcb70e
-00644 0 0  32768 944eec75e5b011eabe3643394c7bab36b5cb531d1e4ccab905a9e49f3ffcb70e /usr/share/rpm/rpmdb.sqlite-shm
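The lookup above can be wrapped in a small filter that prints only the path belonging to a given object checksum. This is a sketch under the assumption that `ostree ls -CR` keeps the column layout shown here and that the checksum belongs to a regular file. `paths_for_object` is a hypothetical name.

```shell
# Hypothetical helper: read "ostree ls -CR" output on stdin and print
# the path of every entry whose line contains the given checksum.
# Everything after the checksum column is treated as the path.
paths_for_object() {
    awk -v sum="$1" '{
        idx = index($0, sum)
        if (idx > 0) print substr($0, idx + length(sum) + 1)
    }'
}
```

Usage would look like `ostree ls "$commit" -CR | paths_for_object "$checksum"`, with `$commit` and `$checksum` set to the values reported by fsck.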

Also, I somehow broke a working system without doing any layering, possibly while playing with rpm-ostree compose, but that is hard to reproduce. I will update this issue if I find any pattern behind it.


refs: https://github.com/fedora-silverblue/issue-tracker/issues/528

cgwalters commented 10 months ago

Hmmm... so /usr/share/rpm/rpmdb.sqlite-shm is a somewhat special file I think because it may be memory mapped. I wonder if we're somehow leaking writes to it through rofiles-fuse or so?

How reproducible is this?

karuboniru commented 10 months ago

Not very. I have only noticed this twice: when doing the rebase for the first time, and once after playing with rpm-ostree compose. I will try similar steps in a virtual machine when I have time.

LukeShortCloud commented 6 months ago

I just ran into this exact same issue with Fedora Silverblue 40 with various overrides applied. I went from an rpm-ostree repository to a container-native image with rpm-ostree rebase.