containers / bootc

Boot and upgrade via container images
https://containers.github.io/bootc/
Apache License 2.0
791 stars 84 forks source link

File attributes are not correctly hashed or preserved by `ostree container encapsulate` resulting in a failure to pull images with bootc and rpm-ostree #920

Closed prydom closed 15 hours ago

prydom commented 1 week ago

Recently I've observed pulling down images created with rpm-ostree compose container-encapsulate results in errors similar to this when running bootc upgrade:

Importing: Unencapsulating base: Importing commit: Importing object 4c/2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7.file: Processing content object 4c2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7: Importing regfile small: Writing content object: Corrupted file object; checksum expected='4c2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7' actual='35a978c60a57d76b632d293c84b731e5e6e312fad73520888152a84d798d7472'

I determined on my images the content objects causing issues were files where dnf5 wrote "user.librepo.checksum.mtime" and "user.librepo.checksum.sha256" extended attributes. These are written on RPMs installed from a (local to the container) repository.

It is possible this is a regression due to https://github.com/ostreedev/ostree-rs-ext/pull/679 because it's the only recent ostree-rs-ext/bootc commit touching extended attributes. It is also possible this is a bug that's always existed but is uncovered by preserving extended attributes from the container image tar stream. Regardless, this only started happening during the past week or so after rpm-ostree 2024.9-1 got released to rawhide.

I have written a minimal reproduction using ostree container encapsulate and ostree container image pull. Review and run this script with the working directory set to an empty directory or one you don't care about. It creates small ostree repos, commits, and container images in the current directory.

There are 3 changes to this script you can make to cause the test to pass with some degree of success (container pull succeeds and possibly ostree fsck as well; ideally both would succeed):

  1. Remove the --selinux-policy argument on the ostree commit command. This is not a viable workaround because I need to relabel at this stage of building an image.
  2. Remove all user.* extended attributes from the image. This is the workaround I went with for now, only user.librepo.* attributes are triggering this bug for me.
  3. Encapsulate to OCI an ostree commit stored in a bare-user repo instead of a bare repo. I haven't tested this with a real bootable image yet. This allows the ostree container image pull to proceed but ostree fsck -a in a bare repo we pull into later doesn't pass.

Reproduction script

Expand this to see reproduction script ```bash #!/usr/bin/bash # Echo executed commands and stop on the first error set -xe # This script should be running as root to use bare ostree repos. If you use bare-user then you can drop. # Run in a VM or rootless container (as root in the mount namespace) if you don't trust the script. if [ "$EUID" -ne 0 ] then echo "You may consider running as root for bare ostree repos" fi # Cleanup previous runs # DO NOT RUN THIS SCRIPT IN A WORKING DIRECTORY YOU CARE ABOUT rm -rf image repo commit checkout # Create a new bare ostree repo as root ostree init --repo=repo --mode=bare # Create a new commit with just a single file with an extended attribute mkdir commit cd commit echo "hello world" > test.txt # This is what my system policy labels a file called test.txt on / as, comment out or change if this differs for you. # I just have it here to play a game of spot the difference setfattr -n security.selinux -v "system_u:object_r:etc_runtime_t:s0" test.txt setfattr -n user.librepo.checksum.mtime -v 0000 test.txt getfattr -d -m - -- * stat test.txt cd .. # Commit that to the new ostree repo with ref xattrtest # Note that in my real workflow I use an selinux policy from the image. # I use the system one because this isn't a real bootable image. # I usually use --consume but don't here because this is not from an existing ostree commit ostree commit \ --branch=xattrtest \ --repo=repo \ "--selinux-policy=/" \ commit rm -rf commit # fsck the repo # this is the first time the behavior is a bit unexpected as the selinux policy relabel causes a content hash mismatch ostree fsck --repo=repo -a || true # checkout to show that ostree is keeping the xattrs ostree checkout --repo=repo \ --require-hardlinks \ "xattrtest" checkout # Print the xattrs cd checkout getfattr -d -m - -- * stat test.txt cd .. rm -rf checkout # encapsulate to a OCI container ostree container encapsulate \ --repo=repo \ --cmd="/usr/bin/bash" \ --label="containers.bootc=1" \ --label="ostree.bootable=true" \ xattrtest "dir:$PWD/image" # reset the ostree repo rm -rf repo ostree init --repo=repo --mode=bare # Pull the container - THIS MIGHT FAIL AND END THE SCRIPT due to `set -xe` ostree container image pull repo "ostree-unverified-image:dir:$PWD/image" # If the above doesn't fail, this may still fail # fsck the repo to show that everything is preserved or not ostree fsck --repo=repo -a || true # Checkout the pulled container REF="$(ostree refs --repo=repo ostree/container/image)" ostree checkout --repo=repo \ --require-hardlinks \ "ostree/container/image/$REF" checkout # print the xattrs cd checkout getfattr -d -m - -- * stat test.txt cd .. # Clean up rm -rf checkout image repo ```

Running this script on the rpm-ostree versions I detail below results in the following output.

Expand this to see reproduction output ``` + '[' 0 -ne 0 ']' + rm -rf image repo commit checkout + ostree init --repo=repo --mode=bare + mkdir commit + cd commit + echo 'hello world' + setfattr -n security.selinux -v system_u:object_r:etc_runtime_t:s0 test.txt + setfattr -n user.librepo.checksum.mtime -v 0000 test.txt + getfattr -d -m - -- test.txt # file: test.txt security.selinux="system_u:object_r:etc_runtime_t:s0" user.librepo.checksum.mtime="0000" + stat test.txt File: test.txt Size: 12 Blocks: 8 IO Block: 4096 regular file Device: 0,46 Inode: 25915273 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Context: system_u:object_r:etc_runtime_t:s0 Access: 2024-11-25 13:59:21.973942016 -0800 Modify: 2024-11-25 13:59:21.973942016 -0800 Change: 2024-11-25 13:59:21.974942015 -0800 Birth: 2024-11-25 13:59:21.973942016 -0800 + cd .. + ostree commit --branch=xattrtest --repo=repo --selinux-policy=/ commit 6bfe549dca7a60f6dbfc44ac1a5568bebbcba3f437299c48407078fa0acb4b5f + rm -rf commit + ostree fsck --repo=repo -a Validating refs... Validating refs in collections... Enumerating commits... Verifying content integrity of 1 commit objects... fsck objects (1/4) [=== ] 25%In commits 6bfe549dca7a60f6dbfc44ac1a5568bebbcba3f437299c48407078fa0acb4b5f: fsck content object 4c2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7: Corrupted file object; checksum expected='4c2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7' actual='35a978c60a57d76b632d293c84b731e5e6e312fad73520888152a84d798d7472' fsck objects (4/4) [=============] 100% error: Repository corruption encountered + true + ostree checkout --repo=repo --require-hardlinks xattrtest checkout + cd checkout + getfattr -d -m - -- test.txt # file: test.txt security.selinux="system_u:object_r:etc_runtime_t:s0" user.librepo.checksum.mtime="0000" + stat test.txt File: test.txt Size: 12 Blocks: 8 IO Block: 4096 regular file Device: 0,46 Inode: 25915280 Links: 2 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Context: system_u:object_r:etc_runtime_t:s0 Access: 2024-11-25 13:59:22.024941996 -0800 Modify: 1969-12-31 16:00:00.000000000 -0800 Change: 2024-11-25 13:59:22.030941993 -0800 Birth: 2024-11-25 13:59:21.994942007 -0800 + cd .. + rm -rf checkout + ostree container encapsulate --repo=repo --cmd=/usr/bin/bash --label=containers.bootc=1 --label=ostree.bootable=true xattrtest dir:/var/home/USER/ostree-tests/check-corrupt/image sha256:b87004ace3ab73bb3c95f1fcfa5fe3ba6cdb31c2bd945cf88ef42355bd8df55f + rm -rf repo + ostree init --repo=repo --mode=bare + ostree container image pull repo ostree-unverified-image:dir:/var/home/USER/ostree-tests/check-corrupt/image layers already present: 0; layers needed: 1 (3.3 kB) error: Importing: Unencapsulating base: Importing commit: Importing object 4c/2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7.file: Processing content object 4c2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7: Importing regfile small: Writing content object: Corrupted file object; checksum expected='4c2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7' actual='35a978c60a57d76b632d293c84b731e5e6e312fad73520888152a84d798d7472' ```

Here's what the extended attributes look like in the container image:

Expand this to see interrogating the container ``` $ podman pull dir:$PWD/image Getting image source signatures Copying blob 02641bd1db4f done | Copying config 75b22a9daa done | Writing manifest to image destination 75b22a9daa1864a0f37b577d82c11e0acd61572e3f9f5d9ddb816120c7689d99 $ podman create --name ostree-check 75b22a9daa1864a0f37b577d82c11e0acd61572e3f9f5d9ddb816120c7689d99 34300a50f27a5e43627ede137f9194c298894dfd8e9d36ae7c387d9cc691ad3c $ cd `podman mount ostree-check` $ getfattr -d -m - -- test.txt # file: test.txt security.selinux="system_u:object_r:container_file_t:s0:c195,c980" $ stat test.txt File: test.txt Size: 12 Blocks: 8 IO Block: 4096 regular file Device: 0,105 Inode: 25916335 Links: 2 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Context: system_u:object_r:container_file_t:s0:c195,c980 Access: 1969-12-31 16:00:00.000000000 -0800 Modify: 1969-12-31 16:00:00.000000000 -0800 Change: 2024-11-25 14:24:20.525324527 -0800 Birth: 2024-11-25 14:24:20.525324527 -0800 $ strings sysroot/ostree/repo/objects/4c/2f5284359a960f16ae5f3cfb70534035ee0620c14234855975da1cda95ccc7.file-xattrs-link security.selinux system_u:object_r:etc_runtime_t:s0 user.librepo.checksum.mtime 0000 ```

Software versions and environment

The underlying filesystem is btrfs. I use the Fedora 41 kernel, not the rawhide one.

$ cat /etc/os-release | head -n2
NAME="Fedora Linux"
VERSION="Rawhide.20241124.n.0 (Kinoite Prerelease)"

$ uname -a
Linux hostname 6.11.8-300.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Nov 14 20:37:39 UTC 2024 x86_64 GNU/Linux

$ bootc --version
bootc 1.1.2

$ rpm-ostree --version
rpm-ostree:
 Version: '2024.9'
 Git: bd152998332c2b11af671447d3fd311319d5b3a8
 Features:
  - rust
  - compose
  - container
  - fedora-integration

$ ostree --version
libostree:
 Version: '2024.9'
 Git: 14e29775963d44bc533715efcf2362d6051756de
 Features:
  - inode64
  - initial-var
  - libcurl
  - libsoup3
  - gpgme
  - composefs
  - ex-fsverity
  - libarchive
  - selinux
  - openssl
  - sign-ed25519
  - libmount
  - systemd
  - release
  - p2p
cgwalters commented 1 week ago

Hmmmm.... https://github.com/ostreedev/ostree-rs-ext/pull/679 should only be affecting "layered" (i.e. non-ostree) paths. To be clear this container image is a "pure ostree" one, you haven't created any additional layers?

Remove all user. extended attributes from the image. This is the workaround I went with for now, only user.librepo. attributes are triggering this bug for me.

That seems sane as a workaround - there's no real reason to ship those xattrs anyways, right?

But yes...it does look like there's some sort of bug here.

prydom commented 1 week ago

This reproduces with the same result on rpm-ostree layered images as I'm running into this on my own image builds. This narrows the scope for the bug a bit since rpm-ostree calls the "pure ostree" path: https://github.com/coreos/rpm-ostree/blob/main/rust/src/container.rs#L519C36-L519C47. My workflow is to create a "pure ostree" image (layered/chunked with rpm-ostree) and then layer further software on top with buildah. Both the base image and the later layered images fail to pull.

It is just much harder to write a small reproduction script to share here using a real bootable image because I need a RPM database or to download a bootstrap image.

prydom commented 1 week ago

Also, on later images that build on the workaround base image that can be pulled using dnf5 - where the "user.librepo.*" xattrs are in the container tar stream itself - I get failures on ostree fsck -a for the RPM files and repodata/ XML files on the deployed machines.

The machines boot fine and the file contents are OK, so that doesn't matter as much but might cause users to become concerned if they layer xattrs during a container build.

EDIT: This is a different bug when I layer xattrs on a working base image. I can separately file that one if you want @cgwalters.

prydom commented 6 days ago

I think I can answer why this happens for the "pure ostree" case, @cgwalters. It seems the root cause has been in https://github.com/ostreedev/ostree for a while and it may just be some coincidental changes in libdnf5, librpm or librepo which caused this to start failing container pulls now.

libglnx will always sort the extended attributes returned from glnx_dfd_name_get_all_xattrs with canonicalize_xattrs(). This ensures we should always get the same content hash as long as the same file contents + uid/gid/perms/xattrs match even if the underlying filesystem returns a different order.

However, during commit the way get_final_xattrs() works is to first delete the security.selinux attribute with _ostree_filter_selinux_xattr() and then append the new label to the end of the gvariant array.

Linux capabilities have been working because "security.capabilities" always comes before "security.selinux". However the existence of any "user.*" attributes means that list will no longer be sorted and therefore the hash calculated at commit time will not match what is stored in the bare repo and what is copied into the encapsulated commit container image.

So, preserving system.* (e.g. ACLs) or trusted.* extended attributes would be bit by this same bug. I have reproduced the same failure to pull the container by adding an ACL.

# Create a new commit with just a single file with an extended attribute
mkdir commit
cd commit
echo "hello world" > test.txt
setfattr -n security.selinux -v "system_u:object_r:etc_runtime_t:s0" test.txt
setfacl -m "u:root:rw-" test.txt
setcap 'cap_net_bind_service=ep' test.txt
getfattr -d -m - -- *
stat test.txt
cd ..
prydom commented 15 hours ago

I have completed some end-to-end testing to confirm the behavior of bootc that I was encountering are root caused in and fully resolved by https://github.com/ostreedev/ostree/pull/3346. Extended attributes of all types stored in bare, bare-user, and bare-split-xattrs ostree repositories and those in derived layers' tar streams will be properly handled by both OSTree and Composefs when the commit from the PR is merged. There were no backwards compatibility concerns encountered during testing.

See https://github.com/ostreedev/ostree/pull/3346#issuecomment-2505266225 for details of the testing. I'm closing this issue now since there is nothing that needs to be changed in bootc or ostree-rs-ext.

The issue can be tracked in https://github.com/ostreedev/ostree/issues/3343 until the PR is merged.