osbuild / bootc-image-builder

A container for deploying bootable container images.
https://osbuild.org
Apache License 2.0

centos-bootc/bootc-image-builder:latest build fedora-bootc:40 aarch64 ami image on x86_64 machine failed #619

Open henrywang opened 2 months ago

henrywang commented 2 months ago

centos-bootc/bootc-image-builder:latest build fedora-bootc:40 aarch64 ami image on x86_64 machine failed.

Running sudo podman run --rm -it --privileged --pull=newer --tls-verify=false --security-opt label=type:unconfined_t -v /var/lib/containers/storage:/var/lib/containers/storage --env AWS_ACCESS_KEY_ID=***** --env AWS_SECRET_ACCESS_KEY=***** quay.io/centos-bootc/bootc-image-builder:latest --type ami --target-arch aarch64 --aws-ami-name bootc-bib-fedora-40-aarch64-6pjc --aws-bucket bootc-bib-images-test --aws-region us-west-2 --rootfs xfs quay.io/bootc-test/*****:6pjc failed with the following error:

org.osbuild.ostree.deploy.container: df91eb12effd7700c49a6285f6f8f991d0e4949bcfcb68fedf80de475fa37a40 {
  "osname": "default",
  "kernel_opts": [
    "rw",
    "console=tty0",
    "console=ttyS0"
  ],
  "target_imgref": "ostree-unverified-registry:quay.io/bootc-test/*****:6pjc",
  "rootfs": {
    "label": "root"
  },
  "mounts": [
    "/boot",
    "/boot/efi"
  ]
}
ostree container image deploy --imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]5e4fa291f946c587a8372bdeb862a2c3041ee8253824da387c7ef4ca9c8b1452 --stateroot=default --target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:6pjc --karg=rw --karg=console=tty0 --karg=console=ttyS0 --karg=root=LABEL=root --sysroot=/run/osbuild/tree
error: Performing deployment: Creating importer: Function not implemented (os error 38)
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 72, in <module>
    r = main(stage_args["tree"],
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 67, in main
    ostree_container_deploy(tree, inputs, osname, target_imgref, kopts)
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 41, in ostree_container_deploy
    ostree.cli("container", "image", "deploy",
  File "/run/osbuild/lib/osbuild/util/ostree.py", line 205, in cli
    return subprocess.run(["ostree"] + args,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ostree', 'container', 'image', 'deploy', '--imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]5e4fa291f946c587a8372bdeb862a2c3041ee8253824da387c7ef4ca9c8b1452', '--stateroot=default', '--target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:6pjc', '--karg=rw', '--karg=console=tty0', '--karg=console=ttyS0', '--karg=root=LABEL=root', '--sysroot=/run/osbuild/tree']' returned non-zero exit status 1.

⏱  Duration: 26s
manifest - failed

bib image digest: sha256:86d0088e161db6a189d89065a233c83f8c4e63414f1ab50b4b36075ad60966db

Detailed log: https://artifacts.osci.redhat.com/testing-farm/6ba684cb-76ba-4034-b817-74990d9bcbd7/

mvo5 commented 2 months ago

I looked at this today and it turns out to be another missing syscall in qemu-user:

$ git diff
diff --git a/osbuild/buildroot.py b/osbuild/buildroot.py
index 02b1b9f2..7dad22e8 100644
--- a/osbuild/buildroot.py
+++ b/osbuild/buildroot.py
@@ -306,6 +306,7 @@ class BuildRoot(contextlib.AbstractContextManager):

         # Setup a new environment for the container.
         env = {
+            "QEMU_LOG": "unimp",
             "container": "bwrap-osbuild",
             "LC_CTYPE": "C.UTF-8",
             "PATH": "/usr/sbin:/usr/bin",
$ sudo python3 -m osbuild --libdir . /tmp/c9-arm64.manifest  --output-directory /tmp/out --export image
...
Unsupported syscall: 437
error: Performing deployment: Creating importer: Function not implemented (os error 38)
...
$ scmp_sys_resolver -a aarch64 437
openat2

I will look into providing an upstream fix to qemu.
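The `Unsupported syscall: 437` line above only shows up when `QEMU_LOG=unimp` is set. As a minimal sketch (not part of osbuild), the snippet below extracts those syscall numbers from captured log text so each one can be passed to `scmp_sys_resolver` by hand. The function name `unsupported_syscalls` and the sample text are hypothetical; only the line format is taken from the output above.

```python
import re

# Line format as seen in the log above, e.g. "Unsupported syscall: 437".
_UNSUPPORTED_RE = re.compile(r"^Unsupported syscall: (\d+)[ \t]*$", re.MULTILINE)


def unsupported_syscalls(log_text: str) -> list:
    """Return the distinct syscall numbers reported as unsupported, in first-seen order."""
    seen = []
    for match in _UNSUPPORTED_RE.finditer(log_text):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


if __name__ == "__main__":
    # Hypothetical sample built from the two relevant lines of the log above.
    sample = (
        "Unsupported syscall: 437\n"
        "error: Performing deployment: Creating importer: "
        "Function not implemented (os error 38)\n"
    )
    for number in unsupported_syscalls(sample):
        # Resolve each number manually, e.g.: scmp_sys_resolver -a aarch64 437
        print(number)
```

Deduplicating keeps the list short when the same missing syscall is hit repeatedly during a build.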

mvo5 commented 2 months ago

I made some progress on qemu-user and https://github.com/qemu/qemu/compare/master...mvo5:support-openat2-clean?expand=1 is my current WIP branch. With that I can do a cross arch build again and the test works:

$ sudo pytest -s -vv './test/test_build.py::test_image_boots[quay.io/centos-bootc/centos-bootc:stream9,raw,arm64]'
...
CentOS Stream 9
Kernel 5.14.0-503.el9.aarch64 on an aarch64

enp0s1: 10.0.2.15 fe80::c432:7d1f:4347:2a52
localhost login: 
...
PASSED

[edit: also sent to the qemu-devel list now (link)]

henrywang commented 2 months ago

The fedora-bootc:41 cross-arch build fails with this error too. It should be the same cause. Thanks!

org.osbuild.ostree.deploy.container: c3b6f76397267352af8b8372dfced2464874109e7a5f7897f5a2fe8118686fc8 {
  "osname": "default",
  "kernel_opts": [
    "rw",
    "console=tty0",
    "console=ttyS0"
  ],
  "target_imgref": "ostree-unverified-registry:quay.io/bootc-test/*****:5jij",
  "rootfs": {
    "label": "root"
  },
  "mounts": [
    "/boot",
    "/boot/efi"
  ]
}
ostree container image deploy --imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]505e426390213d8f8b2cf4578e86c8f413b242a241bb3dfa6103de76a13bd820 --stateroot=default --target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:5jij --karg=rw --karg=console=tty0 --karg=console=ttyS0 --karg=root=LABEL=root --sysroot=/run/osbuild/tree
error: Performing deployment: Creating importer: Function not implemented (os error 38)
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 72, in <module>
    r = main(stage_args["tree"],
             stage_args["inputs"],
             stage_args["options"])
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 67, in main
    ostree_container_deploy(tree, inputs, osname, target_imgref, kopts)
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 41, in ostree_container_deploy
    ostree.cli("container", "image", "deploy",
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               *extra_args, sysroot=tree, *kargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/lib/osbuild/util/ostree.py", line 205, in cli
    return subprocess.run(["ostree"] + args,
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^
                          encoding="utf8",
                          ^^^^^^^^^^^^^^^^
                          stdout=subprocess.PIPE,
                          ^^^^^^^^^^^^^^^^^^^^^^^
                          input=_input,
                          ^^^^^^^^^^^^^
                          check=True)
                          ^^^^^^^^^^^
  File "/usr/lib64/python3.13/subprocess.py", line 577, in run
    raise CalledProcessError(retcode, process.args,
                             output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ostree', 'container', 'image', 'deploy', '--imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]505e426390213d8f8b2cf4578e86c8f413b242a241bb3dfa6103de76a13bd820', '--stateroot=default', '--target-imgref=ostree-unverified-registry:quay.io/bootc-test/*****:5jij', '--karg=rw', '--karg=console=tty0', '--karg=console=ttyS0', '--karg=root=LABEL=root', '--sysroot=/run/osbuild/tree']' returned non-zero exit status 1.

chunfuwen commented 2 months ago

This is easily reproduced on Fedora 40 with a cross-arch build using the command below:

sudo podman run --rm -it --privileged --pull=newer --security-opt label=type:unconfined_t -v /var/lib/libvirt/images/output:/output -v /var/lib/libvirt/images/config.json:/config.json   -v /var/lib/libvirt/images/auth.json:/run/containers/0/auth.json  quay.io/centos-bootc/bootc-image-builder:latest  --type qcow2 --tls-verify=true  --config /config.json  --target-arch=aarch64  quay.io/centos-bootc/centos-bootc:stream10
...
ostree container image deploy --imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]f03bba6c34db7fe7454371f32230f12349358da7872bbb461ad72da5048cb01d --stateroot=default --target-imgref=ostree-unverified-registry:quay.io/centos-bootc/centos-bootc:stream10 --karg=rw --karg=console=tty0 --karg=console=ttyS0 --karg=root=LABEL=root --sysroot=/run/osbuild/tree
error: Performing deployment: Creating importer: Function not implemented (os error 38)
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 72, in <module>
    r = main(stage_args["tree"],
        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 67, in main
    ostree_container_deploy(tree, inputs, osname, target_imgref, kopts)
  File "/run/osbuild/bin/org.osbuild.ostree.deploy.container", line 41, in ostree_container_deploy
    ostree.cli("container", "image", "deploy",
  File "/run/osbuild/lib/osbuild/util/ostree.py", line 205, in cli
    return subprocess.run(["ostree"] + args,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ostree', 'container', 'image', 'deploy', '--imgref=ostree-unverified-image:containers-storage:[overlay@/run/osbuild/containers/storage+/run/containers/storage]f03bba6c34db7fe7454371f32230f12349358da7872bbb461ad72da5048cb01d', '--stateroot=default', '--target-imgref=ostree-unverified-registry:quay.io/centos-bootc/centos-bootc:stream10', '--karg=rw', '--karg=console=tty0', '--karg=console=ttyS0', '--karg=root=LABEL=root', '--sysroot=/run/osbuild/tree']' returned non-zero exit status 1.

⏱  Duration: 68s
manifest - failed
Failed
2024/08/31 09:29:06 error: cannot run osbuild: running osbuild failed: exit status 1

mvo5 commented 2 months ago

Thanks, I'm 75-80% confident that https://lists.nongnu.org/archive/html/qemu-devel/2024-09/msg00976.html will fix this. To be sure I would have to run it again with "QEMU_LOG": "unimp". Unfortunately we cannot make that the default, because it complains about some "harmless" unimplemented ioctls/syscalls, and that is enough to taint the output.

cdrage commented 2 months ago

I'm getting the same error even when not doing a cross-arch build, see: https://github.com/osbuild/bootc-image-builder/issues/641

~@cgwalters unsure if this is also related too?~

~would it be 2 fixes, 1 for fixing bootc in centos image, other is the qemu update?~

EDIT: You are right. cross-arch is the only part that is failing. Creating the arch native to the system works fine. Only when it's building amd64 it runs into the qemu issue.
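Since only builds that go through emulation hit the error, it can help to check up front whether a given target is cross-arch. The following is a rough sketch under stated assumptions: `normalize_arch` and `is_cross_build` are hypothetical helpers, the alias table covers only the two architectures mentioned in this thread, and none of it reflects how bootc-image-builder itself makes the decision.

```python
import platform
from typing import Optional

# Container tooling tends to say amd64/arm64, while the kernel reports x86_64/aarch64.
_ARCH_ALIASES = {
    "amd64": "x86_64",
    "x86_64": "x86_64",
    "arm64": "aarch64",
    "aarch64": "aarch64",
}


def normalize_arch(name: str) -> str:
    """Map known aliases to one spelling; unknown names pass through lowercased."""
    key = name.strip().lower()
    return _ARCH_ALIASES.get(key, key)


def is_cross_build(target_arch: str, host_arch: Optional[str] = None) -> bool:
    """True when the target differs from the host, i.e. emulation would be involved."""
    host = host_arch if host_arch is not None else platform.machine()
    return normalize_arch(target_arch) != normalize_arch(host)


if __name__ == "__main__":
    # Compare the --target-arch value from the commands above with this machine.
    print(is_cross_build("aarch64"))
```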

mvo5 commented 1 month ago

Fwiw, I added a copr repo with a patched qemu in https://copr.fedorainfracloud.org/coprs/michaelvogt/qemu-user-with-openat2/ - with that package installed cross build from amd64->arm64 works again for me (but no word from upstream yet unfortunately about integration into qemu proper)

henrywang commented 1 month ago

> Fwiw, I added a copr repo with a patched qemu in https://copr.fedorainfracloud.org/coprs/michaelvogt/qemu-user-with-openat2/ - with that package installed cross build from amd64->arm64 works again for me (but no word from upstream yet unfortunately about integration into qemu proper)

Thanks for fixing this issue. I'll update qemu with your copr build and try again.

cdrage commented 1 month ago

@henrywang @mvo5

Just an update, after testing, I was able to build an amd64 image with Rosetta disabled on macOS within Podman Machine settings.

It is MUCH slower to build, but at least it is building!

Unsure why disabling it is making it work, but I'm able to finally make the amd64 binaries.

Ideally it'd be good to see why this isn't working with Rosetta enabled / we should debug qemu output.

MoralCode commented 1 week ago

Seems like this change has been merged upstream according to https://github.com/osbuild/bootc-image-builder/issues/639#issuecomment-2415945399

mvo5 commented 1 week ago

Yes, this is fixed upstream and got cherry-picked into f41 via https://src.fedoraproject.org/rpms/qemu/pull-request/70 so this should work with a f41 machine again.
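One possible way to check whether a particular qemu-user build has the fix is to call `openat2` directly and look for `ENOSYS`, which is the "os error 38" from the logs above. This is only a sketch: it assumes Linux with a libc that exports `syscall()`, it uses the syscall number 437 resolved earlier in this thread, and the names `OpenHow`, `classify` and `probe_openat2` are made up for illustration. It only says something about qemu when run under emulation (for example inside an aarch64 container on an x86_64 host); run natively it just exercises the host kernel.

```python
import ctypes
import errno
import os
import sys

SYS_OPENAT2 = 437  # number resolved via scmp_sys_resolver earlier in this thread
AT_FDCWD = -100    # "relative to the current working directory"


class OpenHow(ctypes.Structure):
    # Mirrors struct open_how from <linux/openat2.h>: three 64-bit fields.
    _fields_ = [
        ("flags", ctypes.c_uint64),
        ("mode", ctypes.c_uint64),
        ("resolve", ctypes.c_uint64),
    ]


def classify(err: int) -> str:
    """Turn an errno value (0 meaning success) into a short verdict."""
    if err == 0:
        return "supported"
    if err == errno.ENOSYS:
        return "unimplemented"
    return "other: " + errno.errorcode.get(err, str(err))


def probe_openat2(path: bytes = b"/") -> str:
    """Attempt a read-only openat2() on path and report how the call went."""
    if not sys.platform.startswith("linux"):
        return "other: not Linux"
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    how = OpenHow(flags=os.O_RDONLY | os.O_CLOEXEC, mode=0, resolve=0)
    ctypes.set_errno(0)
    # syscall() is variadic, so every argument is passed with an explicit ctypes type.
    fd = libc.syscall(
        ctypes.c_long(SYS_OPENAT2),
        ctypes.c_long(AT_FDCWD),
        ctypes.c_char_p(path),
        ctypes.byref(how),
        ctypes.c_size_t(ctypes.sizeof(how)),
    )
    if fd >= 0:
        os.close(fd)
        return classify(0)
    return classify(ctypes.get_errno())


if __name__ == "__main__":
    print(probe_openat2())
```

A result of "unimplemented" under emulation would point to a qemu-user without the openat2 patch; any "other" result (for example a seccomp denial) is inconclusive.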