Closed zhlhahaha closed 11 months ago
cc: @andreabolognani @rmohr @dhiller @xpivarc @brianmcarey
There is a problem building the fedora-with-test-tooling image. The build now fails with the following error message:
virt-sysprep: error: libguestfs error: inspect_os: mount exited with status 32: mount: /tmp/btrfsJWefGz: unknown filesystem type 'btrfs'.
The likely root cause is that the host kernel lacks btrfs support, while virt-sysprep needs to mount the guest's btrfs volumes.
During the preparation of the VM image, virt-sysprep inspects the guest and uses libguestfs to mount all volumes of the guest VM. Here is the filesystem layout within the guest VM:
[root@localhost fedora]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sr0 11:0 1 366K 0 rom
zram0 251:0 0 1.9G 0 disk [SWAP]
vda 252:0 0 5G 0 disk
├─vda1 252:1 0 1M 0 part
├─vda2 252:2 0 1000M 0 part /boot
├─vda3 252:3 0 100M 0 part /boot/efi
├─vda4 252:4 0 4M 0 part
└─vda5 252:5 0 3.9G 0 part /home
[root@localhost fedora]# cat /etc/fstab
UUID=a280b604-6023-4ba5-bb9e-80d612f84b0d / btrfs subvol=root,compress=zstd:1 0 0
UUID=c2457f56-74ee-4fb3-9748-b79bb5f6c1bc /boot ext4 defaults 1 2
UUID=6C81-19BE /boot/efi vfat defaults,uid=0,gid=0,umask=077,shortname=winnt 0 2
UUID=a280b604-6023-4ba5-bb9e-80d612f84b0d /home btrfs subvol=home,compress=zstd:1 0 0
As you can observe, the filesystem of some volumes is btrfs. However, it appears that btrfs is not included in the list of /proc/filesystems within the builder container, which is supposed to be the same as the host's list. For more information, please refer to the build log https://prow.ci.kubevirt.io/view/gs/kubevirt-prow/pr-logs/pull/kubevirt_kubevirtci/1060/check-provision-fedora-with-test-tooling/1683766811591446528
We may need to update the hosts or add the btrfs kernel module to them to make the script work. Do you have any suggestions, @andreabolognani?
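For anyone who wants to reproduce the check described above, here is a minimal sketch that compares the filesystem types in a guest's fstab against the list the running kernel reports. The function name `missing_fs` is made up for illustration. One caveat: /proc/filesystems only lists filesystems that are currently registered, so a module that is installed but not yet loaded will also show up as missing.

```shell
#!/bin/sh
# missing_fs FSTAB [FS_LIST]   (hypothetical helper, not part of kubevirtci)
# Print each filesystem type used in FSTAB that is not listed in FS_LIST.
# FS_LIST defaults to /proc/filesystems of the running kernel.
missing_fs() {
    fstab="$1"
    fs_list="${2:-/proc/filesystems}"
    # The third fstab field is the filesystem type; skip comment lines.
    awk '!/^[[:space:]]*#/ && NF >= 3 { print $3 }' "$fstab" | sort -u |
    while read -r fs; do
        # In /proc/filesystems the type name is the last field on each line.
        if ! awk -v fs="$fs" '$NF == fs { found = 1 } END { exit !found }' "$fs_list"; then
            echo "$fs"
        fi
    done
}

# Run directly as: ./missing_fs.sh /path/to/guest/fstab
if [ "$#" -ge 1 ]; then
    missing_fs "$@"
fi
```

Run inside the builder container against a copy of the guest's /etc/fstab shown above, this would be expected to print `btrfs` if the kernel really has no btrfs support registered.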
@zhlhahaha I'm not familiar with the actual hardware behind prow, but it sure looks like the job might be running on a host that doesn't have btrfs support.
If that's the case, I don't think there's much we can do except trying to get it reprovisioned with some OS that includes btrfs support.
I think we're only running into this now because the cloud images that we've been using so far are for Fedora 32, when btrfs was not yet the default.
Yes, you are right. I do not hit this issue when building fedora-realtime, which is based on Fedora 32. The issue happens with both Fedora 35 and Fedora 38.
Hi @brianmcarey, is it possible to add the btrfs module to the host OS behind prow? Or do you have any other suggestions?
Hi @brianmcarey, here is the script I run on my local system to build and publish the image. Hopefully this will help.
$ docker run --privileged --rm -d --name builder -e CONSOLE=true -e DEBUG=false quay.io/kubevirtci/vm-image-builder:v20230607-9021afd sleep infinity
$ docker exec -it builder /bin/bash
$ git clone https://github.com/kubevirt/kubevirtci.git
$ cd kubevirtci/cluster-provision/images/vm-image-builder/
# The current image URL for x86_64 is no longer available and needs to be updated.
# Here I use the patch from https://github.com/kubevirt/kubevirtci/pull/1060;
# alternatively, you can update the base image to Fedora 38.
$ wget https://github.com/kubevirt/kubevirtci/pull/1060/commits/f1477b5cf00fca6139007d6f84c8a671ac727ee7.patch
# configure git user.name and user.email before applying the patch
$ git am f1477b5cf00fca6139007d6f84c8a671ac727ee7.patch
# login to the container image repository
$ podman login docker.io/zhlhahaha
# build and push the multi-arch image
$ start_libvirtd.sh
$ ./publish-multiarch-containerdisk.sh example zhlhahaha docker.io
@zhlhahaha sorry for only getting back to you now, but I was looking into this during the day. The workloads cluster is OpenShift-based, and OpenShift doesn't have btrfs support.
To get these published in the immediate term, I can build and publish them locally from here.
We may have to look at moving these images to CentOS Stream going forward.
@zhlhahaha The following images have been built and published:
quay.io/kubevirtci/fedora-with-test-tooling:v20230726-3a66690
quay.io/kubevirtci/fedora-realtime:v20230726-3a66690
Thanks Brian!
@brianmcarey does OpenShift not support btrfs at all, or is that a limitation that could be addressed by e.g. upgrading to a newer release?
Either way, let's make sure we don't lose track of this. Uploading locally-built images obviously works fine in a pinch, but I think we want to move away from that as much as possible and build everything in a controlled environment as part of a formally defined CI job.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
@zhlhahaha does anything still need to happen here, or can we close the issue?
I think we can close this issue. Thanks, @andreabolognani.
/close
@zhlhahaha: Closing this issue.
@andreabolognani Sorry I meant to get back to you on this but let it slip - I looked into it at the time and I don't believe there is support for btrfs in RHCOS so I think that is where the limitation is.
I will add a task to our backlog to look at where we could run these builds or if there is some way of working around this issue.
@brianmcarey don't worry about it :)
The CentOS Stream 9 cloud images are using xfs instead of btrfs, so maybe switching over could be an alternative way of handling things? I'm not sure whether all software that we want to be in the test images is available in CentOS Stream / EPEL though.
Another approach could be to ditch the cloud images and create our own from scratch using virt-install and a kickstart file. That way we'd have full control over the contents, including the filesystem used. That'd require a non-trivial amount of work though.
If we could make the problem go away by just changing the host OS, that would of course be a lot more convenient ;)
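To make the kickstart idea a bit more concrete, here is a rough sketch of a helper that writes only the partitioning part of a kickstart file, forcing a non-btrfs filesystem. The helper name `write_kickstart` and the file name are hypothetical, and this is a fragment under stated assumptions, not a complete or tested kickstart: packages, users, network and so on would still be needed.

```shell
#!/bin/sh
# write_kickstart FILE [FSTYPE]   (hypothetical helper for illustration)
# Write a partitioning-only kickstart fragment that uses FSTYPE (default: xfs)
# instead of the distribution default.
write_kickstart() {
    ks_file="$1"
    fstype="${2:-xfs}"
    cat > "$ks_file" <<EOF
# Partitioning fragment only; a real kickstart needs more sections.
zerombr
clearpart --all --initlabel
autopart --type=plain --fstype=$fstype
EOF
}

# Run directly as: ./write_kickstart.sh ks.cfg xfs
if [ "$#" -ge 1 ]; then
    write_kickstart "$@"
fi
```

The resulting file could then be handed to virt-install, for example via `--initrd-inject` together with an `inst.ks=file:/ks.cfg` entry in `--extra-args`. The exact invocation would depend on the install tree we choose.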
In order to have well-rounded e2e tests on the Arm64 platform, we need to build multi-arch fedora-realtime and fedora-with-test-tooling images, which are used in many e2e tests.
I have submitted a patch series to make this work. There is some discussion on this in https://github.com/kubevirt/project-infra/pull/2630. To make the build process work, here are the steps and corresponding PR links: