katexochen opened this issue 1 year ago
Thanks for creating this issue @katexochen. Personally I feel all the goals that you mentioned are important: isolation, reproducibility, usability.
We likely should agree on what those goals mean to us.
Like, when we are talking about reproducible builds, do we mean bit-by-bit reproducibility? Do we want to achieve this for all components? It should be easy for the binaries; more effort will be needed for the container images, but it's doable. A reproducible podvm image? I don't think this is possible with packer.
I'm adding this to the agenda of the next peerpod meeting. :)
Related issue regarding the organization of podvm files/templates: https://github.com/confidential-containers/cloud-api-adaptor/issues/899
I've created https://github.com/confidential-containers/cloud-api-adaptor/pull/1057, which is one way to pin the versions used in the podvm build process, but it's more of a PoC at this point.
Reviving this thread.
With #1388 merged, we can start using versions.yaml instead of having hard-coded values everywhere. Good step 1. I'd like to discuss what we do next.
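For illustration, a version-pinning file along these lines is what such a setup typically centralizes (the field names below are hypothetical, not the actual schema introduced in #1388):

```yaml
# Hypothetical shape -- illustrative only, not the real schema from #1388
binaries:
  kata-agent:
    version: "3.2.0"
tools:
  packer:
    version: "1.9.4"
```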
Points of discussion:
3.1 Do we want each component to be cached separately?
3.2 Do we want to try setting up a cache for the Rust build packages?
3.3 Do we want all the build code to live in GitHub Actions, using act for development work instead of our current Docker-based flow?
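On 3.2, a Rust package cache would probably amount to a BuildKit cache mount rather than separate infrastructure; a sketch under that assumption (base image and paths are illustrative):

```dockerfile
# Cache mounts persist the cargo registry and target dir across builds
# without baking them into image layers (requires BuildKit).
FROM rust:1.72 AS builder
WORKDIR /src
COPY . .
RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/src/target \
    cargo build --release
```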
Thoughts? @bpradipt @katexochen @stevenhorsman @mkulke @surajssd
Thanks for picking this up. To make this actionable, I would suggest to untangle as much as possible.
1) We probably have to deal with the BUSL license issue (replacing packer and terraform) to be CNCF-compliant, but we should address this separately.
1) My understanding is that there is a particular problem with building images using QEMU + x360 arch + Ubuntu 22.04+. libguestfs looks intriguing, if we can limit ourselves to just provisioning a bunch of static binaries and configuration files on top of an image. At the moment there is some conditional logic in scripts which run on the VM during image build (misc-settings.sh). This would need to be extracted, and I'm not sure whether we can get around installing some dependencies.
2) I agree that caa_src is rather confusing, and not having the option would reduce complexity in the build process.
3.1) Unless that's a requirement, I wouldn't say so. Maybe it would speed up uncached builds, since the individual bins can be built in parallel, but I don't think that justifies the additional complexity. Having the static images in a container that is tagged by a versions.yaml hash would be fine, imo.
3.2) It's not hard to implement, but it would add clutter. Once we have a reliable way to cache the bins, we probably don't need it.
3.3) I would prefer build code in Makefiles and Dockerfiles; the debug cycle in GitHub Actions is pretty tough. (I never tried act, but I suppose it's less intuitive to set up and use than e.g. Docker builds.)
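The hash-tagging idea from 3.1 could be a few lines of shell; a sketch, assuming versions.yaml sits at the repo root (the image name is hypothetical, and a stand-in file is created here so the snippet is self-contained):

```shell
# Derive a content-addressed tag so the binaries image is only rebuilt
# when the pinned versions change.
printf 'kata-agent: "3.2.0"\n' > versions.yaml   # stand-in for the real file

TAG="$(sha256sum versions.yaml | cut -c1-12)"
IMAGE="quay.io/confidential-containers/podvm-binaries:${TAG}"
echo "$IMAGE"

# Rebuild only when the tag does not exist yet, e.g.:
#   docker manifest inspect "$IMAGE" >/dev/null 2>&1 || docker build -t "$IMAGE" .
```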
> - [...] Should we move from packer to tools like guestfish/virt-customize or linuxkit?
We have had good experience with mkosi. We recently upstreamed some patches and are now able to build bit-by-bit reproducible OS images for Fedora and Ubuntu. Builds are highly configurable, and a config can include conditionals to support different distros in one config.
> 3.1) Having the static images in a container that is tagged by a versions.yaml hash would be fine, imo.
Would say to do this and replace both builder and binaries stages with directly importing the right prebuilt artifacts.
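Importing prebuilt artifacts directly could look roughly like this in the Dockerfile (a sketch; the image name, tag, and binary paths are hypothetical):

```dockerfile
# Import prebuilt, version-pinned binaries instead of rebuilding them
# in a local builder stage.
FROM quay.io/confidential-containers/podvm-binaries:2c26b46b68ff AS binaries

FROM ubuntu:22.04
COPY --from=binaries /usr/local/bin/kata-agent /usr/local/bin/
```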
> 3.3) I would prefer build code in makefiles and Dockerfiles, the debug cycle is pretty tough (never tried ACT, but it's less intuitive to use/setup than e.g. Docker builds I suppose)
Agreed.
> 3.3) I would prefer build code in makefiles and Dockerfiles, the debug cycle is pretty tough (never tried ACT, but it's less intuitive to use/setup than e.g. Docker builds I suppose)
Just a quick add that I have tried using act, and it was pretty painful and the learning curve was a bit steep, whereas most of us have already been through that with Docker, so it makes sense to me to stay on that.
I use act for locally running the GitHub Actions and verifying a gh-action workflow if needed. For regular development I don't think it's needed, and it does have a steep learning curve. Using plain Dockerfiles/Makefiles is simpler and easier to work with.
@katexochen does mkosi allow building cloud-provider-specific images? A quick search didn't help.
> @katexochen does mkosi allow building of cloud provider specific images ? A quick search didn't help.
mkosi produces a plain image.raw. Uploading to the cloud provider needs to be done via the cloud provider's CLI (at Edgeless we use a small Go tool and the cloud provider SDKs to facilitate this). It sounds like a drawback, but it is much better this way: no need for CSP credentials in the image build process. I think you could even build the image fully offline if you prefetch the packages that should be installed.
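For context, that separate upload step can be as small as two CLI calls; a sketch assuming Azure, with hypothetical resource names (the commands are echoed as a dry run here; drop the leading `echo` to execute, and other clouds have analogous import commands):

```shell
# Hypothetical names -- adjust to your account/container/resource group.
ACCOUNT="peerpodimgs"
BLOB="podvm.raw"
BLOB_URL="https://${ACCOUNT}.blob.core.windows.net/images/${BLOB}"

# Upload the raw image as a page blob, then register it as an image.
echo az storage blob upload --account-name "$ACCOUNT" \
  --container-name images --file image.raw --name "$BLOB" --type page
echo az image create --resource-group peerpod-images --name podvm-image \
  --os-type Linux --source "$BLOB_URL"
```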
> mkosi produced a plain image.raw. Upload to the cloud provider needs to be done via the cloud provider CLI (we use a small Go tool and the cloud provider SDKs to facilitate this at Edgeless). Sounds like a drawback, but it is much better this way, no need for CSP credentials in that process. I think you could even build the image fully offline in case you prefetch the packages that should be installed.
I see. So the workflow is similar to the qcow2 images we create with packer (which we could switch to mkosi), followed by the cloud-provider-specific methods to create native images from the raw image. Do you recommend we start looking into using mkosi?
@katexochen does mkosi support a foreign target arch? I quickly scanned the docs but couldn't see an answer either way.
~hmm, apparently not~ https://github.com/systemd/mkosi/issues/138 ~I guess the same applies to all non-qemu-based build tools like libguestfs~ but it doesn't seem straightforward
I didn't try yet, but there is `Architecture=`, `--architecture=` in the docs (https://github.com/systemd/mkosi/blob/main/mkosi/resources/mkosi.md#distribution-section):

> `Architecture=`, `--architecture=`: The architecture to build the image for. A number of architectures can be specified, but which ones are actually supported depends on the distribution used and whether a bootable image is requested or not. When building for a foreign architecture, you'll also need to install and register a user mode emulator for that architecture.
>
> The following architectures can be specified: `alpha`, `arc`, `arm`, `arm64`, `ia64`, `loongarch64`, `mips64-le`, `mips-le`, `parisc`, `ppc`, `ppc64`, `ppc64-le`, `riscv32`, `riscv64`, `s390`, `s390x`, `tilegx`, `x86`, `x86-64`.
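Based on that doc section, a cross-build config might look like the following. This is an untested sketch: the distro and release values are illustrative, and per the docs a foreign arch additionally needs a registered user-mode emulator (e.g. qemu-user-static via binfmt_misc) on the build host.

```ini
# mkosi.conf -- illustrative values, not a verified working config
[Distribution]
Distribution=fedora
Release=38
Architecture=arm64
```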
I did some experiments today, but couldn't get cross-builds working yet. There are definitely some smaller upstream fixes needed. Will investigate further tomorrow.
There is currently quite some movement in the build system:
- #994 moves the build system to Docker first
- #1018 improves building local code (with some discussion about isolation in https://github.com/confidential-containers/cloud-api-adaptor/pull/994#issuecomment-1563755080)
- #598 aims to improve versioning (with the goal of supply-chain security and reproducible builds)

to name only a few.
I propose we discuss the goals we want to reach with the build system (isolation, reproducibility, usability, etc.) and the tools to reach these goals as well as the greater organization of our build in this thread.