confidential-containers / cloud-api-adaptor

Ability to create Kata pods using cloud provider APIs aka the peer-pods approach
Apache License 2.0

Build system organization #1035

Open katexochen opened 1 year ago

katexochen commented 1 year ago

There is currently quite some movement in the build system:

to name only a few.

I propose we use this thread to discuss the goals we want to reach with the build system (isolation, reproducibility, usability, etc.), the tools to reach those goals, and the overall organization of our build.

bpradipt commented 1 year ago

Thanks for creating this issue, @katexochen. Personally, I feel all the goals you mentioned are important: isolation, reproducibility, usability.

katexochen commented 1 year ago

We likely should agree on what those goals mean to us.

Like, when we are talking about reproducible builds, are we talking about bit-by-bit reproducibility? Do we want to achieve this for all components? That should be easy for binaries. More effort will be needed for the container images, but it's doable. A reproducible podvm image? I don't think that is possible with Packer.
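
Bit-by-bit reproducibility as discussed here boils down to: build the same component twice from the same inputs and get identical bytes. A minimal sketch of such a check (the `build` function is a hypothetical stand-in for a real deterministic build command, not anything from this repo):

```shell
set -eu

# stand-in for a deterministic build step, e.g. a pinned-toolchain
# "make <binary>"; here it just emits fixed content so the sketch runs
build() {
  printf 'example payload\n' > "$1"
}

# build the same artifact twice into different files
build a.bin
build b.bin

# compare checksums: identical hashes == bit-by-bit reproducible
h1=$(sha256sum a.bin | cut -d' ' -f1)
h2=$(sha256sum b.bin | cut -d' ' -f1)
if [ "$h1" = "$h2" ]; then
  echo "bit-by-bit reproducible"
else
  echo "NOT reproducible"
fi
```

In practice the hard part is making the build command deterministic in the first place (pinned toolchains, no embedded timestamps or build paths); the comparison itself stays this simple.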

katexochen commented 1 year ago

I'm adding this to the agenda of the next peerpod meeting. :)

katexochen commented 1 year ago

Related issue regarding the organization of podvm files/templates: https://github.com/confidential-containers/cloud-api-adaptor/issues/899

tumberino commented 1 year ago

I've created https://github.com/confidential-containers/cloud-api-adaptor/pull/1057, which is one way to pin the versions used in the podvm build process. But this is more of a PoC at this point.

tumberino commented 10 months ago

Reviving this thread.

With #1388 merged, we have the ability to start using versions.yaml instead of having hard-coded values everywhere. A good step 1. I'd like to discuss what we do next.
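
To illustrate the idea of a single pinning file, here is a minimal sketch of resolving a version from a versions.yaml-style file in a build script. The file layout and key names below are hypothetical, not the repo's actual schema:

```shell
set -eu

# hypothetical versions.yaml with a flat "key: value" tools section
cat > versions.yaml <<'EOF'
tools:
  golang: 1.20.8
  rust: 1.72.0
EOF

# naive lookup without a YAML parser; good enough for flat entries
version() { grep "^  $1:" versions.yaml | sed 's/^.*: //'; }

GO_VERSION=$(version golang)
echo "pinned Go version: $GO_VERSION"   # prints: pinned Go version: 1.20.8
```

Build scripts and Dockerfiles can then consume the resolved value (e.g. as a `--build-arg`) instead of hard-coding it in several places.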

Points of discussion:

  1. In #1326 there is an ongoing discussion about the use of HashiCorp products after the license change, and we also have difficulties with our current approach when upgrading our Ubuntu image (see #1211). Should we move from Packer to tools like guestfish/virt-customize or linuxkit?

  2. #1148 suggests adding the ability to use local files instead of specifying a remote reference for the caa source. @katexochen has suggested that we should instead switch to only using the local-file approach to avoid the indirection of the version controls. #1388

  3. With #1391 we are attempting to build the podvm more regularly. This is a time- and resource-expensive task to do from scratch each time, so it would be very beneficial to cache and share as much as possible.

    3.1 Do we want each component to be cached separately?
    3.2 Do we want to try setting up a cache for the Rust build packages?
    3.3 Do we want all the build code to live in GitHub Actions and use act for development work instead of our current Docker style?

Thoughts? @bpradipt @katexochen @stevenhorsman @mkulke @surajssd

mkulke commented 10 months ago

Thanks for picking this up. To make this actionable, I would suggest untangling as much as possible.

1) We probably have to deal with the BUSL license issue (replacing Packer and Terraform) to be CNCF compliant, but we should address this separately.

1) My understanding is that there is a particular problem with building images using QEMU + x360 arch + Ubuntu 22.04+. libguestfs looks intriguing, if we can limit ourselves to just provisioning a bunch of static binaries and configuration files on top of an image. At the moment there is some conditional logic in scripts which run on the VM during image build (misc-settings.sh). This would need to be extracted, and I'm not sure whether we can get around installing some dependencies.

2) I agree that caa_src is rather confusing, and not having the option would reduce complexity in the build process.

3.1) Unless that's a requirement, I wouldn't say so. Maybe it would speed up uncached builds, since the individual bins can be built in parallel, but I don't think that justifies the additional complexity. Having the static images in a container that is tagged by a versions.yaml hash would be fine, imo.

3.2) It's not hard to implement, but it would add clutter. Once we have a reliable way to cache the bins, we probably don't need it.

3.3) I would prefer build code in makefiles and Dockerfiles, the debug cycle is pretty tough (never tried ACT, but it's less intuitive to use/setup than e.g. Docker builds I suppose)

katexochen commented 10 months ago
  1. [...] Should we move from packer to tools like guestfish/virt-customize or linuxkit?

We have had good experiences with mkosi. We recently upstreamed some patches and are now able to build bit-by-bit reproducible OS images for Fedora and Ubuntu. Builds are highly configurable, and a config can include conditionals to support different distros in one config.

3.1) Having the static images in a container that is tagged by a versions.yaml hash would be fine, imo.

I would say we do this and replace both the builder and binaries stages with directly importing the right prebuilt artifacts.
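
A sketch of the tagging scheme being discussed: derive the image tag from a hash of versions.yaml, so prebuilt artifacts are reused exactly as long as the pinned versions don't change. The image name and file contents below are illustrative, not real:

```shell
set -eu

# stand-in versions.yaml; in the real build this is the repo's pin file
printf 'golang: 1.20.8\nrust: 1.72.0\n' > versions.yaml

# content-addressed tag: first 12 hex chars of the file's sha256
TAG=$(sha256sum versions.yaml | cut -c1-12)
IMAGE="quay.io/example/podvm-binaries:$TAG"   # hypothetical image name
echo "$IMAGE"

# in CI one would then try the cache first, and only rebuild on miss:
#   docker pull "$IMAGE" || { build-binaries; docker push "$IMAGE"; }
```

Any change to a pinned version produces a new tag, so a stale cache can never be picked up by accident, and unchanged pins always hit the cache.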

3.3) I would prefer build code in makefiles and Dockerfiles, the debug cycle is pretty tough (never tried ACT, but it's less intuitive to use/setup than e.g. Docker builds I suppose)

Agreed.

stevenhorsman commented 10 months ago

3.3) I would prefer build code in makefiles and Dockerfiles, the debug cycle is pretty tough (never tried ACT, but it's less intuitive to use/setup than e.g. Docker builds I suppose)

Just a quick add: I have tried using act, and it was pretty painful and/or the learning curve was a bit steep, whereas most of us have already been through that with Docker, so it makes sense to me to stay on that.

bpradipt commented 10 months ago

I use act for running the GitHub Actions locally and verifying a gh-action workflow if needed. For regular development I don't think it's needed, and it does have a steep learning curve. Using plain Dockerfiles/Makefiles is simpler and easier to work with.

@katexochen does mkosi allow building of cloud-provider-specific images? A quick search didn't help.

katexochen commented 10 months ago

@katexochen does mkosi allow building of cloud-provider-specific images? A quick search didn't help.

mkosi produces a plain image.raw. Upload to the cloud provider needs to be done via the cloud provider's CLI (at Edgeless we use a small Go tool and the cloud provider SDKs to facilitate this). This sounds like a drawback, but it is much better this way: no CSP credentials are needed in that process. I think you could even build the image fully offline if you prefetch the packages that should be installed.

bpradipt commented 10 months ago

mkosi produces a plain image.raw. Upload to the cloud provider needs to be done via the cloud provider's CLI (at Edgeless we use a small Go tool and the cloud provider SDKs to facilitate this). This sounds like a drawback, but it is much better this way: no CSP credentials are needed in that process. I think you could even build the image fully offline if you prefetch the packages that should be installed.

I see. So the workflow is similar to the qcow2 images we create with Packer (which we could switch to mkosi): build the raw image, then use the cloud-provider-specific methods to create native images from it. Do you recommend we start looking into using mkosi?
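
To make the two-step flow concrete, here is a minimal sketch: a small mkosi.conf, with the image build and the provider-specific upload kept as separate steps. Section and key names follow the mkosi docs, but treat the exact values (and the commented-out commands) as assumptions rather than a verified recipe:

```shell
set -eu

# minimal, hypothetical mkosi config for an Ubuntu disk image
cat > mkosi.conf <<'EOF'
[Distribution]
Distribution=ubuntu
Release=jammy

[Output]
Format=disk
EOF

echo "mkosi.conf written"

# step 1 (not run here): mkosi            -> produces image.raw
# step 2 (not run here): provider CLI/SDK -> import image.raw as a
#   native image; no CSP credentials are needed during step 1
```

The key property from the discussion above is that step 1 needs no cloud credentials at all; only the separate upload step touches the provider.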

tumberino commented 10 months ago

@katexochen does mkosi support a foreign target arch? I quickly scanned the docs but couldn't see an answer either way.

mkulke commented 10 months ago

~hmm, apparently not~ https://github.com/systemd/mkosi/issues/138 ~I guess the same applies to all non-QEMU-based build tools like libguestfs~ but it doesn't seem straightforward

katexochen commented 10 months ago

I haven't tried it yet, but there is

Architecture=, --architecture=

: The architecture to build the image for. A number of architectures can be specified, but which ones are actually supported depends on the distribution used and whether a bootable image is requested or not. When building for a foreign architecture, you'll also need to install and register a user mode emulator for that architecture.

: The following architectures can be specified: alpha, arc, arm, arm64, ia64, loongarch64, mips64-le, mips-le, parisc, ppc, ppc64, ppc64-le, riscv32, riscv64, s390, s390x, tilegx, x86, x86-64.

in the docs: https://github.com/systemd/mkosi/blob/main/mkosi/resources/mkosi.md#distribution-section
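
Based on the quoted docs, a cross-build config would just set `Architecture=` in the Distribution section; the catch is the host-side prerequisite the docs mention. A hypothetical sketch (values are illustrative):

```shell
set -eu

# hypothetical mkosi config targeting a foreign architecture
cat > mkosi.conf <<'EOF'
[Distribution]
Distribution=fedora
Architecture=arm64
EOF

echo "cross-build config written"

# per the docs, the build host must also have a user-mode emulator
# registered for the foreign arch (e.g. qemu-user-static + binfmt_misc)
# before "mkosi" can run the foreign-arch package manager steps
```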

katexochen commented 10 months ago

I did some experiments today, but couldn't get cross-builds working yet. There are definitely some smaller upstream fixes needed. Will investigate further tomorrow.