confidential-containers / cloud-api-adaptor

Ability to create Kata pods using cloud provider APIs aka the peer-pods approach
Apache License 2.0

podvm: Understand and reduce podvm permutations #1890

Open stevenhorsman opened 4 days ago

stevenhorsman commented 4 days ago

At the moment we have a matrix of 4 possible options for podvm: (mkosi/packer) x (cloud-init/process-user-data). We then multiply this by base OSs too (ubuntu/fedora/rhel) (we will ignore OS version at the moment on the assumption we can sync on that?) and by the cloud providers that can support each option, so it explodes quite a lot and becomes complicated to understand and test. We want to reduce this, so we can minimise differences and duplicated code. One possible plan is:

  1. Identify all the variants of podvm builds we have and track who is using what
  2. Try and switch all podvm builds to use mkosi
  3. Switch the CI to mkosi and then deprecate packer
  4. Later investigate removing cloud-init and getting process-user-data to read the data from there if possible?
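
To give a rough feel for how quickly the matrix grows, and how much steps 2 to 4 would shrink it, here is a minimal Python sketch. It assumes every combination is possible, which overstates the real number, and it ignores architecture and OS version. The provider list is taken from the table in Part 1; the function name `podvm_variants` is hypothetical and not part of the repository.

```python
from itertools import product

# Dimensions named in this issue. The provider list comes from the Part 1 table.
BUILD_TOOLS = ["mkosi", "packer"]
PROVISIONERS = ["cloud-init", "process-user-data"]
BASE_OSES = ["ubuntu", "fedora", "rhel"]
PROVIDERS = ["aws", "azure", "docker", "ibmcloud", "libvirt", "powervs", "vsphere"]


def podvm_variants(build_tools=BUILD_TOOLS, provisioners=PROVISIONERS,
                   base_oses=BASE_OSES, providers=PROVIDERS):
    """Return every (tool, provisioner, os, provider) combination.

    Hypothetical helper: this is an upper bound, since not every
    provider supports every combination.
    """
    return list(product(build_tools, provisioners, base_oses, providers))


all_variants = podvm_variants()
# Steps 2 and 3 of the plan: only mkosi remains.
mkosi_only = podvm_variants(build_tools=["mkosi"])
# Step 4 of the plan: cloud-init is removed as well.
mkosi_pud = podvm_variants(build_tools=["mkosi"],
                           provisioners=["process-user-data"])

print(len(all_variants), len(mkosi_only), len(mkosi_pud))  # 84 42 21
```

Each dimension that is dropped halves the upper bound, which is why deprecating packer and then cloud-init is attractive even before counting architectures.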
stevenhorsman commented 4 days ago

Part 1 - Identify podvm builds

Note: I think we decided a while ago that mkosi didn't work so well with ubuntu, so we wanted to switch to a fedora-like stack and deprecate the Ubuntu based podvm builds upstream?

| Base OS | Architecture | Cloud provider(s) | mkosi/packer | cloud-init/process-user-data | Being used? | Being tested | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Ubuntu | amd64 | aws | packer | cloud-init | | | |
| Ubuntu | amd64 | azure | packer | cloud-init | | | |
| Ubuntu | amd64 | docker | packer | cloud-init | ✅ | ? | |
| Ubuntu | amd64 | ibmcloud | packer | cloud-init | | | |
| Ubuntu | amd64 | libvirt | packer | cloud-init | | | |
| Ubuntu | amd64 | powervs | packer | cloud-init | | | |
| Ubuntu | amd64 | vsphere | packer | cloud-init | | | Deprecate this? |
| Ubuntu | s390x | ibmcloud/libvirt | packer | cloud-init | | | |
| Fedora | amd64 | aws | mkosi | process-user-data | | | |
| Fedora | amd64 | azure | mkosi | process-user-data | | | |
| Fedora | amd64 | docker | mkosi? | <cloud-init/process-user-data> | | | |
| Fedora | amd64 | ibmcloud | mkosi? | cloud-init? | | | |
| Fedora | amd64 | libvirt | mkosi? | cloud-init? | | | |
| Fedora | amd64 | powervs | mkosi/packer | <cloud-init/process-user-data> | | | |
| Fedora | amd64 | vsphere | packer? | cloud-init | | | Deprecate this? |
| Fedora | s390x | ibmcloud/libvirt | mkosi? | cloud-init? | | | |
| RHEL | amd64 | aws | mkosi? | <cloud-init/process-user-data> | | | Any upstream testing? |
| RHEL | amd64 | azure | mkosi? | <cloud-init/process-user-data> | | | Any upstream testing? |
| RHEL | amd64 | docker | mkosi? | <cloud-init/process-user-data> | | | Any upstream testing? |
| RHEL | amd64 | ibmcloud | mkosi? | cloud-init? | | | Any upstream testing? |
| RHEL | amd64 | libvirt | mkosi? | cloud-init? | | | Any upstream testing? |
| RHEL | amd64 | powervs | mkosi/packer | <cloud-init/process-user-data> | | | Any upstream testing? |
| RHEL | amd64 | vsphere | packer | cloud-init | | | Deprecate this? |
| RHEL | s390x | ibmcloud/libvirt | <mkosi/packer> | cloud-init? | | | Any upstream testing? |

mkulke commented 4 days ago

did you mean "Any downstream testing?"

maybe we can have a "being tested" column

mkosi_x86_64 should work on both AWS + Azure.

Afaik all the packer images use cloud-init?

stevenhorsman commented 4 days ago

> did you mean "Any downstream testing?"

For RHEL I meant testing of the upstream podvm build, but that testing itself could be manual testing, or testing in a downstream environment (as I'm pretty confident we don't have any upstream automated testing for RHEL). We have some documentation for it though. I hope that helps clarify?

> maybe we can have a "being tested" column

Will do

> mkosi_x86_64 should work on both AWS + Azure.

These are both using process-user-data, I believe, and are primarily Fedora based in the upstream testing?

mkulke commented 4 days ago

> > did you mean "Any downstream testing?"
>
> For RHEL I meant testing of the upstream podvm build, but that testing itself could be manual testing, or testing in a downstream environment (as I'm pretty confident we don't have automated testing for RHEL). We have some documentation for it though. I hope that helps clarify?

not quite :) I guess we have either (automated) testing in the project or potentially "downstream" testing (e.g. a vendor product that uses CAA). One could argue that untested images, if they are consumed and tested downstream, should also be maintained downstream?

> These are both using process-user-data I believe and primarily fedora based in the upstream testing?

yes. I think we can just check for "cloud-init" yes/no. cloud-init will not work on dm-verity protected root filesystems, so we could also just check for dm-verity yes/no?
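
As a rough illustration of collapsing that column into a single yes/no flag, here is a minimal Python sketch. The `PodvmImage` type, the `provisioning_agent` helper, and the image names are hypothetical. The dm-verity branch follows the constraint stated above; mapping every image without dm-verity to cloud-init is an assumption made for the sketch, not something confirmed in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodvmImage:
    """Hypothetical description of a podvm image for this sketch."""
    name: str
    dm_verity: bool  # True if the root filesystem is dm-verity protected


def provisioning_agent(image: PodvmImage) -> str:
    """Derive the provisioning mechanism from the dm-verity flag.

    cloud-init does not work on a dm-verity protected rootfs, so such
    images have to use process-user-data. Treating all other images as
    cloud-init based is an assumption of this sketch.
    """
    return "process-user-data" if image.dm_verity else "cloud-init"


# Illustrative image names only.
images = [
    PodvmImage("fedora-mkosi-amd64", dm_verity=True),
    PodvmImage("ubuntu-packer-amd64", dm_verity=False),
]
for image in images:
    print(image.name, provisioning_agent(image))
```

With this framing the table needs one boolean column instead of a two-valued column that has to be kept consistent with the image's rootfs protection.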

mkulke commented 4 days ago

none of the mkosi images are being tested atm, afaict

mkulke commented 4 days ago

amd64 azure packer image is being tested

stevenhorsman commented 4 days ago

> not quite :) I guess we have either (automated) testing in the project or potentially "downstream" (e.g. a vendor product that uses CAA). One could argue that untested images, if they are consumed and tested downstream should also be maintained downstream?

So the grey area that I was hinting at was for when pure upstream versions were tested internally, e.g. for ibmcloud, we tested the pure upstream version, but due to a lack of publicly available resources those tests were done internally. I agree that if the versions are downstream then the downstream teams are responsible for maintenance (though we want to do our best not to break them, so it's potentially interesting). Sorry, I think I'm mostly overcomplicating an already complicated chart!