lavabit / robox

The tools needed to robotically create/configure/provision a large number of operating systems, for a variety of hypervisors, using packer.

Use the proper QEMU binaries and don't enforce KVM acceleration #187

Closed. timschumi closed this PR 2 years ago.

timschumi commented 3 years ago

With alternative architectures getting more widespread and roboxes potentially getting support for those alternative architectures in the future, we should probably make sure that we're always using the correct QEMU binary.

The original qemu-kvm has been deprecated since 2013 (according to the QEMU wiki), when its functionality was merged into upstream QEMU. The (slightly more modern) wrapper that is usually called qemu-kvm is not necessarily available on all systems (not to mention at the same full path), and its only purpose is to add a KVM flag to the command line.
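
For illustration, on distributions that ship it, the qemu-kvm wrapper typically amounts to little more than the following (a sketch; the exact script varies by distro):

#!/bin/sh
# Delegate to the real system emulator, forcing KVM acceleration.
exec qemu-system-x86_64 -enable-kvm "$@"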

Since Packer is able to automatically determine whether KVM support is available (provided that it is neither enforced by a wrapper nor by the accelerator option in the JSON) and enables it if possible, we need neither the wrapper nor the option to explicitly enable it. This also helps when building on machines that don't support KVM for the chosen architecture.
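
Concretely, the settings in question are the following two keys in the *libvirt*.json templates (values taken from the sed commands later in this thread); this PR drops the first and replaces the second:

"accelerator": "kvm",
"qemu_binary": "/usr/libexec/qemu-kvm",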

ladar commented 3 years ago

@timschumi this seems trivial, but could be tricky. The Packer configs were all created on RHEL/CentOS. When I originally created them (years ago), I found that if I didn't point to the qemu-kvm binary, then acceleration wasn't being used, and everything ran slower.

Furthermore, my current primary desktop is running v8.3, which doesn't appear to even have the qemu-system-x86_64 binary. The Linux build machines are all running v7.9 and appear to have the qemu-system-x86_64 binary, but it's provided by an EPEL package. On the build robots, I use the QEMU packages from the Advanced Virtualization SIG, which provides QEMU 2.12 (versus the 1.5.3 provided by the official repo). Again, based on my testing years ago, this version was faster, and had support for desired features the main repo version lacked. I could be wrong, but I think the biggest feature was better TRIM support on SSDs, which makes a HUGE difference when zeroing disks.

I'd be more concerned with making sure KVM support isn't hard coded into the Vagrant boxes themselves, so that they work properly on all platforms; in particular on systems with hvf acceleration, and when trying to run the x86_64 Vagrant boxes on ARM systems. But I haven't had time to test any hvf platforms (macOS is the only one I know of that might offer this), and I don't have easy access to ARM hardware.

timschumi commented 3 years ago

Back when I created this Pull Request I specifically meant to ask you to confirm whether this is possible in your build setup in the first place, but I'm not sure where that comment went or if it even existed in the first place.

Anyways, this is in no way needed at the moment (hence the RFC), and even at the point where multiple (incompatible) architectures are supported it would just be a safeguard to prevent people from trying to build on the wrong architecture or using the wrong QEMU binary. Additionally, qemu-kvm is just something that I wanted to get rid of because it appears to be specific to certain distributions and packer seems to have reasonable KVM detection nowadays.

At some point in the future I'll look up which distributions offer which QEMU version/binary, and since when Packer has supported KVM detection.

ladar commented 3 years ago

I agree that removing the option entirely seems like the most viable option, but it will need to be carefully tested before I deploy it. It already takes ~24 hours to build all of the libvirt boxes, and that will only increase with the new box variants, so I'm wary of making this kind of change until I know it won't have a big impact on the build time.

I was just skimming the open PRs and issues now, but I'm hoping to devote a few days to the Robox project later in the month; time that I can use to add all these changes, and test/debug them. I'll leave this PR open until then, so I remember to test removing the qemu_binary option.

In the same vein, @timschumi if there is anything you want added/changed, submit the PR in the next 2-3 weeks, and I'll be sure to review/add it when I make all the other pending changes/updates. I'm planning to bump the box build version to 3.3.x as well, once all the updates/changes are incorporated.

ladar commented 2 years ago

@timschumi I saw you updated this PR. Do you know if VM acceleration is still being used when the qemu-system-x86_64 binary is used?

timschumi commented 2 years ago

I just got a notification that my PR supposedly had merge conflicts (it indeed had some, but those have been there for months), so I figured I should just rebase it to get it out of my notifications menu.

I'm reasonably certain that KVM will be autodetected and usable with this binary, since I tested the "Stop enforcing KVM" change on it, but to make sure, I'll get some build logs tomorrow.

ladar commented 2 years ago

@timschumi of particular importance is how it will work on CentOS 7 with the updated Virt SIG QEMU packages, since that's what the build robots use.

Of secondary importance is whether any embedded box config data will cause the vagrant-libvirt plugin to stop using acceleration by default.

That's why I haven't merged this yet. I just haven't had time to really do any testing.

timschumi commented 2 years ago

So I have tested this PR on CentOS 7 now. Freshly set up, and everything else was installed through ./res/providers/providers.sh.

It doesn't look like the Virtualization SIG provides anything else apart from /usr/libexec/qemu-kvm, as that binary is QEMU 2.12.0 from 2018. The default qemu-system-* binaries are QEMU 2.0.0 instead, from 2014.
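
For reference, the installed versions can be confirmed directly on the build host (binary paths per the comment above):

/usr/libexec/qemu-kvm --version
qemu-system-x86_64 --version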

That said, the QEMU 2.0.0 binaries do have support for KVM, and packer is able to automatically recognize and use that.

On the other hand, support for the detect-zeroes option is missing, which was only introduced shortly after QEMU 2.0.0 was released. Sadly, it doesn't seem like the Virtualization SIG plans to update those packages any further, so (assuming that we want to pull through with this change) the only way that I can see how we can get "proper" QEMU binaries with all the features that we want is by just compiling them ourselves (i.e. create ./res/providers/qemu.sh and then using that to provision the build servers).
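
A minimal sketch of what such a qemu.sh might look like (version number and target list are assumptions, not necessarily what the eventual script does):

#!/bin/bash
set -eu
VERSION="6.0.1" # assumed; use whatever release the build robots need
cd "$(dirname "$(readlink -f "$0")")"
curl -LO "https://download.qemu.org/qemu-${VERSION}.tar.xz"
tar xf "qemu-${VERSION}.tar.xz"
cd "qemu-${VERSION}"
# Build just the x86 system emulators the libvirt templates need.
./configure --target-list=x86_64-softmmu,i386-softmmu
make -j"$(nproc)"
make install # run as root, or prefix with sudo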

ladar commented 2 years ago

@timschumi I'll load it up on a server and build the generic boxes as 3.6.13 as a test.

ladar commented 2 years ago

Don't worry about the conflicts. I added a bunch of new magma variants (long overdue), and I'm doing the initial build/test using QEMU binaries built using your scripts. It's currently running. If it works, the sed commands are easy enough.

sed -i '/accelerator/d' *libvirt*json
sed -i 's/\/usr\/libexec\/qemu-kvm/qemu-system-x86_64/g' *libvirt.json
sed -i 's/\/usr\/libexec\/qemu-kvm/qemu-system-i386/g' *libvirt-x32.json

evgeni commented 2 years ago

I've just stumbled over this while trying to build a box on my Fedora 35 box, and I think we don't need to set qemu_binary explicitly at all. Packer should be able to figure that out automatically. At least for me, sed -i '/qemu_binary/d' generic-libvirt.json did work just fine to build ubuntu2204 ;)

Edit: according to https://www.packer.io/plugins/builders/qemu#qemu_binary, the default is qemu-system-x86_64 anyways (and code supports this thesis: https://github.com/hashicorp/packer-plugin-qemu/blob/7dd01a2c78a95eb89374de2ef8aa6130a3567482/builder/qemu/config.go#L501-L503)

ladar commented 2 years ago

We're currently testing this change. Unfortunately, on the build robots, the default QEMU binaries don't support unmapping empty space/zeros, which slows down the builds, which is why the KVM binary was forced. The change we're testing can be replicated with:

sed -i 's/\/usr\/libexec\/qemu-kvm/qemu-system-i386/g' *libvirt-x32.json
sed -i 's/\/usr\/libexec\/qemu-kvm/qemu-system-x86_64/g' *libvirt.json

We're using manually compiled QEMU 6.0.1 binaries. If everything passes, it will be part of the 4.0.0 release (along with the new box configs).
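
For context, the feature in question is QEMU's detect-zeroes drive option, which (combined with discard) lets zeroed blocks be unmapped from the backing image instead of written out. Roughly, the relevant fragment of a QEMU invocation looks like this (illustrative values, not the exact arguments Packer generates):

qemu-system-x86_64 -drive file=disk.qcow2,format=qcow2,if=virtio,discard=unmap,detect-zeroes=unmap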

ladar commented 2 years ago

Which reminds me, @timschumi (et al) with all the new box configs, and this change, I thought it would be appropriate to bump the version to 4.0.0. Which makes now a good time to submit/consider any other major changes. They'll be reflected by the major version number bump.

timschumi commented 2 years ago

@evgeni Another issue that we are looking to solve is that we want to ensure that each box gets built for the architecture that it's meant to be built for. The x32 boxes should only include i386-compatible stuff, so it gets built on qemu-system-i386. Similarly, the normal boxes should always be amd64, no matter what the host is running on (so I don't really want to rely on defaults).

@ladar I have worked on a small script overhaul about a year ago, which enabled strict error checking (i.e. -e and -u) everywhere (that includes fixing some stuff that we should have really known about). It wasn't finished (and it still isn't), but I feel like this could be a worthy candidate for inclusion in a 4.0.0 release. That said, I have no idea when I will be able to get it finished, so we might have to delay this PR a bit further.

ladar commented 2 years ago

@timschumi @evgeni I recently realized the following is currently being added to the bundled Vagrantfiles by packer. Not sure if this will change with the modified box files:

# The contents below were provided by the Packer Vagrant post-processor

Vagrant.configure("2") do |config|
  config.vm.provider :libvirt do |libvirt|
    libvirt.driver = "kvm"
  end
end

I know the Vagrant libvirt plugin has limited support for hvf as an accelerator, so not sure what happens in the different scenarios.
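
For anyone who needs to test on a host where KVM isn't available, the bundled setting can be overridden from a project Vagrantfile, since vagrant-libvirt also accepts the plain "qemu" driver (a hedged example; whether the plugin's hvf support applies here is exactly the open question above):

Vagrant.configure("2") do |config|
  config.vm.provider :libvirt do |libvirt|
    # Fall back to unaccelerated QEMU instead of the bundled "kvm" driver.
    libvirt.driver = "qemu"
  end
end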

The successfully built boxes with modified config were uploaded this morning as 3.6.13 if anyone wants to test.

ladar commented 2 years ago

> @ladar I have worked on a small script overhaul about a year ago, which enabled strict error checking (i.e. -e and -u) everywhere (that includes fixing some stuff that we should have really known about). It wasn't finished (and it still isn't), but I feel like this could be a worthy candidate for inclusion in a 4.0.0 release. That said, I have no idea when I will be able to get it finished, so we might have to delay this PR a bit further.

Are you referring to the management scripts (robox.sh/check.sh/providers.sh and the various upload scripts)? Or the box config modules/scripts?

I originally used bash -eux for almost all the scripts, but quickly got tired of ephemeral warnings/problems causing boxes to fail. I've since migrated a large portion of the scripts to using a retry function, especially scripts with network dependent commands like apt/apk/dnf/yum/pkg/zypper etc. (see the sketch after the next paragraph).

For the scripts where retrying isn't really relevant you might see an "error" function. That way I can append ; error to just the lines which should cause a fatal error.
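
A rough sketch of the two patterns described above (the actual functions in the robox scripts differ, notably around tput coloring; retry counts and the final commands are illustrative):

# Retry a network-dependent command a few times before giving up.
retry() {
  local count=0
  until "$@"; do
    count=$((count + 1))
    [ "${count}" -ge 10 ] && return 1
    sleep 10
  done
}

# Abort only when explicitly appended to a line via `; error`.
error() {
  if [ $? -ne 0 ]; then
    printf "command failed\n" >&2
    exit 1
  fi
}

retry dnf --assumeyes install qemu-img
some-critical-step ; error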

Unfortunately the tput command inside some of those functions has been causing problems, depending on how I run things. But I may finally have some shell script logic which will fix it. See the recently overhauled direct.sh for a possible template. I just want to test it more before pasting it everywhere, since it's rather long.

The pipeline has gotten so big at this point, though, that it's hard to catch problems unless they go fatal. At the same time, given the automated nature of the building, I want to minimize failures. Ultimately I'd like to add more box tests to the check script, and have it check the output for critical problems, rather than fail on every warning.

I recently tried using shellcheck to lint the direct.sh script, and cleaned it up quite a bit. In a perfect world, I'd add that to the CI pipeline and start verifying all the bash scripts. But based on the effort it took to fix the issues in direct.sh, it will take several hours (perhaps longer) to get all the scripts passing the lint checks. But if we did, it should fix a lot, like the typos I create when my mouse decides to randomly paste my clipboard into a part of the file that isn't visible. Or other various shell script issues which I don't realize are issues because they aren't triggered under normal circumstances.
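
A minimal sketch of what that workflow could look like (paths are assumptions about the repo layout):

# Lint a single script while iterating locally.
shellcheck direct.sh

# Lint every shell script in the tree, as a future CI step.
find . -name '*.sh' -print0 | xargs -0 shellcheck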

Anyways, I figure you have at least a week before I kick off the 4.0.0 run. I have a lot of issues with the new Magma configs that I need to fix (I think most of the problems are with JSON/HTTP configs being copied forward and not updated). I'd also like to include Fedora 36 in the 4.0.0 release, and that keeps getting pushed back. It's currently slated for May 10.

So if you do want to cleanup the config modules under scripts - it shouldn't conflict with what I'm doing (beyond possible changes to the magma.sh modules).

Also, I'm not opposed to removing some distro specific scripts in favor of more common scripts that use logic inside the script to adjust. My current take is that, for whatever a module does, you'd either need to consolidate all the variations into a single common script, or keep all of them separated. I want to avoid a situation where some boxes use a common vagrant.sh script, while others use distro specific versions.

timschumi commented 2 years ago

Regarding the driver setting in the Vagrantfile, it seems like packer still adds that, even if we don't enforce KVM at build time (though it's possible that it doesn't add that part if KVM wasn't in use during the build process).

As for the remaining boxes, I'll look into testing a few of them within the next week. Have there been any failed box builds? If not, that should be a pretty good indicator already that we haven't missed enabling any crucial features while building QEMU.

As for the script changes, my plan (or rather, my plan a year ago) was indeed to make the box configuration scripts propagate any errors. That said, I don't have that much time to extensively work on making all the box builds fully work with error checking (and the scripts without error checking don't have any obvious breakages either), so I don't mind keeping it on ice if you want to release 4.0.0 within the next week or two. But maybe I'll find a bit of time to consolidate some of the scripts that have more copy-paste than actual changes.

ladar commented 2 years ago

All of the generic libvirt boxes appear to be building fine after the change. I added new magma configs, which I need to finish/fix before kicking off the 4.0.0 run.

I have a couple of tasks high on my list right now, but hopefully I'll get those done, and still get the build kicked off before the end of the week.

ladar commented 2 years ago

@timschumi it looks like the OpenBSD 6 config doesn't work as-is with the new QEMU binaries. It failed during install, so we'll have to run it and check the console to see what error is showing up. My guess is it's a minor hardware compatibility issue. Just mentioning it here in case you can look into it before I can.

timschumi commented 2 years ago

@ladar I just test-built generic-openbsd6-libvirt, and I can't spot anything that is out of order. My setup should also match the build servers quite closely (at least in regards to the distribution and the installed packages). Maybe it's just a temporary failure?

ladar commented 2 years ago

@timschumi looks like the failure wasn't related to QEMU. The mirror server those configs were using removed all the files for releases older than OpenBSD 7.0.

timschumi commented 2 years ago

@ladar Mind if I rebase and reopen this? The build script for QEMU went missing.

timschumi commented 2 years ago

Eh, it doesn't seem like I'm able to reopen this myself. The commit is pushed, so let's see if it gets picked up automatically once the PR is open again.

EDIT: Alternatively, git fetch https://github.com/lavabit/robox abf09d710d7b5ff3de260826a3573f5f4ce73230 && git cherry-pick FETCH_HEAD should do the trick as well.

ladar commented 2 years ago

@timschumi there have been soooo many box additions/ISO changes that I figured it was easier to use my sed commands to fix the templates than to try to update the PR.

But I did want to merge in your QEMU script, it just slipped my mind because I ran it 2 weeks ago and forgot about it. It also seems I can't reopen the PR. Would you like to open up a new one with just the script?

As I recall, there were 3 things I planned to do with the QEMU script before merging. Add a #!/bin/bash line at the top (for my text editors, etc), and have it change to the res/providers/ directory (technically, to wherever the script is actually located). You can copy/paste the logic for this out of providers.sh. This way the tarball/build directory gets placed in that directory, rather than wherever the script is run from. Finally, it should clean up after itself, i.e. delete the tarball/build directory, if everything goes smoothly. See the sketch below.
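
A sketch of those changes (the directory-resolution line mirrors the approach described above for providers.sh; the cleanup globs are assumptions):

#!/bin/bash
# Run relative to the script's own location, so build artifacts land in
# res/providers/ no matter where the script is invoked from.
cd "$(dirname "$(readlink -f "$0")")" || exit 1

# ... download/compile/install steps go here ...

# Clean up the tarball and build directory on success.
rm -rf qemu-*.tar.xz qemu-*/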

I'm not sure if we should add the tarball/build path to the .gitignore file. I didn't do it for the packer and vagrant download/install functions, since those are designed to grab the latest version, which means the file name is constantly changing. But your script used hard coded versions, which is probably better for this situation, so I could land either way in this instance.

Either way, having those files in res/providers/ should reduce the chance they get committed by accident.