canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Introduce a `raw.qemu.config` option to override `qemu.conf` #9766

amcduffee closed this issue 2 years ago

amcduffee commented 2 years ago

Required information

Feature Request

LXD is very close to being able to host macOS virtual machines. A few changes are required in the generated qemu.conf in order to support a macOS guest:

1. Add an ich9-ahci SATA controller and connect the block devices to it instead of SCSI. Remove the qemu_pcie8 device, since it conflicts with the SATA controller slot and function.
2. Allow an alternative vGPU driver (e.g. VGA or qxl-vga) instead of the PCIe virtio-vga.
3. Enable USB and add usb-kbd and usb-tablet devices so keyboard and mouse work over SPICE.

Maybe a few more LXD configuration options can be added to enable these alternative behaviors, for example:

config:
  qemu.gpu.driver: qxl-vga
  qemu.block.driver: sata
  qemu.usb: true
  qemu.usb.keyboard: true
  qemu.usb.tablet: true

I am not sure the configuration keys above are the best or cleanest way to do this. I am open to other ideas that would allow more control over the generated qemu.conf, or some way to remove or override the generated INI sections.

Here is the discussion thread with a lot more information about my discovery process on how to get macOS working in LXD: https://discuss.linuxcontainers.org/t/macos-can-i-test-custom-qemu-system-x86-64-command-inside-lxd-snap/13063/3

stgraber commented 2 years ago

In general, we're really not keen on adding knobs to control all the different virtualized devices, as each of those multiplies the number of combinations to validate and support. It also increases the attack surface by introducing fully virtualized devices (rather than relying on the much simpler virtio).

One approach I mentioned in similar issues in the past is that we could add a raw.qemu.config or something along those lines which would let you override the entire qemu.conf with whatever you want.

amcduffee commented 2 years ago

> In general, we're really not keen on adding knobs to control all the different virtualized devices, as each of those multiplies the number of combinations to validate and support. It also increases the attack surface by introducing fully virtualized devices (rather than relying on the much simpler virtio).
>
> One approach I mentioned in similar issues in the past is that we could add a raw.qemu.config or something along those lines which would let you override the entire qemu.conf with whatever you want.

I understand your concerns about too many knobs and the added support complexity; the same thing occurred to me when I was mulling over how I would implement the changes myself.

I did think about a configuration option that allows overriding the entire qemu.conf, but thought it might be too heavy-handed an approach to suggest. It is certainly the most flexible, because a user could even set raw.qemu.config to an empty string and set up all devices using CLI flags via raw.qemu if they are more familiar with configuring VMs that way.

I am alright with the ability to override the entire qemu.conf with a configuration option provided that it can support some form of variables or placeholders that LXD replaces with the correct path for block devices.
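
To make the placeholder idea concrete, here is a minimal sketch in Python using `string.Template`. The placeholder names (`rootfs_path`, `nvram_path`), the section contents and the paths are all hypothetical. LXD does not define them, and this is not how LXD renders qemu.conf today.

```python
from string import Template

# Hypothetical user-supplied override. ${rootfs_path} and ${nvram_path} are
# illustrative placeholder names, not variables that LXD actually provides.
RAW_QEMU_CONFIG = Template('''\
[drive "rootfs"]
file = "${rootfs_path}"
format = "raw"
if = "none"

[drive "nvram"]
file = "${nvram_path}"
format = "raw"
if = "pflash"
''')


def render_override(template, instance_paths):
    """Fill in the placeholders; raises KeyError if one has no value."""
    return template.substitute(instance_paths)


# Made-up paths for an instance called vm1.
paths = {
    "rootfs_path": "/dev/example-pool/vm1.block",
    "nvram_path": "/var/lib/example/vm1/qemu.nvram",
}
rendered = render_override(RAW_QEMU_CONFIG, paths)

# After a rename, only the values change; the stored override stays the same.
renamed = render_override(
    RAW_QEMU_CONFIG,
    {key: value.replace("vm1", "vm2") for key, value in paths.items()},
)
print(rendered)
```

Because the stored option would contain only placeholders, renaming or moving the instance would not invalidate it; the paths are resolved each time the config is rendered.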

amcduffee commented 2 years ago

@stgraber So, I decided to take a look at the qemu.conf generation logic in lxd/instance/drivers/driver_qemu.go, and it made me realize that a configuration option that overrides the entire qemu.conf is heavy-handed and loses quite a lot of useful capabilities. One simple example is being able to change limits.cpu or limits.memory via lxc config ... commands.

However, more significant issues will arise if an instance is renamed, copied or moved between storage pools and/or LXD servers, because hard-coded paths (e.g. NVRAM file, monitor socket, etc.) in the raw.qemu.config setting won't adjust. Of course, these shortcomings make sense if the intent is for the option to be a static override.

I now have mixed feelings about a full raw.qemu.config override, and really hope there might be an acceptable compromise somewhere in between. Maybe, instead of overriding the entire qemu.conf, a more generic mechanism that allows overriding particular device templates (e.g. qemuGPU or qemuDrive) could work? Any other ideas?

I am willing to attempt this modification myself, but would like to do so in a way that doesn't lose all of the benefits of the generated qemu.conf. In particular, the concerns I mentioned above about copying and renaming are specific behaviors that I want to maintain so I can still use LXD to deploy and create multiple instances. Any guidance?
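
As a rough illustration of the per-template override idea, the sketch below keeps a set of default device templates and lets an instance replace only the ones it names. The names qemuGPU and qemuDrive are the ones mentioned above; the template bodies, the placeholder names and the `render_device` helper are simplified assumptions of mine, not LXD's actual templates or API.

```python
from string import Template

# Simplified stand-ins for built-in device templates (assumed contents).
DEFAULT_TEMPLATES = {
    "qemuGPU": '[device "qemu_gpu"]\ndriver = "virtio-vga"\nbus = "${bus}"\n',
    "qemuDrive": '[device "dev-${name}"]\ndriver = "scsi-hd"\ndrive = "${name}"\n',
}


def render_device(kind, values, overrides=None):
    """Render one device section, preferring a user override for that kind."""
    overrides = overrides or {}
    source = overrides.get(kind, DEFAULT_TEMPLATES[kind])
    return Template(source).substitute(values)


# Hypothetical per-instance override: swap only the GPU template.
user_overrides = {
    "qemuGPU": '[device "qemu_gpu"]\ndriver = "qxl-vga"\nbus = "${bus}"\n',
}

gpu = render_device("qemuGPU", {"bus": "pcie.0"}, user_overrides)
drive = render_device("qemuDrive", {"name": "root"}, user_overrides)
print(gpu + "\n" + drive)
```

In this sketch the drive section still comes from the default template, so values filled in at render time (names, paths) would keep following the instance through copies and renames.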

stgraber commented 2 years ago

The other thing to keep in mind is that we're slowly moving away from qemu.conf in favor of assembling the VM config through internal QMP calls. That's because QEMU upstream is slowly deprecating the use of, and support for, qemu.conf.
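
For readers unfamiliar with QMP: instead of an INI section per device, each device is added by sending a JSON command such as `device_add` over the monitor socket. The sketch below only builds such command strings in Python; the `qmp_device_add` helper and the device IDs are made up for illustration, it does not connect to a socket, and it does not reflect how LXD itself issues these calls.

```python
import json


def qmp_device_add(driver, device_id, **props):
    """Build a QMP device_add command as a JSON string (nothing is sent)."""
    arguments = {"driver": driver, "id": device_id, **props}
    return json.dumps({"execute": "device_add", "arguments": arguments})


# usb-kbd and usb-tablet are the devices from the original request. A USB
# controller would have to exist in the VM first; that step is omitted here.
commands = [
    qmp_device_add("usb-kbd", "kbd0"),
    qmp_device_add("usb-tablet", "tablet0"),
]
for command in commands:
    print(command)
```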

amcduffee commented 2 years ago

Yeah, I read about that in other discussion threads. I also noticed a number of QMP calls throughout the code.

I am not really sure what to do about this, because the only two things preventing me from using LXD exclusively are the inability to change the primary GPU driver and the storage driver. It may be my own use-case bias, but I honestly feel those are the two most significant "problem" devices when getting more specialized VM configurations to run, so it seems worthwhile to make them more configurable and allow LXD to host more guest OSes.

Is there a way to make raw.qemu.config a partial override, only for certain sections? Can it be easily templated to address the hard-coded paths issue?
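
One way to picture a section-level partial override is as a merge over the generated INI text: sections named in the override replace the generated ones, listed sections are removed, and everything else is left alone. The following is a minimal sketch using Python's `configparser`; the generated snippet is a heavily simplified assumption, and `apply_partial_override` is a hypothetical helper, not an LXD feature.

```python
import configparser
import io


def parse(text):
    # Disable interpolation and keep option names as written, so values
    # (including their quotes) pass through unchanged.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str
    parser.read_string(text)
    return parser


def apply_partial_override(generated, override, remove=()):
    """Replace whole sections named in `override`, drop those in `remove`,
    and keep every other generated section. Replaced sections move to the
    end of the output, and duplicate section names are not supported."""
    base, patch = parse(generated), parse(override)
    for name in remove:
        base.remove_section(name)
    for name in patch.sections():
        base.remove_section(name)
        base.add_section(name)
        for key, value in patch.items(name):
            base.set(name, key, value)
    out = io.StringIO()
    base.write(out)
    return out.getvalue()


# Simplified stand-in for what LXD generates (assumed contents).
GENERATED = '''
[machine]
type = "q35"

[device "qemu_pcie8"]
driver = "pcie-root-port"

[device "qemu_gpu"]
driver = "virtio-vga"
'''

# Hypothetical user override touching only the GPU section.
OVERRIDE = '''
[device "qemu_gpu"]
driver = "qxl-vga"
'''

merged = apply_partial_override(GENERATED, OVERRIDE, remove=['device "qemu_pcie8"'])
print(merged)
```

Sections that are not overridden would still be generated as usual, so settings such as limits.cpu or instance-specific paths outside the overridden sections would keep working.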