stackhpc / ansible-role-libvirt-vm

This role configures and creates VMs on a KVM hypervisor.

Domain xml template doesn't include guest agent channel #17


JPvRiel commented 5 years ago

Suggestion: Include guest channel when appropriate?

Noticed from some testing that vm.xml.j2 differs from a virt-install --os-variant rhel7 install in that the template does not include a guest agent channel, whereas virt-install creates a domain definition which effectively has:

<channel type='unix'>
  <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
</channel>

In addition, some virtio serial controller and channel options are not set.

Full example for virt-install command:

virt-install --name test --memory 2048 --vcpus 4 --disk size=512 --network none --accelerate --location /var/lib/libvirt/boot/rhel7.6.iso --os-variant rhel7 --nographics --extra-args "ks=cdrom:/ks.cfg console=tty0 console=ttyS0,115200n8"

Tested on Red Hat 7.6.

markgoddard commented 5 years ago

Would be happy to see a PR for this if it helps you. I've not knowingly used guest agents before; is that something we should always enable or should it be optional?

JPvRiel commented 5 years ago

is that something we should always enable or should it be optional?

If enabled by default, the VM guest should work fine with the channel added but not used (i.e. the agent not installed), so it would be fine to default.
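
As a rough sketch, the template could gate it behind a default-on flag, something like this in vm.xml.j2 (the variable name here is made up, not an existing role option):

{# hypothetical default-on toggle for the guest agent channel #}
{% if enable_guest_agent | default(true) %}
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>
{% endif %}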

As per Red Hat docs:

The QEMU guest agent runs inside the guest and allows the host machine to issue commands to the guest operating system using libvirt, helping with functions such as freezing and thawing filesystems.

I believe it can also let libvirt gracefully shut down guests instead of suspending them between host reboots, etc.

By the way, I'll expand on other differences I noticed.

To default power management features or not?

Also not sure about this one, but I noticed ACPI features weren't included in the default template, e.g.:

  <features>
    <acpi/>
    <apic/>
  </features>

Would need to dig into the source code/docs more to find out whether libvirt manages guests via the agent or via ACPI to get them shut down via libvirt.
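
For reference, virsh can be told which mechanism to use, so the two paths are easy to compare (using the test domain from the virt-install example above):

# shut down via ACPI - no agent needed, the guest just has to handle the ACPI power button event
virsh shutdown test --mode acpi
# shut down via the QEMU guest agent - needs the channel defined and the agent installed in the guest
virsh shutdown test --mode agent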

Clock optimisations

I think the default template may be missing some clock optimizations.

The <clock offset> option is harder to default given the different behaviour of Linux vs Windows guests, but suffice to say, virt-install did this for a Linux RHEL7 guest:

  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>

Allow using own XML template?

Regardless, I think it might be good to allow the play calling the role to override the default XML template the role supplies to the virt module (or maybe I missed how to do so). I'll try to put together a PR to provide an option for this, to allow people to tweak or provide their own XML.

I find virt-install does quite a good job of an optimal setup given the right --os-variant flag, but supplying XML directly requires the user to do that tuning in a more custom way.

Would be painful, but could try to see how virt-install creates/uses different templates depending on OS family and variant.

markgoddard commented 5 years ago

is that something we should always enable or should it be optional?

If enabled by default, the VM guest should work fine with the channel added but not used (i.e. the agent not installed), so it would be fine to default.

Sounds reasonable.

As per Red Hat docs:

The QEMU guest agent runs inside the guest and allows the host machine to issue commands to the guest operating system using libvirt, helping with functions such as freezing and thawing filesystems.

I believe it can also let libvirt gracefully shut down guests instead of suspending them between host reboots, etc.

By the way, I'll expand on other differences I noticed.

To default power management features or not?

Also not sure about this one, but I noticed ACPI features weren't included in the default template, e.g.:

  <features>
    <acpi/>
    <apic/>
  </features>

Would need to dig into the source code/docs more to find out whether libvirt manages guests via the agent or via ACPI to get them shut down via libvirt.

ACPI seems like a reasonable request.

Clock optimisations

I think the default template may be missing some clock optimizations.

The <clock offset> option is harder to default given the different behaviour of Linux vs Windows guests, but suffice to say, virt-install did this for a Linux RHEL7 guest:

  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>

Just looking at what OpenStack nova does here, it only adds timer config for KVM. It does add those timers, with the same config, with an optional HPET.

Allow using own XML template?

Regardless, I think it might be good to allow the play calling the role to override the default XML template the role supplies to the virt module (or maybe I missed how to do so). I'll try to put together a PR to provide an option for this, to allow people to tweak or provide their own XML.

I find virt-install does quite a good job of an optimal setup given the right --os-variant flag, but supplying XML directly requires the user to do that tuning in a more custom way.

Would be painful, but could try to see how virt-install creates/uses different templates depending on OS family and variant.

I like this idea. Trying to support every libvirt feature is never going to work, so a bring your own template approach would make this role very flexible.

JPvRiel commented 5 years ago

I was hoping that simply having a custom template at ./templates/vm.xml.j2 in the playbook directory might supersede the template supplied with the role, but TL;DR it seems not, so I'm adding an explicit option to supply one's own template.

The Ansible docs aren't that clear to me about how this behaves either, even after reading "Documentation of which directories a template is searched for" and "Search paths in Ansible".

So I'm working on a PR to allow overriding the default via an xml_file option that can be passed per defined VM.
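
Roughly what I have in mind (the variable layout below is illustrative only; exact naming to be settled in the PR):

    libvirt_vms:
      - name: test
        # hypothetical new per-VM option: path to a custom domain XML template
        xml_file: "{{ playbook_dir }}/templates/my-vm.xml.j2"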

More about using guest agent and clock time settings

See KVM GUEST TIMING MANAGEMENT

When a guest is resumed after a pause or a restoration process, a command to synchronize the guest clock to a specified value should be issued by the management software (such as virt-manager). This synchronization works only if the QEMU guest agent is installed in the guest and supports the feature.

Also, "Recommended default clock/timer settings" has <timer name="kvmclock" present="yes"/> as a suggested choice, but only if KVM is enabled, otherwise not.

JPvRiel commented 5 years ago

Okay, so PR for custom xml done.

Should we enhance the current default template? Before submitting another PR, I wanted to confirm whether the following adaptations of the current default template are desirable, including an extra switch for interfaces to handle bridged mode. I think 1 to 3 are okay, but I'm unsure about 4 (clock settings).

1. Guest agent support

Add guest agent channel - as above, no risk/issue if VM guest doesn't have the agent installed. In the devices section:

    <!-- support qemu guest agent -->
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
    </controller>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0'/>
    </channel>

2. ACPI

  <features>
    <acpi/>
    <apic/>
  </features>

3. Bridged interface type/mode

As per libvirt wiki: Bridged networking (aka "shared physical device") - Guest configuration

Proposed example config:

      interfaces:
        - type: bridge
          source:
            bridge: br0

Proposed new interfaces jinja2 templating:

{% for interface in interfaces %}
{% if interface.type is not defined or interface.type == 'network' %}
    <interface type='network'>
      <source network='{{ interface.network }}'/>
{% elif interface.type == 'direct' %}
    <interface type='direct'>
      <source dev='{{ interface.source.dev }}' mode='{{ interface.source.mode | default('vepa') }}'/>
{% elif interface.type == 'bridge' %}
    <interface type='bridge'>
      <source bridge='{{ interface.source.bridge }}'/>
{% endif %}
      <model type='virtio'/>
    </interface>
{% endfor %}

And builds this:

    <interface type='bridge'>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>

FYI, the above isn't essential. Technically, it's also possible for users to externally define and configure bridged interfaces plus a libvirt network in bridged mode, such that the existing network-type config targeting that libvirt network works, e.g.:

This for domain xml:

    <interface type='network'>
      <mac address='52:54:00:ba:2c:fb'/>
      <source network='uat'/>
      <model type='virtio'/>
    </interface>

And this for network xml:

<network>
  <name>uat</name>
  <forward mode='bridge'/>
  <bridge name='br0' macTableManager='libvirt'/>
</network>

However, it doesn't seem like the stackhpc.libvirt-host role supports setting up a bridge-type libvirt network yet, so the above way of setting the type as bridge on the guest helps.

4. clock (but unsure):

Linux guests

Assuming linux VMs, this is how virt-install does it:

  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>

Adding <timer name="kvmclock" present="yes"/> will only work for fairly recent versions of enterprise linux as I recall reading, but good to include by default for modern hosts and linux guests.
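
Combined with what virt-install emits, the Linux-oriented default could then look something like this (the kvmclock line only when KVM is actually in use):

  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='kvmclock' present='yes'/>
  </clock>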

Windows guests?

All of the above clock settings are aimed at Linux guests. If a safe default is wanted for Windows, I'm not sure that's doable given <clock offset='utc'>.

The current default <clock sync="localtime"/> seems to get disregarded as I recall, since when I first ran the role, libvirt seemed to have replaced it with <clock offset='utc'>.

Maybe have two balanced default options and switch between the templates if guest is Linux vs Windows?

For windows guests, I've used this:

  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
  </features>
  <!-- ... -->
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
  </clock>
markgoddard commented 5 years ago

1. Guest agent support

Add guest agent channel - as above, no risk/issue if VM guest doesn't have the agent installed. In the devices section:

    <!-- support qemu guest agent -->
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
    </controller>
    <channel type='unix'>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0'/>
    </channel>

LGTM

2. ACPI

  <features>
    <acpi/>
    <apic/>
  </features>

LGTM

3. Bridged interface type/mode

As per libvirt wiki: Bridged networking (aka "shared physical device") - Guest configuration

Proposed example config:

      interfaces:
        - type: bridge
          source:
            bridge: br0

Proposed new interfaces jinja2 templating:

{% for interface in interfaces %}
{% if interface.type is not defined or interface.type == 'network' %}
    <interface type='network'>
      <source network='{{ interface.network }}'/>
{% elif interface.type == 'direct' %}
    <interface type='direct'>
      <source dev='{{ interface.source.dev }}' mode='{{ interface.source.mode | default('vepa') }}'/>
{% elif interface.type == 'bridge' %}
    <interface type='bridge'>
      <source bridge='{{ interface.source.bridge }}'/>
{% endif %}
      <model type='virtio'/>
    </interface>
{% endfor %}

And builds this:

    <interface type='bridge'>
      <source bridge='br0'/>
      <model type='virtio'/>
    </interface>

FYI, the above isn't essential. Technically, it's also possible for users to externally define and configure bridged interfaces plus a libvirt network in bridged mode, such that the existing network-type config targeting that libvirt network works, e.g.:

This for domain xml:

    <interface type='network'>
      <mac address='52:54:00:ba:2c:fb'/>
      <source network='uat'/>
      <model type='virtio'/>
    </interface>

And this for network xml:

<network>
  <name>uat</name>
  <forward mode='bridge'/>
  <bridge name='br0' macTableManager='libvirt'/>
</network>

However, it doesn't seem like the stackhpc.libvirt-host role supports setting up a bridge-type libvirt network yet, so the above way of setting the type as bridge on the guest helps.

We typically use the latter approach you've described here, with libvirt networks in the bridge forwarding mode. The network config required for stackhpc.libvirt-host is like this:

      libvirt_host_networks:
        - name: br-example
          mode: bridge
          bridge: br-example

If using guest bridged networking is useful to you, I'd be happy to accept support for it.

4. clock (but unsure):

Linux guests

Assuming linux VMs, this is how virt-install does it:

  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>

Adding <timer name="kvmclock" present="yes"/> will only work for fairly recent versions of enterprise linux as I recall reading, but good to include by default for modern hosts and linux guests. Again, just comparing with OpenStack nova, it only adds those timers for KVM guests, not qemu. I don't know if there would be any downside to adding them for qemu, perhaps they would just not work? We know which we're using, based on libvirt_vm_engine. OpenStack doesn't enable kvmclock, but the name implies that is KVM-only also.
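
A rough, untested sketch of how the template could gate this on the engine:

{% if libvirt_vm_engine == 'kvm' %}
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
{% else %}
  <clock offset='utc'/>
{% endif %}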

Windows guests?

All of the above clock settings are aimed at Linux guests. If a safe default is wanted for Windows, I'm not sure that's doable given <clock offset='utc'>.

The current default <clock sync="localtime"/> seems to get disregarded as I recall, since when I first ran the role, libvirt seemed to have replaced it with <clock offset='utc'>.

Maybe have two balanced default options and switch between the templates if guest is Linux vs Windows?

For windows guests, I've used this:

  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
  </features>
  <!-- ... -->
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
  </clock>

We're not currently Windows users, but happy to accept anything (within reason) that improves the role for those who are. Perhaps each VM could take an optional attribute that describes the OS type (linux, windows, etc).

JPvRiel commented 5 years ago

OS type (linux, windows, etc).

We might not have to add an extra switch... Since we've added the option to specify custom XML domain definitions, perhaps a safer way is to leave just your minimal template as the default, pre-package two other templates in the templates folder (one optimised for Linux, the other for Windows, both with more optimal clock settings), and update the README to point users who care about the details to them.
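
E.g. a user who wants the tuned settings could then just point a VM at the packaged template (the filename below is a placeholder, not final):

    libvirt_vms:
      - name: win-test
        # hypothetical template shipped in the role's templates folder
        xml_file: vm-windows.xml.j2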

JPvRiel commented 5 years ago

network config required for stackhpc.libvirt-host is like this:

      libvirt_host_networks:
        - name: br-example
          mode: bridge
          bridge: br-example

Thanks, will try it - missed that somehow and did my own config outside of the role.