hamburger-software / ansible-role-vmware_ubuntu_cloud_image

Ansible role for creating virtual machines based on the Ubuntu Cloud Image in a vSphere environment.
https://galaxy.ansible.com/hamburger_software/vmware_ubuntu_cloud_image
MIT License

Direct to ESXi host, no vCenter #6

beano38 commented 2 years ago

Have you tried running this playbook directly against an ESXi host, without vCenter? Everything works except that the user-data (cloud-config) does not modify the guest. Here are my Ansible version and my modified play:

ansible 2.10.16
  config file = /mnt/c/Users/Ben/Home/ansible.cfg
  configured module search path = ['/home/ben/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/ben/.local/lib/python3.6/site-packages/ansible
  executable location = /home/ben/.local/bin/ansible
  python version = 3.6.9 (default, Jan 26 2021, 15:33:00) [GCC 8.4.0]

- name: Deploy a Ubuntu Cloud Image Virtual Appliance
  hosts: esxi01
  gather_facts: no

  roles:
    - role: ubuntu
      vars:
        vcenter_hostname: redacted
        vcenter_username: redacted
        vcenter_password: redacted
        vcenter_validate_certs: no
        vmware_datacenter: ha-datacenter
        vmware_datastore: Local SAS RAID6
        # vmware_folder: your-datacenter/vm/some-folder
        ova_file: ubuntu-20.04-server-cloudimg-amd64.ova
        hardware:
          num_cpus: 4
          memory_mb: 4096
        annotation: 'sample VM based on Ubuntu Cloud Image'
        # this avoids excessive syslog messages from multipathd under Ubuntu 20.04
        advanced_settings:
          - key: disk.EnableUUID
            value: 'TRUE'
        customvalues:
          - key: 'yourkey'
            value: 'yourvalue'
        disk:
          - size_gb: 32
            datastore: Local SAS RAID6
            scsi_controller: 0
            unit_number: 0
        static_ip:
          netmask: 24
          gateway: 192.168.1.1
          dns_servers: [8.8.8.8, 4.4.4.4]
          dns_search:
          - home.net
        password: passw0rd

I did modify the main.yml task a bit to use my network; otherwise I wanted to leave everything alone as much as possible:

- name: deploy OVA file
  tags: deploy-ova
  vmware_deploy_ovf:
    hostname: "{{ vcenter_hostname | default(omit) }}"
    username: "{{ vcenter_username | default(omit) }}"
    password: "{{ vcenter_password | default(omit) }}"
    validate_certs: "{{ vcenter_validate_certs | default(omit) }}"
    datacenter: "{{ vmware_datacenter }}"
    datastore: "{{ vmware_datastore }}"
    folder: "{{ vmware_folder | default(omit) }}"
    resource_pool: "{{ vmware_resource_pool | default(omit) }}"
    networks: 
      VM Network: Internet
    allow_duplicates: no
    ova: "{{ ova_file }}"
    name: "{{ vm_guestname }}"
    properties:
      hostname: "{{ vm_hostname }}"
      user-data: "{{ lookup('template', 'user-data.j2') | b64encode }}"
    power_on: no
  delegate_to: localhost

I even took out the user-data template lookup, base64-encoded a known-good file of my own, and used that encoded string instead.

Here is the Ansible output:

PLAY [Deploy a Ubuntu Cloud Image Virtual Appliance] *******************************************************************

TASK [ubuntu : deploy OVA file] ****************************************************************************************
[WARNING]: Problem validating OVF import spec: Line 107: Unable to parse 'enableMPTSupport' for attribute 'key' on
element 'Config'.
changed: [esxi01 -> localhost]

TASK [ubuntu : configure VM] *******************************************************************************************
[WARNING]: Currently connected to ESXi. customvalues are a vCenter feature, this parameter will be ignored.
changed: [esxi01 -> localhost]

TASK [ubuntu : configure disks] ****************************************************************************************
changed: [esxi01 -> localhost]

TASK [ubuntu : start VM] ***********************************************************************************************
fatal: [esxi01 -> localhost]: FAILED! => {"changed": false, "msg": "Waiting for IP address timed out"}

PLAY RECAP *************************************************************************************************************
esxi01                     : ok=3    changed=3    unreachable=0    failed=1    skipped=1    rescued=0    ignored=0

This works if I import the OVA, modify it to my needs, take the encoded string, and manually edit the VM -> VM Options -> Advanced -> Configuration Parameters, adding two key/value pairs: guestinfo.userdata (the base64-encoded file) and guestinfo.userdata.encoding (base64).
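
For reference, that manual edit boils down to two extra-config keys on the VM. Here is a hedged sketch of the same change done through community.vmware.vmware_guest's advanced_settings (connection variables are the ones from the play above; the user-data file name is a placeholder):

- name: inject user-data via guestinfo keys
  community.vmware.vmware_guest:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: no
    name: "{{ vm_guestname }}"
    advanced_settings:
      - key: guestinfo.userdata
        value: "{{ lookup('file', 'my-working-user-data.yml') | b64encode }}"  # placeholder file name
      - key: guestinfo.userdata.encoding
        value: base64
    state: present
  delegate_to: localhost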

I notice that your play essentially just fills in the 'user-data' property of the OVF.

albers commented 2 years ago

@beano38 I never tried to run this role against an ESXi instance. I will take a closer look at this.

By the way, you can specify your network with a custom network mapping such as

vmware_networks: {"VM Network": "Internet"}
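
This goes into the role vars of the play above, so the role's main.yml does not need to be edited at all. A minimal sketch showing only the added variable:

roles:
  - role: ubuntu
    vars:
      vmware_networks: {"VM Network": "Internet"}
      # ...all other vars as in the play above
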
beano38 commented 2 years ago

Thanks, I was able to get the networks working. I was also able to modify the VM in the second step with custom userdata and metadata. The metadata lets you modify the netplan config (essentially 50-cloud-init.yaml), and it is applied during boot of the server, so you do not need the second play that sets the static IP.

Tested with ESXi and the Focal (20.04) cloud image. Here's the updated play:

- name: configure VM
  community.vmware.vmware_guest:
    hostname: "{{ ansible_host }}"
    username: "{{ ansible_user }}"
    password: "{{ ansible_password }}"
    validate_certs: no
    name: "{{ vm_guestname }}"
    datacenter: "{{ datacenter }}"
    annotation: "{{ annotation | default(omit) }}"
    hardware: "{{ hardware }}"
    advanced_settings:
      - key: guestinfo.userdata
        value: "{{ lookup('template', 'userdata.j2') | b64encode }}"
      - key: guestinfo.userdata.encoding
        value: base64
      - key: guestinfo.metadata
        value: "{{ lookup('template', 'metadata.j2') | b64encode }}"
      - key: guestinfo.metadata.encoding
        value: base64
    state: present
  delegate_to: localhost

Here are my userdata.j2 and metadata.j2 templates for reference. I've added the host private and public keys so you can "trust" the server as soon as it is online. I had planned to generate the private/public key pairs dynamically each time, set them as facts for use here, and also add them to the Ansible controller's (and other workstations') .ssh/known_hosts, but I haven't gotten around to it yet (a rough sketch of that idea follows the templates below).

userdata.j2:

#cloud-config

timezone: {{ timezone | default('UTC') }}

# Allow users to login with a password
ssh_pwauth: true

# Update cache and all installed packages
package_update: true
package_upgrade: true
package_reboot_if_required: false
packages:
  - tree
  - python3-pip

users:
{% for user in users %}
  - name: {{ user.name }}
    groups: {{ user.groups }}
    gecos: {{ user.full_name }}
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: {{ user.shell | default('/bin/bash') }}
    passwd: {{ user.password | password_hash('sha512') }} 
    lock_passwd: false
{% if user.ssh_authorized_keys is defined %}
    ssh_authorized_keys:
{% for key in user.ssh_authorized_keys %}
      - {{ key }}
{% endfor %}
{% endif %}
{% endfor %}

# Deploy defined SSH keys on server
{% if ssh_keys is defined %}
ssh_deletekeys: false
ssh_keys:
{% if ssh_keys.rsa_private is defined %}
  rsa_private: |
    {{ ssh_keys.rsa_private | indent(width=4) -}}
{% endif %}
{% if ssh_keys.rsa_public is defined %}
  rsa_public: {{ ssh_keys.rsa_public }}
{% endif %}
{% if ssh_keys.ecdsa_private is defined %}
  ecdsa_private: |
    {{ ssh_keys.ecdsa_private | indent(width=4) -}}
{% endif %}
{% if ssh_keys.ecdsa_public is defined %}
  ecdsa_public: {{ ssh_keys.ecdsa_public }}
{% endif %}
{% if ssh_keys.dsa_private is defined %}
  dsa_private: |
    {{ ssh_keys.dsa_private | indent(width=4) }}
{% endif %}
{% if ssh_keys.dsa_public is defined %}
  dsa_public: {{ ssh_keys.dsa_public }}
{% endif %}
{% if ssh_keys.ed25519_private is defined %}
  ed25519_private: |
    {{ ssh_keys.ed25519_private | indent(width=4) }}
{% endif %}
{% if ssh_keys.ed25519_public is defined %}
  ed25519_public: {{ ssh_keys.ed25519_public }}
{% endif %}
{% endif %}

metadata.j2:

instance-id: {{ vm_guestname }}
local-hostname: {{ vm_guestname }}

network:
  version: 2
  ethernets:
    ens192:
      addresses: 
        - {{ vm_guest_ipv4 }}/{{ cidr | default('24') }}
      gateway4: {{ gateway_v4 }}
      dhcp4: false
      nameservers:
        addresses:
{% for name_server in name_servers %}
          - {{ name_server }}
{% endfor %}
        search:
{% for search_domain in search_domains %}
          - {{ search_domain }}
{% endfor %}
      dhcp6: true
      optional: true
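
As a rough sketch of the key-generation idea mentioned above (not implemented in this thread), the host keys could be created on the controller with community.crypto.openssh_keypair, exposed as the ssh_keys variable consumed by userdata.j2, and trusted in the controller's known_hosts. The /tmp path and the choice of ed25519 are assumptions:

- name: generate an ed25519 host key pair for the new VM
  community.crypto.openssh_keypair:
    path: "/tmp/{{ vm_guestname }}_ed25519"  # placeholder location
    type: ed25519
  register: vm_host_key
  delegate_to: localhost

- name: expose the keys to the cloud-init templates
  ansible.builtin.set_fact:
    ssh_keys:
      ed25519_private: "{{ lookup('file', vm_host_key.filename) }}"
      ed25519_public: "{{ vm_host_key.public_key }}"

- name: trust the new host key in the controller's known_hosts
  ansible.builtin.known_hosts:
    name: "{{ vm_guest_ipv4 }}"
    key: "{{ vm_guest_ipv4 }} {{ vm_host_key.public_key }}"
  delegate_to: localhost
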
albers commented 2 years ago

The Ubuntu Cloud images ship with a special cloud-init datasource:

/run/cloud-init/cloud.cfg
   datasource_list: [ OVF, None ]

The OVF datasource checks for a CD-ROM containing the file ovf-env.xml. If it exists, it reads the file and, based on its content, performs configuration during the first system startup; see this excerpt from a cloud-init log:

2018-06-05 12:54:37,559 - util.py[DEBUG]: Running command ['mount', '-o', 'ro,sync', '-t', 'iso9660', '/dev/sr0', '/run/cloud-init/tmp/tmp1wjdeptz'] with allowed return codes [0] (shell=False, capture=True)
2018-06-05 12:54:37,582 - util.py[DEBUG]: Reading from /run/cloud-init/tmp/tmp1wjdeptz/ovf-env.xml (quiet=False)
2018-06-05 12:54:37,584 - util.py[DEBUG]: Read 1466 bytes from /run/cloud-init/tmp/tmp1wjdeptz/ovf-env.xml
2018-06-05 12:54:37,584 - util.py[DEBUG]: Running command ['umount', '/run/cloud-init/tmp/tmp1wjdeptz'] with allowed return codes [0] (shell=False, capture=True)
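
On a booted guest you can confirm which datasource cloud-init actually used; a minimal sketch, run against the guest itself and assuming cloud-init 18.4 or newer (as shipped with the 20.04 image):

- name: query the datasource platform detected by cloud-init
  ansible.builtin.command: cloud-init query platform
  register: ci_platform
  changed_when: false

- name: print the detected platform (expected to be "ovf" for this import path)
  ansible.builtin.debug:
    var: ci_platform.stdout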

The VM's CD-ROM is automatically connected to an ISO image file every time the VM boots. This ISO image contains an ovf-env.xml file with the configuration parameters originally defined in the OVA's *.ovf descriptor:

$ tar tf ubuntu-20.04-server-cloudimg-amd64.ova
ubuntu-focal-20.04-cloudimg.ovf
ubuntu-focal-20.04-cloudimg.mf
ubuntu-focal-20.04-cloudimg.vmdk

**This magic relies on the template customization mechanism, which recognizes the parameters defined in the .ovf descriptor, prompts for them, and creates and connects the ISO image containing the ovf-env.xml.**

I tried a manual template import of an Ubuntu 20.04 cloud image OVA on vCenter 7.0.3 and got a "Customize template" step prompting for the template parameters.

When I try the same directly against an ESXi 7.0.3 host, there is no such configuration step. The image cannot be customized during import.

My assumption is that the missing configuration step is not just a limitation of the import wizard but also applies to the backing APIs.

Edit: The difference in behaviour is due to vCenter storing the customizations in its database; see here:

vCenter Server saves the customized configuration parameters in the vCenter Server database.

and here:

If you are running ovftool on an ESXi host, you must “inject” the parameters into the resulting VM when it is powered on. This is because the ESXi host lacks a cache to store the OVF parameters, as in vCenter Server.
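
For completeness, here is a hedged sketch of that ovftool-based injection when deploying straight to an ESXi host, wrapped in an Ansible task: the experimental --X:injectOvfEnv option, together with --powerOn, pushes the OVF environment into the VM at power-on. Host, credentials and property values reuse variables from the plays above and are placeholders:

- name: deploy OVA directly to ESXi and inject the OVF environment
  ansible.builtin.command: >
    ovftool --acceptAllEulas --powerOn --X:injectOvfEnv
    --name={{ vm_guestname }}
    --datastore="{{ vmware_datastore }}"
    --prop:hostname={{ vm_hostname }}
    --prop:user-data={{ lookup('template', 'user-data.j2') | b64encode }}
    {{ ova_file }}
    "vi://root:{{ ansible_password }}@{{ ansible_host }}/"
  delegate_to: localhost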

albers commented 2 years ago

I was also able to modify the VM in the second step with custom userdata and metadata. The metadata lets you modify the netplan config (essentially 50-cloud-init.yaml), and it is applied during boot of the server, so you do not need the second play that sets the static IP.

@beano38 This is great work that clearly exceeds my rudimentary cloud-init expertise. Would you consider contributing a PR for this? Perhaps you could start with a slimmed-down feature set first.

ArielLahiany commented 2 years ago

@beano38 Thank you very much. I was able to inject the user-data file into the machine, but for some reason the meta-data file is not injected. May I ask how you did that?