RedHatOfficial / ocp4-vsphere-upi-automation

Automates most of the manual steps of deploying OCP4.x cluster on vSphere
MIT License

Failed to create a virtual machine #89

Closed: nsu700 closed this issue 1 year ago

nsu700 commented 2 years ago

Hi, I hit a bug on vSphere 6.7u3. The error message is:

 msg: 'Failed to create a virtual machine : Customization of the guest operating system ''rhel7_64Guest'' is not supported in this configuration. Microsoft Vista (TM) and Linux guests with Logical Volume Manager are supported only for recent ESX host and VMware Tools versions. Refer to vCenter documentation for supported configurations.'

Below is my groups/all.yml file

helper_vm_ip: 192.168.87.180
bootstrap_ignition_url: "http://{{helper_vm_ip}}:8080/ignition/bootstrap.ign"
config:
  provider: vsphere
  base_domain: ocp.com
  cluster_name: ocp4
  fips: false
  networkType: OVNKubernetes
  isolationMode: Multitenant
  installer_ssh_key: "{{ lookup('file', '~/.ssh/helper_rsa.pub') }}"
  pull_secret: "{{ lookup('file', '~/pull-secret.yml') }}"
vcenter:
  ip: 192.168.87.140
  datastore: ssd
  network: Internal Network
  service_account_username: administrator@vsphere.local
  service_account_password: 123456
  admin_username: administrator@vsphere.local
  admin_password: 123456
  datacenter: homelab
  folder_absolute_path:
  vm_power_state: poweredon
  template_name: rhcos-vmware
  hw_version: 14
download:
  clients_url: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest
  dependencies_url: https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/latest
  govc: https://github.com/vmware/govmomi/releases/download/v0.27.4
bootstrap_vms:
  - { name: "bootstrap", macaddr: "52:54:00:60:72:67", ipaddr: "192.168.87.20", cpu: 4, ram: 16384}
master_vms:
  - { name: "master0", macaddr: "52:54:00:e7:9d:67", ipaddr: "192.168.87.21", cpu: 4, ram: 16384}
  - { name: "master1", macaddr: "52:54:00:80:16:23", ipaddr: "192.168.87.22", cpu: 4, ram: 16384}
  - { name: "master2", macaddr: "52:54:00:d5:1c:39", ipaddr: "192.168.87.23", cpu: 4, ram: 16384}
worker_vms:
  - { name: "worker0", macaddr: "52:54:00:f4:26:a1", ipaddr: "192.168.87.11", cpu: 4, ram: 16384}
  - { name: "worker1", macaddr: "52:54:00:82:90:00", ipaddr: "192.168.87.12", cpu: 4, ram: 16384}
static_ip:
  gateway: 192.168.87.1
  netmask: 255.255.255.0
  dns: "{{ helper_vm_ip }}"
  network_interface_name: ens192
network_modifications:
  enabled: true
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  serviceNetwork:
  - cidr: 172.30.0.0/16
  machineNetwork:
  - cidr: 192.168.29.0/24
proxy:
  enabled: false
  http_proxy: http://helper.ocp4.example.com:3129
  https_proxy: http://helper.ocp4.example.com:3129
  no_proxy: example.com
  cert_content: |
    -----BEGIN CERTIFICATE-----
        <certificate content>
    -----END CERTIFICATE-----
registry:
  enabled: true
  product_repo: openshift-release-dev
  product_release_name: ocp-release
  product_release_version: 4.11.1-x86_64
  username: ansible
  password: ansible
  email: user@awesome.org
  cert_content:
  host: registry.ocp4.ocp.com
  port: 5000
  repo: ocp4/openshift4
ntp:
  custom: enable
  ntp_server_list:
    - 0.rhel.pool.ntp.org
    - 1.rhel.pool.ntp.org

I found that the OVF is always deployed as hardware version 13 with a RHEL 7 guest, rather than version 14 and RHEL 8. Is that expected?

❯ govc vm.info /homelab/vm/rhcos-vmware
Name:           rhcos-vmware
  Path:         /homelab/vm/rhcos-vmware
  UUID:         42020eff-63b1-ad74-b97a-0e6022e8750f
  Guest name:   Red Hat Enterprise Linux 7 (64-bit)

I tried manually changing the OVF to RHEL 8 and version 14, but creating a new VM still fails.
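
The manual change described above might be scripted with govc roughly as follows. This is only a sketch: it assumes `govc` is installed with `GOVC_URL` and credentials already exported, that the VM is powered off, and the helper name `retarget_template` is hypothetical. As reported in this thread, changing these two properties did not by itself resolve the customization error.

```shell
#!/bin/sh
# Sketch only: assumes govc is installed and GOVC_URL / credentials are exported.
# GOVC and TEMPLATE are overridable so the commands can be dry-run with GOVC=echo.
GOVC="${GOVC:-govc}"
TEMPLATE="${TEMPLATE:-/homelab/vm/rhcos-vmware}"

# retarget_template VERSION GUEST_ID  (hypothetical helper name)
# Upgrades the virtual hardware version, then changes the guest OS identifier.
retarget_template() {
    "$GOVC" vm.upgrade -version="$1" -vm "$TEMPLATE" &&
    "$GOVC" vm.change -vm "$TEMPLATE" -g "$2"
}

# Example call (not run automatically):
#   retarget_template 15 rhel8_64Guest
```

Running `govc vm.info` on the same path afterwards is a quick way to confirm which guest name vCenter now reports.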

mallmen commented 2 years ago

I see you are installing 4.11. HW15 is required and you have HW14 defined.

https://docs.openshift.com/container-platform/4.11/installing/installing_vsphere/preparing-to-install-on-vsphere.html#installation-vsphere-infrastructure_preparing-to-install-on-vsphere

mallmen commented 2 years ago

RHCOS is based on RHEL 8, or on RHEL 7 for older versions. When running 4.11, as you are, it is based on RHEL 8.

nsu700 commented 2 years ago

I tried to install 4.10.16 with hw_version: 15, but the OVF is still deployed as version 13, and I get the same error when creating the bootstrap node. Below is the log from deploying the OVF:

TASK [static_ips_ova : Deploy the OVF template into the folder] **********************************************************************************
task path: /root/ocp4-vsphere-upi-automation/roles/static_ips_ova/tasks/main.yml:8
redirecting (type: modules) ansible.builtin.vmware_deploy_ovf to community.vmware.vmware_deploy_ovf
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c 'echo ~root && sleep 0'
<localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir "` echo /root/.ansible/tmp/ansible-tmp-1662169009.6666331-8451-249787860128366 `" && echo ansible-tmp-1662169009.6666331-8451-249787860128366="` echo /root/.ansible/tmp/ansible-tmp-1662169009.6666331-8451-249787860128366 `" ) && sleep 0'
redirecting (type: modules) ansible.builtin.vmware_deploy_ovf to community.vmware.vmware_deploy_ovf
Using module file /usr/lib/python3.8/site-packages/ansible_collections/community/vmware/plugins/modules/vmware_deploy_ovf.py
<localhost> PUT /root/.ansible/tmp/ansible-local-71492zyba8kz/tmp5cosf5bu TO /root/.ansible/tmp/ansible-tmp-1662169009.6666331-8451-249787860128366/AnsiballZ_vmware_deploy_ovf.py
<localhost> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1662169009.6666331-8451-249787860128366/ /root/.ansible/tmp/ansible-tmp-1662169009.6666331-8451-249787860128366/AnsiballZ_vmware_deploy_ovf.py && sleep 0'
<localhost> EXEC /bin/sh -c 'PATH=/root/ocp4-vsphere-upi-automation/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin GOVC_USERNAME=administrator@vsphere.local GOVC_PASSWORD='"'"'<redacted>'"'"' GOVC_URL=https://192.168.87.140 GOVC_DATACENTER=homelab GOVC_INSECURE=1 /usr/libexec/platform-python /root/.ansible/tmp/ansible-tmp-1662169009.6666331-8451-249787860128366/AnsiballZ_vmware_deploy_ovf.py && sleep 0'
<localhost> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1662169009.6666331-8451-249787860128366/ > /dev/null 2>&1 && sleep 0'
changed: [localhost] => changed=true
  instance:
    advanced_settings:
      hpet0.present: 'TRUE'
      migrate.hostLog: rhcos-vmware-3f6dac3b.hlog
      migrate.hostLogState: none
      migrate.migrationId: '0'
      nvram: rhcos-vmware.nvram
      pciBridge0.present: 'TRUE'
      pciBridge4.functions: '8'
      pciBridge4.present: 'TRUE'
      pciBridge4.virtualDev: pcieRootPort
      pciBridge5.functions: '8'
      pciBridge5.present: 'TRUE'
      pciBridge5.virtualDev: pcieRootPort
      pciBridge6.functions: '8'
      pciBridge6.present: 'TRUE'
      pciBridge6.virtualDev: pcieRootPort
      pciBridge7.functions: '8'
      pciBridge7.present: 'TRUE'
      pciBridge7.virtualDev: pcieRootPort
      svga.present: 'TRUE'
      vmware.tools.internalversion: '0'
      vmware.tools.requiredversion: '11333'
    annotation: ''
    current_snapshot: null
    customvalues: {}
    guest_consolidation_needed: false
    guest_question: null
    guest_tools_status: guestToolsNotRunning
    guest_tools_version: '0'
    hw_cluster: null
    hw_cores_per_socket: 1
    hw_datastores:
    - ssd
    hw_esxi_host: 192.168.31.141
    hw_eth0:
      addresstype: assigned
      ipaddresses: null
      label: Network adapter 1
      macaddress: 00:50:56:82:d8:0e
      macaddress_dash: 00-50-56-82-d8-0e
      portgroup_key: null
      portgroup_portkey: null
      summary: Internal Network
    hw_files:
    - '[ssd] rhcos-vmware/rhcos-vmware.vmx'
    - '[ssd] rhcos-vmware/rhcos-vmware.vmsd'
    - '[ssd] rhcos-vmware/rhcos-vmware.vmdk'
    hw_folder: /homelab/vm/ocp4-6h6pz
    hw_guest_full_name: null
    hw_guest_ha_state: null
    hw_guest_id: null
    hw_interfaces:
    - eth0
    hw_is_template: false
    hw_memtotal_mb: 4096
    hw_name: rhcos-vmware
    hw_power_status: poweredOff
    hw_processor_count: 2
    hw_product_uuid: 4202d199-af60-a4fc-2be8-549cdf2597a9
    hw_version: vmx-13
    instance_uuid: 5002a13c-05f0-3cf8-0f08-8fb2bd885a96
    ipv4: null
    ipv6: null
    module_hw: true
    moid: vm-4444
    snapshots: []
    tpm_info:
      provider_id: null
      tpm_present: false
    vimref: vim.VirtualMachine:vm-4444
    vnc: {}
  invocation:
    module_args:
      allow_duplicates: false
      cluster: null
      datacenter: homelab
      datastore: ssd
      deployment_option: null
      disk_provisioning: thin
      esxi_hostname: null
      fail_on_spec_warnings: false
      folder: /homelab/vm/ocp4-6h6pz
      hostname: 192.168.87.140
      inject_ovf_env: false
      name: rhcos-vmware
      networks:
        VM Network: Internal Network
      ova: /root/ocp4-vsphere-upi-automation/downloads/rhcos-vmware.ova
      ovf: /root/ocp4-vsphere-upi-automation/downloads/rhcos-vmware.ova
      password: VALUE_SPECIFIED_IN_NO_LOG_PARAMETER
      port: 443
      power_on: false
      properties: null
      proxy_host: null
      proxy_port: null
      resource_pool: Resources
      username: administrator@vsphere.local
      validate_certs: false
      wait: true
      wait_for_ip_address: false

ddreggors commented 2 years ago

Same here. Using the 4.10.16 OVA, the vmx specifies rhel7_64Guest and hardware version 13. What's worse, using customvalues to set the ignition fails badly: VMware reports that it is not supported in this configuration without guest tools installed.

"Customization of the guest operating system 'rhel7_64Guest' is not supported in this configuration. Microsoft Vista (TM) and Linux guests with Logical Volume Manager are supported only for recent ESX host and VMware Tools versions. Refer to vCenter documentation for supported configurations."

ddreggors commented 2 years ago

Sorry, this is a bit conflated with a secondary issue we found. The customvalues do not cause the error I mentioned; the issue there is that they are silently ignored and never set, with no error reported.
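
Because the customvalues are dropped without an error, it may help to check whether a key actually landed on the VM. The sketch below is one way to do that with `govc vm.info -e`, which prints the VM's ExtraConfig. The helper name `has_extra_config` and the VM path in the example are hypothetical, and the key name is an assumption based on the Ignition `guestinfo` properties used for OpenShift on vSphere.

```shell
#!/bin/sh
# Sketch only: assumes govc is installed and GOVC_URL / credentials are exported.
GOVC="${GOVC:-govc}"

# has_extra_config VM_PATH KEY  (hypothetical helper name)
# Succeeds if KEY appears in the ExtraConfig section printed by `govc vm.info -e`.
has_extra_config() {
    "$GOVC" vm.info -e "$1" | grep -q -F "$2"
}

# Example call (path and key are assumptions, not taken from this thread):
#   has_extra_config /homelab/vm/ocp4/bootstrap guestinfo.ignition.config.data \
#       || echo "ignition data was not set on the VM"
```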

The error I mentioned was caused by using the following keys/values in my clone task:

networks:
    - name: "{{ vcenter.network }}"
      mac: "{{ item.macaddr | default(omit) }}"

These variables are defined and valid, but the OVA does not have guest tools, so setting these network configuration parameters directly is not supported.

ddreggors commented 2 years ago

To be clear, the OVA used does have guest tools installed; however, they are not recognized in the ignition phase, and this seems to be the reason for this error. We bypassed the issue by removing the network keys from the vmware_guest task and creating a new task that uses command/govc to set them.

ddreggors commented 2 years ago

Example:

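# Attach a vmxnet3 NIC; the network name is quoted because it may contain spaces
- name: Add vmxnet3 Network to bootstrap
  command: "govc vm.network.add -vm '{{ vcenter.folder_absolute_path }}/{{ item.name }}' -net '{{ vcenter.network }}' -net.adapter=vmxnet3"
  loop: "{{ bootstrap_vms }}"

# omit only works as a whole parameter value, not inside a command string,
# so the task is skipped when no MAC address is defined
- name: Change network on bootstrap
  command: "govc vm.network.change -vm '{{ vcenter.folder_absolute_path }}/{{ item.name }}' -net.address {{ item.macaddr }} -net='{{ vcenter.network }}' ethernet-0"
  when: item.macaddr is defined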
  loop: "{{ bootstrap_vms }}"

mallmen commented 1 year ago

This issue is due to differences in the Ansible version being used. Follow the README for the version requirement, and this should no longer be an issue.
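
To compare the installed Ansible version against the README, something like the following could be used. It is a sketch under assumptions: the helper name `ansible_version` is hypothetical, and the version numbers in the comments only illustrate the two output formats, not the version the README requires.

```shell
#!/bin/sh
# Sketch only: ANSIBLE is overridable so the parsing can be tried without Ansible installed.
ANSIBLE="${ANSIBLE:-ansible}"

# ansible_version  (hypothetical helper name)
# Prints the version number from the first line of `ansible --version`.
# That line looks like "ansible 2.9.27" on older releases and
# "ansible [core 2.13.3]" on newer ones (illustrative values).
ansible_version() {
    "$ANSIBLE" --version | sed -n '1s/[^0-9]*\([0-9][0-9.]*\).*/\1/p'
}

# Example call (not run automatically):
#   echo "Installed Ansible: $(ansible_version)"
```

On Ansible 2.10 and later, `ansible-galaxy collection list community.vmware` should also show which version of the collection that provides `vmware_deploy_ovf` is installed.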