hashicorp / terraform-provider-vsphere

Terraform Provider for VMware vSphere
https://registry.terraform.io/providers/hashicorp/vsphere/
Mozilla Public License 2.0

resource vsphere_virtual_machine - ovf_deploy (local) after v1.23.0 removes NIC and default hardware config #1435

Closed. AdrianBegg closed this issue 2 years ago.

AdrianBegg commented 3 years ago

Terraform Version : 1.0.0

vSphere Provider Version : v1.24.x, v1.25.x, v1.26.x, v2.x

Affected Resource(s)

vsphere_virtual_machine

Terraform Configuration Files

resource "vsphere_virtual_machine" "ova_local" {
  name             = "ABC"
  resource_pool_id = data.vsphere_resource_pool.default.id
  datastore_id     = data.vsphere_datastore.target_ds.id
  host_system_id   = data.vsphere_host.host.id
  datacenter_id    = data.vsphere_datacenter.dc.id
  annotation = "Just an example"
  wait_for_guest_net_timeout = 0
  wait_for_guest_ip_timeout  = 0

  ovf_deploy {
    local_ovf_path    = "C:\\example.ova"
    disk_provisioning = "thin"
    ip_protocol       = "IPv4"
    ovf_network_map = {
      "VM Network" = data.vsphere_network.network.id
    }
  }

  vapp {
    properties = {
      "guestinfo.cis.appliance.root.password" = "Justanexample!!",
      "guestinfo.cis.appliance.ssh.enabled"   = "True",
      "guestinfo.cis.appliance.net.ntp"       = var.ntp_servers,
      "hostname"                              = var.vm_hostname,
      "address"                               = var.ip_address,
      "gateway"                               = var.gateway_address,
      "mtu"                                   = var.mtu
      "dnsServers"                            = var.dnsServers,
      "searchDomains"                         = var.dnsSearchDomains
    }
  }
}

Debug Output

Panic Output

Expected Behavior

The OVA deploys with a network adapter defined in the NetworkSection mapped to the port group specified in the ovf_network_map attribute, and the vCPU and memory assigned to the machine match the defaults defined in the OVF file. This is the behaviour for all versions up to v1.23.0.

Actual Behavior

In v1.24.0 and later, the OVA is deployed and, during deployment, the network adapter defined in the NetworkSection is mapped to the port group specified in the ovf_network_map attribute, and the vCPU and memory assigned to the machine match the defaults defined in the OVF file. However, just before power-on, a second Reconfigure VM task is observed which removes the network adapter, sets the vCPU count to 1, and sets the memory to 1024 MB.

Steps to Reproduce

Important Factoids

It appears that the default properties from the object (num_cpus, memory, and network_interface) are being updated after the OVA/OVF deploy; the expected behavior would be that network_interface is ignored when ovf_deploy is set.
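As an interim workaround until this is resolved, pinning the provider to the last release with the pre-change behaviour avoids the extra reconfigure task (a minimal sketch; the exact version constraint syntax is the only assumption here):

```hcl
terraform {
  required_providers {
    vsphere = {
      source = "hashicorp/vsphere"
      # v1.23.0 is the last release with the expected ovf_deploy behaviour
      version = "= 1.23.0"
    }
  }
}
```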

References

Community Note

koikonom commented 3 years ago

Hello @AdrianBegg!

Please take a look at https://github.com/hashicorp/terraform-provider-vsphere/pull/1339 and the second example here: https://registry.terraform.io/providers/hashicorp/vsphere/latest/docs/resources/virtual_machine#deploying-vm-from-an-ovfova-template . There were some changes around the way OVF templates are handled.

tenthirtyam commented 2 years ago

@koikonom has the correct reference to changes in the way OVF templates are now handled.

Here is an example that should help: https://github.com/tenthirtyam/terrafom-examples-vmware/blob/main/vmware-cloud/vmc-deploy-vra-cexp/main.tf

Recommend: close/not-a-bug

Ryan

tenthirtyam commented 2 years ago

The configuration should look similar to the following:

terraform {
  required_providers {
    vsphere = {
      source  = "hashicorp/vsphere"
      version = ">= 2.0.2"
    }
  }
  required_version = ">= 1.0.8"
}

provider "vsphere" {
  vsphere_server       = var.vsphere_server
  user                 = var.vsphere_username
  password             = var.vsphere_password
  allow_unverified_ssl = var.vsphere_insecure
}

data "vsphere_datacenter" "datacenter" {
  name = var.vsphere_datacenter
}

data "vsphere_resource_pool" "pool" {
  name          = format("%s%s", var.vsphere_cluster, "/Resources")
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

data "vsphere_datastore" "datastore" {
  name          = var.vsphere_datastore
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

data "vsphere_network" "network" {
  name          = var.vsphere_network
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

data "vsphere_host" "host" {
  name          = var.vsphere_host
  datacenter_id = data.vsphere_datacenter.datacenter.id
}

data "vsphere_ovf_vm_template" "ovf" {
  name             = var.name
  resource_pool_id = data.vsphere_resource_pool.pool.id
  datastore_id     = data.vsphere_datastore.datastore.id
  host_system_id   = data.vsphere_host.host.id
  local_ovf_path   = var.local_ovf_path 
  ovf_network_map = {
    "VM Network" : data.vsphere_network.network.id
  }
}

resource "vsphere_virtual_machine" "vm" {
  name                 = var.name
  resource_pool_id     = data.vsphere_resource_pool.pool.id
  datastore_id         = data.vsphere_datastore.datastore.id
  datacenter_id        = data.vsphere_datacenter.datacenter.id
  host_system_id       = data.vsphere_host.host.id
  num_cpus             = data.vsphere_ovf_vm_template.ovf.num_cpus
  num_cores_per_socket = data.vsphere_ovf_vm_template.ovf.num_cores_per_socket
  memory               = data.vsphere_ovf_vm_template.ovf.memory
  guest_id             = data.vsphere_ovf_vm_template.ovf.guest_id
  dynamic "network_interface" {
    for_each = data.vsphere_ovf_vm_template.ovf.ovf_network_map
    content {
      network_id = network_interface.value
    }
  }
  wait_for_guest_net_timeout = 0
  wait_for_guest_ip_timeout  = 0
  ovf_deploy {
    local_ovf_path    = var.local_ovf_path 
    disk_provisioning = "thin"
    ip_protocol       = "IPv4"
    ovf_network_map   = data.vsphere_ovf_vm_template.ovf.ovf_network_map
  }
  vapp {
    properties = {
      "guestinfo.cis.appliance.root.password" = var.root_password,
      "guestinfo.cis.appliance.ssh.enabled"   = "True",
      "guestinfo.cis.appliance.net.ntp"       = var.ntp_servers,
      "hostname"                              = var.vm_hostname,
      "address"                               = var.ip_address,
      "gateway"                               = var.gateway_address,
      "mtu"                                   = var.mtu
      "dnsServers"                            = var.dnsServers,
      "searchDomains"                         = var.dnsSearchDomains
    }
  }
}

@AdrianBegg - can you test and update the issue accordingly?

Ryan @tenthirtyam

AdrianBegg commented 2 years ago

@tenthirtyam : The specific OVA deployment which fails is for the VMware Cloud Availability 4.2.1 Tenant Appliance. If deployed using the above with any version after v1.23.0, the OVA fails to boot with "Unable to apply network adapter settings" and "Failed to start vCAv network initializer" during first boot.

This still appears to be the case when deploying with the approach you have outlined above. This is probably an edge case, and perhaps something that can be addressed with the product team by changing how the appliance performs its first-boot network configuration, but I am not sure whether other products might be impacted.

Working config (with v1.23.0):

resource "vsphere_virtual_machine" "vcda_onprem" {
  name                       = var.vm_hostname
  resource_pool_id           = data.vsphere_resource_pool.default.id
  datastore_id               = data.vsphere_datastore.target_ds.id
  host_system_id             = data.vsphere_host.host.id
  datacenter_id              = data.vsphere_datacenter.dc.id
  wait_for_guest_net_timeout = 0
  wait_for_guest_ip_timeout  = 0
  num_cpus                   = 4
  memory                     = 10240
  sync_time_with_host        = true

  ovf_deploy {
    local_ovf_path    = var.vcda_onprem_ova
    disk_provisioning = "thin"
    ip_protocol       = "IPv4"
    ovf_network_map = {
      "VM Network" = data.vsphere_network.network.id
    }
  }

  vapp {
    properties = {
      "guestinfo.cis.appliance.root.password" = "Ex@mple!12", # A temporary password only for build time (must be changed at first boot) random_password.vcda_root_pwd.result
      "guestinfo.cis.appliance.ssh.enabled"   = "True",
      "guestinfo.cis.appliance.net.ntp"       = var.ntp_servers,
      "hostname"                              = var.vm_hostname,
      "address"                               = var.ip_address,
      "gateway"                               = var.gateway_address,
      "mtu"                                   = var.mtu
      "dnsServers"                            = var.dnsServers,
      "searchDomains"                         = var.dnsSearchDomains
    }
  }

  lifecycle {
    ignore_changes = [
      storage_policy_id,
      host_system_id,
      vapp[0]
    ]
  }
}

Failing config (with v2.0.2):

data "vsphere_ovf_vm_template" "vcda_onprem" {
  name             = var.vm_hostname
  resource_pool_id = data.vsphere_resource_pool.default.id
  datastore_id     = data.vsphere_datastore.target_ds.id
  host_system_id   = data.vsphere_host.host.id
  local_ovf_path   = var.vcda_onprem_ova
  ovf_network_map = {
    "VM Network" : data.vsphere_network.network.id
  }
}

resource "vsphere_virtual_machine" "vcda_onprem" {
  name                 = var.vm_hostname
  resource_pool_id     = data.vsphere_resource_pool.default.id
  datastore_id         = data.vsphere_datastore.target_ds.id
  host_system_id       = data.vsphere_host.host.id
  datacenter_id        = data.vsphere_datacenter.dc.id
  num_cpus             = data.vsphere_ovf_vm_template.vcda_onprem.num_cpus
  num_cores_per_socket = data.vsphere_ovf_vm_template.vcda_onprem.num_cores_per_socket
  memory               = data.vsphere_ovf_vm_template.vcda_onprem.memory
  guest_id             = data.vsphere_ovf_vm_template.vcda_onprem.guest_id
  dynamic "network_interface" {
    for_each = data.vsphere_ovf_vm_template.vcda_onprem.ovf_network_map
    content {
      network_id = network_interface.value
    }
  }
  wait_for_guest_net_timeout = 0
  wait_for_guest_ip_timeout  = 0
  ovf_deploy {
    local_ovf_path    = var.vcda_onprem_ova
    disk_provisioning = "thin"
    ip_protocol       = "IPv4"
    ovf_network_map   = data.vsphere_ovf_vm_template.vcda_onprem.ovf_network_map
  }

  vapp {
    properties = {
      "guestinfo.cis.appliance.root.password" = "Ex@mple!12", # A temporary password only for build time (must be changed at first boot) random_password.vcda_root_pwd.result
      "guestinfo.cis.appliance.ssh.enabled"   = "True",
      "guestinfo.cis.appliance.net.ntp"       = var.ntp_servers,
      "hostname"                              = var.vm_hostname,
      "address"                               = var.ip_address,
      "gateway"                               = var.gateway_address,
      "mtu"                                   = var.mtu
      "dnsServers"                            = var.dnsServers,
      "searchDomains"                         = var.dnsSearchDomains
    }
  }

  lifecycle {
    ignore_changes = [
      storage_policy_id,
      host_system_id,
      vapp[0]
    ]
  }
}

tenthirtyam commented 2 years ago

@AdrianBegg - It does appear that the issue is, in fact, specifically related to the VMware Cloud Director Availability OVA first-boot. I took some time to look at it this morning with VCDA v4.2.1 and I get the same issue. There is a KB article pertaining to this error being seen in the traditional UI-based deployment.

What's interesting is that if I deploy one manually and another via Terraform, the vApp options are exactly the same apart from the moref and MAC address. I'm curious whether it's an issue with how the variables are being passed, and whether the same issue would occur with ovftool, which would rule out an issue with the provider.
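For anyone who wants to run that comparison, an ovftool deployment with the same vApp properties would look roughly like the following (the vCenter address, credentials, inventory path, file name, and property values are placeholders, not taken from this issue):

```shell
# Deploy the VCDA OVA with ovftool, passing the same vApp properties
# that the Terraform configuration sets. All values are placeholders.
ovftool \
  --acceptAllEulas \
  --name="vcda-ovftool" \
  --datastore="datastore1" \
  --diskMode=thin \
  --net:"VM Network"="VM Network" \
  --prop:hostname="vcda-ovftool" \
  --prop:address="192.0.2.100/24" \
  --prop:gateway="192.0.2.1" \
  --prop:guestinfo.cis.appliance.ssh.enabled="True" \
  --powerOn \
  ./VMware-Cloud-Director-Availability-On-Premises.ova \
  "vi://administrator@vsphere.local@vcenter.example.com/Datacenter/host/Cluster"
```

If the appliance boots correctly when deployed this way, the provider is the more likely culprit.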

Ryan @tenthirtyam

tenthirtyam commented 2 years ago

Hi @AdrianBegg,

I took a fresh look at the issue this afternoon and discovered the cause of the issue.

The reason that the VMware Cloud Director Availability appliance is having issues is related to the SCSI Controller. The default controller in the Terraform provider is pvscsi if no scsi_type is provided.

If you take a look at the VCDA .ovf file, you'll see the following:

      <Item>
        <rasd:Address xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData">0</rasd:Address>
        <rasd:ElementName xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData">SCSI Controller 0  - lsilogic</rasd:ElementName>
        <rasd:InstanceID xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData">4</rasd:InstanceID>
        <rasd:ResourceSubType>lsilogic</rasd:ResourceSubType>
        <rasd:ResourceType>6</rasd:ResourceType>
      </Item>

This means that lsilogic (LSI Logic Parallel) should be used. However, the configuration you provided does not set the scsi_type to lsilogic, so the provider defaults to pvscsi. Photon OS will attempt to boot, but the first-boot configuration will fail, and you'll see the issue as you reported.

To resolve this issue, add the following:

resource "vsphere_virtual_machine" "vcda_onprem" {
  # ... other configurations ...
  scsi_type  = data.vsphere_ovf_vm_template.vcda_onprem.scsi_type
  # ... other configurations ...
}
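If you prefer not to read the value from the vsphere_ovf_vm_template data source, the controller type can also be set directly; a minimal sketch, assuming the lsilogic subtype shown in the OVF excerpt above:

```hcl
resource "vsphere_virtual_machine" "vcda_onprem" {
  # ... other configurations ...
  # Matches the lsilogic ResourceSubType declared in the VCDA OVF
  scsi_type = "lsilogic"
  # ... other configurations ...
}
```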

Results:

> terraform apply --auto-approve

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # vsphere_virtual_machine.vcda_onprem will be created
  + resource "vsphere_virtual_machine" "vcda_onprem" {
      + boot_retry_delay                        = 10000
      + change_version                          = (known after apply)
      + cpu_limit                               = -1
      + cpu_share_count                         = (known after apply)
      + cpu_share_level                         = "normal"
      + datacenter_id                           = "datacenter-3"
      + datastore_id                            = "datastore-11"
      + default_ip_address                      = (known after apply)
      + ept_rvi_mode                            = "automatic"
      + firmware                                = "bios"
      + force_power_off                         = true
      + guest_id                                = "other3xLinux64Guest"
      + guest_ip_addresses                      = (known after apply)
      + hardware_version                        = (known after apply)
      + host_system_id                          = "host-10"
      + hv_mode                                 = "hvAuto"
      + id                                      = (known after apply)
      + ide_controller_count                    = 2
      + imported                                = (known after apply)
      + latency_sensitivity                     = "normal"
      + memory                                  = 4096
      + memory_limit                            = -1
      + memory_share_count                      = (known after apply)
      + memory_share_level                      = "normal"
      + migrate_wait_timeout                    = 30
      + moid                                    = (known after apply)
      + name                                    = "vcda-terraform"
      + num_cores_per_socket                    = 1
      + num_cpus                                = 4
      + poweron_timeout                         = 300
      + reboot_required                         = (known after apply)
      + resource_pool_id                        = "resgroup-6046"
      + run_tools_scripts_after_power_on        = true
      + run_tools_scripts_after_resume          = true
      + run_tools_scripts_before_guest_shutdown = true
      + run_tools_scripts_before_guest_standby  = true
      + sata_controller_count                   = 0
      + scsi_bus_sharing                        = "noSharing"
      + scsi_controller_count                   = 1
      + scsi_type                               = "lsilogic"
      + shutdown_wait_timeout                   = 3
      + storage_policy_id                       = (known after apply)
      + swap_placement_policy                   = "inherit"
      + uuid                                    = (known after apply)
      + vapp_transport                          = (known after apply)
      + vmware_tools_status                     = (known after apply)
      + vmx_path                                = (known after apply)
      + wait_for_guest_ip_timeout               = 0
      + wait_for_guest_net_routable             = true
      + wait_for_guest_net_timeout              = 0

      + disk {
          + attach            = (known after apply)
          + controller_type   = (known after apply)
          + datastore_id      = (known after apply)
          + device_address    = (known after apply)
          + disk_mode         = (known after apply)
          + disk_sharing      = (known after apply)
          + eagerly_scrub     = (known after apply)
          + io_limit          = (known after apply)
          + io_reservation    = (known after apply)
          + io_share_count    = (known after apply)
          + io_share_level    = (known after apply)
          + keep_on_remove    = (known after apply)
          + key               = (known after apply)
          + label             = (known after apply)
          + path              = (known after apply)
          + size              = (known after apply)
          + storage_policy_id = (known after apply)
          + thin_provisioned  = (known after apply)
          + unit_number       = (known after apply)
          + uuid              = (known after apply)
          + write_through     = (known after apply)
        }

      + network_interface {
          + adapter_type          = "vmxnet3"
          + bandwidth_limit       = -1
          + bandwidth_reservation = 0
          + bandwidth_share_count = (known after apply)
          + bandwidth_share_level = "normal"
          + device_address        = (known after apply)
          + key                   = (known after apply)
          + mac_address           = (known after apply)
          + network_id            = "network-16"
        }

      + ovf_deploy {
          + allow_unverified_ssl_cert = false
          + disk_provisioning         = "thin"
          + enable_hidden_properties  = false
          + local_ovf_path            = "/Users/johnsonryan/Downloads/VMware-Cloud-Director-Availability-On-Premises-4.2.1.2610807-98aa4437ed_OVF10.ova"
          + ovf_network_map           = {
              + "VM Network" = "network-16"
            }
        }

      + vapp {
          + properties = {
              + "address"                               = "172.16.11.100/24"
              + "dnsServers"                            = "172.16.11.11,172.16.11.12"
              + "gateway"                               = "172.16.11.1"
              + "guestinfo.cis.appliance.net.ntp"       = "172.16.11.11,172.16.11.12"
              + "guestinfo.cis.appliance.root.password" = (sensitive)
              + "guestinfo.cis.appliance.ssh.enabled"   = "True"
              + "hostname"                              = "vcda-terraform"
              + "mtu"                                   = "1500"
              + "searchDomains"                         = "rainpole.io"
            }
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.
vsphere_virtual_machine.vcda_onprem: Creating...
vsphere_virtual_machine.vcda_onprem: Still creating... [10s elapsed]
# ... bend space and time....
vsphere_virtual_machine.vcda_onprem: Creation complete after 3m18s [id=42024fb6-31b5-e849-5d81-5db27256a98b]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

You can find a full example here: https://github.com/tenthirtyam/terrafom-examples-vmware/tree/main/vsphere/vsphere-virtual-machine/clone-ovf-vcda

Hope this helps, Ryan @tenthirtyam

Recommend: close/not-a-bug

tenthirtyam commented 2 years ago

Hi @AdrianBegg,

Did the above example and information help to address your issue?

Specifically, to resolve this issue, add the following to set the correct scsi_type from the data source.

resource "vsphere_virtual_machine" "vcda_onprem" {
  # ... other configurations ...
  scsi_type  = data.vsphere_ovf_vm_template.vcda_onprem.scsi_type
  # ... other configurations ...
}

Ryan

tenthirtyam commented 2 years ago

Hi @AdrianBegg - checking in to see if the previous comment helped to resolve your issue deploying the VCDA using Terraform. Would you mind updating the issue?

Thanks! Ryan

tenthirtyam commented 2 years ago

cc @iBrandyJackson and @appilon for review and closure consideration.

appilon commented 2 years ago

Closing based on suggestion and time since last response. Please open a new issue if needed.

github-actions[bot] commented 2 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.