hashicorp / terraform-provider-vsphere

Terraform Provider for VMware vSphere
https://registry.terraform.io/providers/hashicorp/vsphere/
Mozilla Public License 2.0

Terraform customizations just sit waiting. #988

Closed bishopbm1 closed 1 month ago

bishopbm1 commented 4 years ago

Terraform Version

# terraform -v
Terraform v0.12.2

Your version of Terraform is out of date! The latest version
is 0.12.23. You can update by downloading from www.terraform.io/downloads.html

vSphere Provider Version

# terraform providers
.
└── provider.vsphere

# ll .terraform/plugins/linux_amd64/
total 39940
-rwxrwxr-x. 1 root st2packs       83 Mar  7 10:29 lock.json
-rwxr-xr-x. 1 root st2packs 40894112 Mar  4 09:51 terraform-provider-vsphere_v1.16.1_x4

Affected Resource(s)

vsphere_virtual_machine

Terraform Configuration Files

# This file contains the resource that clones a windows template
resource "vsphere_virtual_machine" "windows_vm" {
  count = var.vm_os_type == "windows" ? 1 : 0

  name             = format("%s.%s", var.vm_hostname, var.vm_domain)
  resource_pool_id = data.vsphere_resource_pool.pool.id

  datastore_id         = data.vsphere_datastore.datastore.id

  num_cpus = var.vm_num_cpu
  memory   = var.vm_memory_mb

  # enable hot-add for memory and CPU
  cpu_hot_add_enabled    = true
  memory_hot_add_enabled = true

  guest_id  = data.vsphere_virtual_machine.template.guest_id
  scsi_type = data.vsphere_virtual_machine.template.scsi_type

  # Replace the literal '\n' with the escape sequence for a new line
  annotation = replace(var.vm_description, "\\n", "\n")
  folder     = var.vm_folder

  # NIC definition, same as the template (hardware info)
  network_interface {
    network_id   = data.vsphere_network.network.id
    adapter_type = data.vsphere_virtual_machine.template.network_interface_types[0]
  }

  # Disk information, same as the template (hardware info)
  dynamic "disk" {
    for_each = data.vsphere_virtual_machine.template.disks

    content {
      label            = "disk${disk.key}"
      size             = disk.value.size
      eagerly_scrub    = disk.value.eagerly_scrub
      thin_provisioned = disk.value.thin_provisioned
      unit_number      = disk.key
    }
  }

  # Clone from a template
  clone {
    template_uuid = data.vsphere_virtual_machine.template.id

    # Perform customization after the clone
    customize {
      windows_options {
        admin_password        = var.windows_admin_password
        computer_name         = var.vm_hostname
        join_domain           = var.vm_domain
        domain_admin_user     = var.windows_ad_username
        domain_admin_password = var.windows_ad_password
      }

      network_interface {
        ipv4_address = var.vm_ip_address
        ipv4_netmask = var.vm_ip_netmask
      }

      ipv4_gateway    = var.vm_ip_gateway
      dns_server_list = var.vm_dns_server_list

      timeout = var.tf_timeout
    }
  }
}

Debug Output

Terraform_Provision_Logs.zip

Expected Behavior

I would expect Terraform to wait for customization to complete and to turn on the VM after it finishes.

Actual Behavior

This environment is heavily loaded and can take a while to do even simple tasks such as snapshots. As a result, the Clone/Customization/PowerOn tasks can take a while to complete. Because of this, it seems that Terraform hits a race condition and just waits, even though on the VMware side the VM is up and has been successfully customized.

The condition appears to occur in the following lines: https://github.com/terraform-providers/terraform-provider-vsphere/blob/60243ec5db6dfcd192a2a719d1be8a59aec7c791/vsphere/resource_vsphere_virtual_machine.go#L1313-L1344

I see in the uploaded SOAP calls that we get the system time, but I do not see that it ever checks whether the customizations are complete. I would think there needs to be a periodic check during the wait to see whether the customizations have finished.

Steps to Reproduce

  1. terraform apply

Workaround

A workaround for this issue is to set the customize timeout to 0 and then handle the wait in a different way.
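
As a rough illustration of this workaround, the fragment below sets the customize timeout to 0, which per the provider documentation disables the customization waiter. It is a sketch, not a complete configuration: the elided arguments are the ones from the configuration in this report, and the use of wait_for_guest_net_timeout (with an arbitrary value of 15 minutes) as the replacement wait is an assumption, not something the reporter described. Note that the guest network waiter only confirms that the guest reports an IP address, which may happen before customization has actually finished.

```hcl
resource "vsphere_virtual_machine" "windows_vm" {
  # ... other arguments unchanged from the configuration in this report ...

  # Assumption: lean on the guest network waiter instead of the
  # customization waiter. The value is in minutes; 15 is illustrative.
  wait_for_guest_net_timeout = 15

  clone {
    template_uuid = data.vsphere_virtual_machine.template.id

    customize {
      # ... windows_options, network_interface, gateway and DNS unchanged ...

      # 0 (or a negative value) disables the customization waiter,
      # so Terraform no longer blocks on the customization events.
      timeout = 0
    }
  }
}
```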

azbpa commented 4 years ago

I am facing a very similar issue when using customization. The key issue is that TF sometimes does not recognize that the VM was provisioned and customized successfully (and the behavior seems quite random to me).

When it happens, TF remains in the state "Waiting for VM customization to complete", but in vSphere I can see the event indicating that the customization finished successfully. After 10 minutes of waiting, TF times out (although the VM was cloned and customized in less than 3 minutes).

bishopbm1 commented 4 years ago

Yes, this is the exact issue we had. If you set the timeout to 0 and handle the waiting some other way (if you need to wait at all), it will work.
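
One possible way to "handle the waiting some other way", shown only as a sketch: a fixed delay using the time_sleep resource from the hashicorp/time provider. This assumes that provider is available in the configuration; the resource name after_customization and the 5 minute duration are hypothetical and would need tuning for a loaded environment. It is a blind delay, not a check that customization actually finished.

```hcl
# Assumption: the hashicorp/time provider is installed alongside vsphere.
resource "time_sleep" "after_customization" {
  # Start the delay only after the VM resource has been created.
  depends_on = [vsphere_virtual_machine.windows_vm]

  # Illustrative value; pick something longer than a typical customization run.
  create_duration = "5m"
}
```

Anything that must run only after customization could then declare depends_on = [time_sleep.after_customization].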

Marshall-Hallenbeck commented 8 months ago

@bishopbm1 did you ever find a workaround in the last 4 years?

burnsjared0415 commented 1 month ago

@bishopbm1 I am having trouble reproducing this issue, though I have a smaller environment. Are you asking for a wait timer to be added to the code?

github-actions[bot] commented 1 week ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.