hashicorp / terraform-provider-vsphere

Terraform Provider for VMware vSphere
https://registry.terraform.io/providers/hashicorp/vsphere/
Mozilla Public License 2.0

Add support for pulling OVF/OVA directly from remote_ovf_url to ESXi host #2134

Open cbascom opened 6 months ago

cbascom commented 6 months ago

Description

Starting with the vSphere 6.7 API, there is an option to upgrade the current NfcLease from push to pull mode which allows the ESXi host to pull file(s) directly from the remote URL.

This would eliminate the need to proxy the download through the host that is running Terraform, which would be a big win when that machine has a slow connection to either the remote URL or the vSphere server.

Use Case(s)

When the ESXi host has direct connectivity to the remote URL, this option would allow for much faster file transfers, especially when the host running Terraform has a slow connection to the remote URL, the ESXi host, or both.

Potential Terraform Provider Configuration

The following new parameters, pull_upload_mode and pull_ssl_thumbprint, would be added to the ovf_deploy block of the vsphere_virtual_machine resource:

resource "vsphere_virtual_machine" "vmFromRemoteOvf" {
  name             = "remote-foo"
  datacenter_id    = data.vsphere_datacenter.datacenter.id
  datastore_id     = data.vsphere_datastore.datastore.id
  host_system_id   = data.vsphere_host.host.id
  resource_pool_id = data.vsphere_resource_pool.default.id

  wait_for_guest_net_timeout = 0
  wait_for_guest_ip_timeout  = 0

  ovf_deploy {
    allow_unverified_ssl_cert = false
    remote_ovf_url            = "https://example.com/foo.ova"
    pull_upload_mode          = true
    pull_ssl_thumbprint       = "BA:C6:4E:D9:AD:D4:53:B5:86:5A:5D:70:36:CF:89:93:D1:6C:F9:63"
    disk_provisioning         = "thin"
    ip_protocol               = "IPV4"
    ip_allocation_policy      = "STATIC_MANUAL"
    ovf_network_map = {
      "Network 1" = data.vsphere_network.network.id
      "Network 2" = data.vsphere_network.network.id
    }
  }
  vapp {
    properties = {
      "guestinfo.hostname"     = "remote-foo.example.com",
      "guestinfo.ipaddress"    = "172.16.11.101",
      "guestinfo.netmask"      = "255.255.255.0",
      "guestinfo.gateway"      = "172.16.11.1",
      "guestinfo.dns"          = "172.16.11.4",
      "guestinfo.domain"       = "example.com",
      "guestinfo.ntp"          = "ntp.example.com",
      "guestinfo.password"     = "VMware1!",
      "guestinfo.ssh"          = "True"
    }
  }
}

github-actions[bot] commented 6 months ago

Hello, cbascom! 🖐

Thank you for submitting an issue for this provider. The issue will now enter into the issue lifecycle.

If you want to contribute to this project, please review the contributing guidelines and information on submitting pull requests.

andrii-korchevnyi-rft commented 1 month ago

Hi @cbascom, I think this issue is really great. Could you please explain the flow a bit, as I'm struggling to find any information about it? We have an ESXi host, the remote_ovf_url, and a VM where we run Terraform. I thought that the ESXi host would contact the remote_ovf_url directly and download the file from there without making any connections to the VM where Terraform runs. But according to the issue I'm completely wrong, and the ESXi host downloads it through the VM. Is that correct? I also can't work out who resolves the remote_ovf_url FQDN. Say the ESXi host is in the US, the VM is in the UK, and remote_ovf_url is a VIP that should return the closest location. In that case, will it be resolved to the US, with the ESXi host trying to download the OVA template (which points to the US) through the VM in the UK?

cbascom commented 1 month ago

@andrii-korchevnyi-rft

Hi @cbascom, I think this issue is really great. Could you please explain the flow a bit, as I'm struggling to find any information about it?

I'm not very familiar with the codebase, but this is what I see currently:

  1. The VM that triggered Terraform downloads the full OVA from remote_ovf_url to get the OVF descriptor from the OVA: https://github.com/hashicorp/terraform-provider-vsphere/blob/c61a01a8ccccea1f1d2688016961350d1e9c069e/vsphere/internal/helper/ovfdeploy/ovf_helper.go#L309
  2. That OVF descriptor is used to create the ImportSpec: https://github.com/hashicorp/terraform-provider-vsphere/blob/c61a01a8ccccea1f1d2688016961350d1e9c069e/vsphere/internal/helper/ovfdeploy/ovf_helper.go#L546
  3. The HttpNfcLease is returned by ImportVApp which uses the import spec: https://github.com/hashicorp/terraform-provider-vsphere/blob/c61a01a8ccccea1f1d2688016961350d1e9c069e/vsphere/internal/helper/ovfdeploy/ovf_helper.go#L58
  4. Each OvfFileItem in the import spec is uploaded: https://github.com/hashicorp/terraform-provider-vsphere/blob/c61a01a8ccccea1f1d2688016961350d1e9c069e/vsphere/internal/helper/ovfdeploy/ovf_helper.go#L96
  5. For each disk that needs to be uploaded, the entire OVA is downloaded again to your local VM to extract the vmdk and upload that to the ESXi host: https://github.com/hashicorp/terraform-provider-vsphere/blob/c61a01a8ccccea1f1d2688016961350d1e9c069e/vsphere/internal/helper/ovfdeploy/ovf_helper.go#L249

So the flow is:

  1. The VM downloads the OVA from remote_ovf_url (which means the VM resolves the FQDN) to get the OVF descriptor and throws away the rest of the download.
  2. For each disk specified in the OVF descriptor:
     a. The VM downloads the OVA from remote_ovf_url (which means the VM resolves the FQDN) to get the disk file.
     b. The VM uploads the disk file to the ESXi host.

So basically, assuming I didn't miss anything, your Terraform VM downloads the OVA multiple times and uploads the files contained in it to the ESXi host. The ESXi host never does anything with the remote_ovf_url directly.
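
As a rough worked example of what that costs (the sizes are made-up illustration values, not measurements, and this assumes the flow really is as described above): if the OVA is S bytes and its OVF descriptor lists N disks, the Terraform VM downloads about

$$
D_{\text{down}} = (1 + N)\,S
$$

in total, one full pass for the descriptor plus one per disk, on top of uploading the disk files to the ESXi host. For a hypothetical 4 GB OVA with 2 disks, that is $(1 + 2) \times 4 = 12$ GB pulled through the Terraform VM instead of 4 GB.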

andrii-korchevnyi-rft commented 1 month ago

@cbascom thank you so much for the quick answer. We captured the traffic from the VM that runs Terraform and got the same result.

Another interesting thing (maybe it will be useful to someone): we split the process into two steps. We manually downloaded the OVA to the VM and used local_ovf_path. That way we could download the OVA in 30 s and deploy the VM to the ESXi host in 4 min 30 s; without splitting it, the deployment took more than 1 h.
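
For anyone who wants to try the same split, a minimal sketch might look like the following. It assumes the OVA has already been downloaded to the machine running Terraform before `terraform apply`. The variable name local_ova_path, its default path, and the resource name are hypothetical, and the data sources are the same ones assumed in the example at the top of the issue.

```hcl
# Sketch only. Assumes the OVA was downloaded to the Terraform host beforehand
# (for example with curl or wget), so the provider reads it from local disk.
variable "local_ova_path" {
  description = "Path to the pre-downloaded OVA on the Terraform host (hypothetical variable)"
  type        = string
  default     = "/tmp/foo.ova" # placeholder path
}

resource "vsphere_virtual_machine" "vmFromLocalOvf" {
  name             = "local-foo"
  datacenter_id    = data.vsphere_datacenter.datacenter.id
  datastore_id     = data.vsphere_datastore.datastore.id
  host_system_id   = data.vsphere_host.host.id
  resource_pool_id = data.vsphere_resource_pool.default.id

  wait_for_guest_net_timeout = 0
  wait_for_guest_ip_timeout  = 0

  ovf_deploy {
    # local_ovf_path replaces remote_ovf_url, so nothing is fetched over HTTP
    # during the deployment itself.
    local_ovf_path    = var.local_ova_path
    disk_provisioning = "thin"
    ovf_network_map = {
      "Network 1" = data.vsphere_network.network.id
    }
  }
}
```

The disk files still travel from the Terraform host to the ESXi host, so this only removes the repeated downloads from the remote URL. It does not give the direct ESXi pull that this issue asks for.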