vmware / terraform-provider-vcd

Terraform VMware Cloud Director provider
https://www.terraform.io/docs/providers/vcd/
Mozilla Public License 2.0

Use parallelism during creation of a vApp with multiple VM's #806

Open linuxcrash opened 2 years ago

linuxcrash commented 2 years ago

Description

We often create a vApp containing up to 24 VMs through Terraform on vCD. The VMs are all cloned from a base VM image and then customized after power-on. Looking at how the vApp is created, the provider creates one VM after the other, so building the vApp takes an hour or more.

Since vCD can run tasks in parallel, it would be fantastic if we could take advantage of that when creating a vApp, by allowing some degree of parallelism for the VM creation.

For example, if 5 VMs were created at the same time, the total creation time for the very same vApp would probably drop below 30 minutes.
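
A rough back-of-the-envelope check of that estimate, as a minimal sketch. It assumes the 24 VMs take about 60 minutes in total (so roughly 2.5 minutes each) and that per-VM time does not get worse when 5 run at once; both are assumptions, not measurements, and the function name is made up for illustration.

```python
import math


def estimated_minutes(vm_count: int, minutes_per_vm: float, batch_size: int) -> float:
    """Wall-clock estimate when VMs are created in batches of `batch_size`."""
    batches = math.ceil(vm_count / batch_size)
    return batches * minutes_per_vm


# Assumption: 24 VMs in about 60 minutes means roughly 2.5 minutes per VM.
per_vm = 60 / 24
sequential = estimated_minutes(24, per_vm, 1)  # 24 batches of one VM -> 60.0
batched = estimated_minutes(24, per_vm, 5)     # 5 batches of up to five VMs -> 12.5

if __name__ == "__main__":
    print(f"one at a time: {sequential} min, five at a time: {batched} min")
```

Even with generous slack for slower clones under load, this stays comfortably under the 30 minutes mentioned above.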

Didainius commented 2 years ago

Hello @linuxcrash, apparently this is not so straightforward. The only way VCD can create VMs within a vApp in parallel is when they are all submitted in one API call. The way Terraform works (one resource per entity) does not allow us to aggregate multiple resources (VMs) into a single API call.

There is another option for spinning up VMs in parallel: use a standalone VM, for which we have a resource (vcd_vm). Could this possibly help you?

Nielszy commented 4 months ago

@Didainius and @dataclouder I have the same question about creating IP Sets in parallel. When creating multiple IP Sets with the Terraform provider (with the default Terraform parallelism of 10), it takes about 2m20s per IP Set; increasing or decreasing the parallelism does not change the time it takes per IP Set to be created in VCD. Is it possible to increase the number of IP Sets that are created at the same time in VCD? We would like to create 590 IP Sets on a new Virtual Data Center Edge Gateway, but that takes multiple hours with the way Terraform and VCD currently handle the creation of IP Sets.

We are using the latest version of the provider and the latest stable version of Terraform. The VCD version is 10.5.1.1.

Didainius commented 4 months ago

Hello @Nielszy, VCD does not allow creating multiple IP Sets in parallel, so we have a lock internally in the Terraform provider to avoid getting errors.

Nielszy commented 4 months ago

Hi @Didainius, thanks for the quick response! At our organization we use the Terraform provider extensively, but we ran into an interesting problem when creating IP Sets on a new Virtual Data Center Edge Gateway. It takes between one and two minutes for an IP Set to be created, and the creation is handled sequentially (it would take 5+ hours to create 690 IP Sets, as we would like to do in our specific situation).

I wanted to test things a bit more to see if the speed issue could be solved. To rule out problems with Terraform or our environment, I created a small Python script that creates 5 IP Sets simultaneously by posting the data directly to the VCD API, and it turns out VCD can create IP Sets in parallel just fine! This is the Python code:

import json

import requests

all_ip_sets = [
{
  "orgRef": {
    "name": "TEST",
    "id": "urn:vcloud:org:test"
  },
  "ownerRef": {
    "name": "TEST-GLOBAL-DC-GROUP",
    "id": "urn:vcloud:vdcGroup:test"
  },
  "name": "TEST1",
  "type": "IP_SET",
  "typeValue": "IP_SET",
  "ipAddresses": [
    "10.23.23.23/24"
  ]
},
{
  "orgRef": {
    "name": "TEST",
    "id": "urn:vcloud:org:test"
  },
  "ownerRef": {
    "name": "TEST-GLOBAL-DC-GROUP",
    "id": "urn:vcloud:vdcGroup:test"
  },
  "name": "TEST2",
  "type": "IP_SET",
  "typeValue": "IP_SET",
  "ipAddresses": [
    "10.23.23.23/24"
  ]
},
{
  "orgRef": {
    "name": "TEST",
    "id": "urn:vcloud:org:test"
  },
  "ownerRef": {
    "name": "TEST-GLOBAL-DC-GROUP",
    "id": "urn:vcloud:vdcGroup:test"
  },
  "name": "TEST3",
  "type": "IP_SET",
  "typeValue": "IP_SET",
  "ipAddresses": [
    "10.23.23.23/24"
  ]
},
{
  "orgRef": {
    "name": "TEST",
    "id": "urn:vcloud:org:test"
  },
  "ownerRef": {
    "name": "TEST-GLOBAL-DC-GROUP",
    "id": "urn:vcloud:vdcGroup:test"
  },
  "name": "TEST4",
  "type": "IP_SET",
  "typeValue": "IP_SET",
  "ipAddresses": [
    "10.23.23.23/24"
  ]
},
{
  "orgRef": {
    "name": "TEST",
    "id": "urn:vcloud:org:test"
  },
  "ownerRef": {
    "name": "TEST-GLOBAL-DC-GROUP",
    "id": "urn:vcloud:vdcGroup:test"
  },
  "name": "TEST5",
  "type": "IP_SET",
  "typeValue": "IP_SET",
  "ipAddresses": [
    "10.23.23.23/24"
  ]
}
]

# Note: get_vcd_access_token, bearer_token_path, vcd_api_token, vdc_dc_group and
# test_cloud_hostname are assumed to be defined elsewhere (not shown in this post).
def create_vdc_ip_set(test_cloud_hostname):
    access_token = get_vcd_access_token(test_cloud_hostname, bearer_token_path, vcd_api_token)
    url = f"{test_cloud_hostname}/cloudapi/1.0.0/firewallGroups"
    for ip_set in all_ip_sets:
        ip_set_name = ip_set["name"]
        # One POST per IP Set; the request body is the JSON-encoded IP Set.
        test_cloud_response = requests.post(
            url,
            data=json.dumps(ip_set),
            timeout=10,
            headers={
                "Accept": "*/*;version=38.1",
                "Authorization": f"Bearer {access_token}",
                "Content-Type": "application/json",
            },
        )
        # 202 means the request was accepted; the loop does not wait for the IP Set to exist.
        if test_cloud_response.status_code != 202:
            print(f"Encountered an error while creating the IP Set (HTTP status code == {test_cloud_response.status_code} and the reason == {test_cloud_response.reason}).")
            break
        else:
            print(f"Request is accepted (HTTP status code == {test_cloud_response.status_code}). About to create IP Set with name: '{ip_set_name}' in Data Center Group: {vdc_dc_group}.")

if __name__ == "__main__":
    create_vdc_ip_set(test_cloud_hostname)

In the screenshot below, taken from the VCD GUI one second after starting the Python script, you can clearly see that all 5 IP Sets are being created simultaneously: [screenshot: VCD GUI IP Sets]

It took only 4 seconds to create 5 IP Sets in parallel!
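
The loop in the script sends its POSTs one after another and moves on as soon as each request is accepted, so the overlap happens on the VCD side. For comparison, here is a minimal sketch of what sending the requests concurrently from the client could look like. The names submit_all and fake_post are hypothetical; fake_post is only a stand-in for a real requests.post call so the example runs without a VCD instance.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def submit_all(
    payloads: List[dict],
    post: Callable[[dict], int],
    max_workers: int = 5,
) -> Dict[str, int]:
    """Submit every payload concurrently and return {name: HTTP status code}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with payloads.
        statuses = list(pool.map(post, payloads))
    return {payload["name"]: status for payload, status in zip(payloads, statuses)}


def fake_post(payload: dict) -> int:
    """Hypothetical stand-in for requests.post(...).status_code; always 'accepted'."""
    return 202


if __name__ == "__main__":
    demo = [{"name": f"TEST{i}", "type": "IP_SET"} for i in range(1, 6)]
    print(submit_all(demo, fake_post))
```

In a real run, the post callable would wrap the same requests.post call used in the script above; whether VCD accepts that many simultaneous requests against one Edge Gateway is exactly the open question in this thread.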

After performing this test we are curious about two things:

Could you maybe help us out? Thanks for your time!

Didainius commented 4 months ago

Hey, 1m30s for a single IP Set is odd. This needs checking. Could you enable logging and see how the API calls are going? https://registry.terraform.io/providers/vmware/vcd/latest/docs#logging (You could also share the logs with us. Passwords should be obfuscated, but you might want to obfuscate your hostnames as well.)

As for Python: I haven't coded in Python for 6+ years now, but is this really doing it in parallel? Isn't it going sequentially, one by one, given the loop "for ip_set in all_ip_sets:"?

Nielszy commented 4 months ago

@Didainius What happens in the Python script is that it makes one POST call to the VCD API for every IP Set, because the API does not allow multiple IP Sets to be sent in one call. The script makes those 5 calls within 0.5 seconds, VCD starts creating the 5 IP Sets at (practically) the same time, and after about 4 seconds all 5 have been created successfully. I agree it is not truly parallel, but it is really fast and VCD handles it well.

This is different when using the Terraform provider. Let me explain: I just created the same 5 IP Sets on the same VCD Edge Gateway with Terraform (all default Terraform settings):

Plan: 5 to add, 1 to change, 0 to destroy.

Changes to Outputs:
  ~ ids = {
      + TEST1                                        = (known after apply)
      + TEST2                                        = (known after apply)
      + TEST3                                        = (known after apply)
      + TEST4                                        = (known after apply)
      + TEST5                                        = (known after apply)
        # (43 unchanged attributes hidden)
    }

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

vcd_nsxt_ip_set.ip_sets["TEST4"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST3"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST5"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST1"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST2"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Creation complete after 14s [id=urn:vcloud:firewallGroup:eb122b45-8362-4814-94da-d2bd35840740]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Creation complete after 24s [id=urn:vcloud:firewallGroup:0ae23394-e393-4cfe-aac8-68143c651759]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Creation complete after 48s [id=urn:vcloud:firewallGroup:c1a6fb80-023f-4207-9d90-20abba6b2a92]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Creation complete after 1m11s [id=urn:vcloud:firewallGroup:363d9071-e871-43f6-b837-bf913aaf76c2]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [1m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [1m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Creation complete after 1m34s [id=urn:vcloud:firewallGroup:71512aa5-a6c4-415b-93e6-c0e23330c0c7]

The first one is pretty fast (14s, still slow compared to creating it in the GUI or with the Python script) but the last one is finished only after 1m34s. When creating 100 IP Sets at a time with Terraform, we see the duration increasing until it stabilizes around 2m10s.
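
One way to make the serialization visible is to look at the gaps between consecutive "Creation complete" times rather than at the per-resource durations. The sketch below parses the five completion lines from the apply output above (with the id suffix trimmed). The helper names are made up for illustration, and the pattern only handles the "XmYs" and "Ys" formats that appear in this thread.

```python
import re
from typing import List

# Matches "Creation complete after 14s" and "Creation complete after 1m34s".
_DONE = re.compile(r"Creation complete after (?:(\d+)m)?(\d+)s")


def completion_seconds(log: str) -> List[int]:
    """Return the sorted completion times, in seconds, found in apply output."""
    return sorted(int(m or 0) * 60 + int(s) for m, s in _DONE.findall(log))


def gaps(times: List[int]) -> List[int]:
    """Differences between consecutive completion times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


LOG = """
vcd_nsxt_ip_set.ip_sets["TEST3"]: Creation complete after 14s
vcd_nsxt_ip_set.ip_sets["TEST4"]: Creation complete after 24s
vcd_nsxt_ip_set.ip_sets["TEST5"]: Creation complete after 48s
vcd_nsxt_ip_set.ip_sets["TEST2"]: Creation complete after 1m11s
vcd_nsxt_ip_set.ip_sets["TEST1"]: Creation complete after 1m34s
"""

if __name__ == "__main__":
    times = completion_seconds(LOG)
    print(times)        # [14, 24, 48, 71, 94]
    print(gaps(times))  # [10, 24, 23, 23]
```

The completions arrive one at a time, 10 to 24 seconds apart, which is what one would expect if only one creation runs at any moment and the others wait their turn.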

I have not written much Go, but in go-vcloud-director.log I see that when the creation of the IP Set named TEST2 starts, it waits for other asynchronous tasks to finish before actually creating the IP Set:

2024/03/12 16:30:52 [DEBUG] Checking if elevated API versions are defined for endpoint '1.0.0/firewallGroups/'
2024/03/12 16:30:52 [TRACE] skipping fetch of versions because 12 are stored
2024/03/12 16:30:52 [TRACE] checking max API version against constraints '< 34.0'
2024/03/12 16:30:52 [TRACE] API version 38.1.0 does not satisfy constraints '< 34.0'
2024/03/12 16:30:52 [TRACE] checking current API version against constraints '> 34.0'
2024/03/12 16:30:52 [INFO] API version 37.0.0 satisfies constraints '> 34.0'
2024/03/12 16:30:52 [DEBUG] Found '1' (36.0) elevated API versions for endpoint '1.0.0/firewallGroups/' 
2024/03/12 16:30:52 [DEBUG] Checking if elevated version '36.0' is supported by VCD instance for endpoint '1.0.0/firewallGroups/'
2024/03/12 16:30:52 [TRACE] skipping fetch of versions because 12 are stored
2024/03/12 16:30:52 [TRACE] checking max API version against constraints '>= 36.0'
2024/03/12 16:30:52 [INFO] API version 38.1.0 satisfies constraints '>= 36.0'
2024/03/12 16:30:52 [DEBUG] Elevated version '36.0' is supported by VCD instance for endpoint '1.0.0/firewallGroups/'
2024/03/12 16:30:52 [DEBUG] Will use elevated version '36.0 for endpoint '1.0.0/firewallGroups/'
2024/03/12 16:30:52 [TRACE] Posting *types.NsxtFirewallGroup item to endpoint https://test.cloud/cloudapi/1.0.0/firewallGroups/ with expected response of type *types.NsxtFirewallGroup
2024/03/12 16:30:52 [TRACE] skipping fetch of versions because 12 are stored
2024/03/12 16:30:52 [TRACE] checking max API version against constraints '>= 31'
2024/03/12 16:30:52 [INFO] API version 38.1.0 satisfies constraints '>= 31'
2024/03/12 16:30:52 --------------------------------------------------------------------------------
2024/03/12 16:30:52 Request caller: schema.(*Resource).create-->vcd.resourceVcdNsxtIpSetCreate-->vcd.resourceVcdNsxtIpSetCreate-->govcd.(*Client).OpenApiPostItem-->govcd.createNsxtFirewallGroup-->govcd.(*Client).OpenApiPostItemAndGetHeaders-->govcd.(*Client).openApiPerformPostPut-->govcd.(*Client).newOpenApiRequest
2024/03/12 16:30:52 POST https://test.cloud/cloudapi/1.0.0/firewallGroups/
2024/03/12 16:30:52 --------------------------------------------------------------------------------
2024/03/12 16:30:52 Req header:
2024/03/12 16:30:52     X-Vmware-Vcloud-Access-Token: [********]
2024/03/12 16:30:52     Authorization: [********]
2024/03/12 16:30:52     X-Vmware-Vcloud-Token-Type: [Bearer]
2024/03/12 16:30:52     Accept: [application/json;version=36.0]
2024/03/12 16:30:52     Content-Type: [application/json]
2024/03/12 16:30:52     User-Agent: [terraform-provider-vcd/v3.11.0 (darwin/amd64; isProvider:false)]
2024/03/12 16:30:52 Request data: [196]
{
  "name": "TEST2",
  "description": "",
  "ipAddresses": [
    "172.30.102.24"
  ],
  "ownerRef": {
    "id": "urn:vcloud:vdcGroup:eb41e47c-a80b-47b3-90ef-15cf27ca9187"
  },
  "type": "IP_SET"
}
2024/03/12 16:30:52 [TRACE] Asynchronous task detected, tracking task with HREF: https://test.cloud/api/task/2228889e-1c73-442b-a409-f4a7d21589d2
2024/03/12 16:30:52 --------------------------------------------------------------------------------
2024/03/12 16:30:52 Request caller: govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitInspectTaskCompletion-->govcd.(*Task).Refresh-->govcd.(*Client).newRequest
2024/03/12 16:30:52 GET https://test.cloud/api/task/2228889e-1c73-442b-a409-f4a7d21589d2
2024/03/12 16:30:52 --------------------------------------------------------------------------------
2024/03/12 16:30:52 Req header:
2024/03/12 16:30:52     User-Agent: [terraform-provider-vcd/v3.11.0 (darwin/amd64; isProvider:false)]
2024/03/12 16:30:52     X-Vmware-Vcloud-Access-Token: [********]
2024/03/12 16:30:52     Accept: [application/*+xml;version=37.0]
2024/03/12 16:30:52     X-Vmware-Vcloud-Token-Type: [Bearer]
2024/03/12 16:30:52     Authorization: [********]
2024/03/12 16:30:52 ################################################################################
2024/03/12 16:30:52 Response caller vcd.resourceVcdNsxtIpSetCreate-->govcd.(*Client).OpenApiPostItem-->govcd.createNsxtFirewallGroup-->govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitInspectTaskCompletion-->govcd.(*Task).Refresh-->govcd.decodeBody
2024/03/12 16:30:52 Response status 200 OK
2024/03/12 16:30:52 ################################################################################
2024/03/12 16:30:52 Response header:
2024/03/12 16:30:52     X-Vmware-Vcloud-Request-Execution-Time: [27]
2024/03/12 16:30:52     Content-Type: [application/vnd.vmware.vcloud.task+xml;version=37.0]
2024/03/12 16:30:52     Vary: [Accept-Encoding]
2024/03/12 16:30:52     X-Vmware-Vcloud-Ceip-Id: [6a329f64-1f47-45f9-998d-10f8fb5fb05e]
2024/03/12 16:30:52     X-Vmware-Vcloud-Request-Id: [ec2dd1c5-9c41-4713-ad70-dea34b57ee4f]
2024/03/12 16:30:52     Set-Cookie: [SERVERID=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/]
2024/03/12 16:30:52     Cache-Control: [no-store, must-revalidate]
2024/03/12 16:30:52     Content-Length: [1679]
2024/03/12 16:30:52     Date: [Tue, 12 Mar 2024 15:30:52 GMT]
2024/03/12 16:30:52 Response text: [1679]
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Task xmlns="http://www.vmware.com/vcloud/v1.5" xmlns:vmext="http://www.vmware.com/vcloud/extension/v1.5" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:common="http://schemas.dmtf.org/wbem/wscim/1/common" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vmw="http://www.vmware.com/schema/ovf" xmlns:ovfenv="http://schemas.dmtf.org/ovf/environment/1" xmlns:ns9="http://www.vmware.com/vcloud/versions" status="queued" operation="Creating Firewall Group TEST2(363d9071-e871-43f6-b837-bf913aaf76c2)" operationName="createFirewallGroup" serviceNamespace="com.vmware.vcloud" startTime="2024-03-12T15:30:52.088Z" expiryTime="2024-06-10T15:30:52.088Z" cancelRequested="false" name="task" id="urn:vcloud:task:2228889e-1c73-442b-a409-f4a7d21589d2" href="https://test.cloud/api/task/2228889e-1c73-442b-a409-f4a7d21589d2" type="application/vnd.vmware.vcloud.task+xml">
    <Owner href="" id="urn:vcloud:firewallGroup:363d9071-e871-43f6-b837-bf913aaf76c2" type="application/json" name="TEST2"/>
    <User href="https://test.cloud/api/admin/user/7f823373-34f9-442e-b2e9-07e48f3d423b" id="urn:vcloud:user:7f823373-34f9-442e-b2e9-07e48f3d423b" type="application/vnd.vmware.admin.user+xml" name=""/>
    <Organization href="https://test.cloud/api/org/9c096120-d6b6-4396-8ea3-86038144e640" id="urn:vcloud:org:9c096120-d6b6-4396-8ea3-86038144e640" type="application/vnd.vmware.vcloud.org+xml" name=""/>
    <Details></Details>
</Task>

2024/03/12 16:30:55 --------------------------------------------------------------------------------
2024/03/12 16:30:55 Request caller: govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitInspectTaskCompletion-->govcd.(*Task).Refresh-->govcd.(*Client).newRequest
2024/03/12 16:30:55 GET https://test.cloud/api/task/2228889e-1c73-442b-a409-f4a7d21589d2
2024/03/12 16:30:55 --------------------------------------------------------------------------------
2024/03/12 16:30:55 Req header:
2024/03/12 16:30:55     X-Vmware-Vcloud-Access-Token: [********]
2024/03/12 16:30:55     Accept: [application/*+xml;version=37.0]
2024/03/12 16:30:55     X-Vmware-Vcloud-Token-Type: [Bearer]
2024/03/12 16:30:55     Authorization: [********]
2024/03/12 16:30:55     User-Agent: [terraform-provider-vcd/v3.11.0 (darwin/amd64; isProvider:false)]
2024/03/12 16:30:55 ################################################################################
2024/03/12 16:30:55 Response caller vcd.resourceVcdNsxtIpSetCreate-->govcd.(*Client).OpenApiPostItem-->govcd.createNsxtFirewallGroup-->govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitTaskCompletion-->govcd.(*Task).WaitInspectTaskCompletion-->govcd.(*Task).Refresh-->govcd.decodeBody
2024/03/12 16:30:55 Response status 200 OK
2024/03/12 16:30:55 ################################################################################
2024/03/12 16:30:55 Response header:
2024/03/12 16:30:55     Cache-Control: [no-store, must-revalidate]
2024/03/12 16:30:55     Vary: [Accept-Encoding]
2024/03/12 16:30:55     X-Vmware-Vcloud-Ceip-Id: [6a329f64-1f47-45f9-998d-10f8fb5fb05e]
2024/03/12 16:30:55     Date: [Tue, 12 Mar 2024 15:30:55 GMT]
2024/03/12 16:30:55     X-Vmware-Vcloud-Request-Execution-Time: [19]
2024/03/12 16:30:55     X-Vmware-Vcloud-Request-Id: [14fa9090-624d-464e-8690-2b399bd51f79]
2024/03/12 16:30:55     Content-Type: [application/vnd.vmware.vcloud.task+xml;version=37.0]
2024/03/12 16:30:55     Content-Length: [1802]
2024/03/12 16:30:55     Set-Cookie: [SERVERID=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/]
2024/03/12 16:30:55 Response text: [1802]
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Task xmlns="http://www.vmware.com/vcloud/v1.5" xmlns:vmext="http://www.vmware.com/vcloud/extension/v1.5" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:common="http://schemas.dmtf.org/wbem/wscim/1/common" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vmw="http://www.vmware.com/schema/ovf" xmlns:ovfenv="http://schemas.dmtf.org/ovf/environment/1" xmlns:ns9="http://www.vmware.com/vcloud/versions" status="running" operation="Creating Firewall Group TEST2(363d9071-e871-43f6-b837-bf913aaf76c2)" operationName="createFirewallGroup" serviceNamespace="com.vmware.vcloud" startTime="2024-03-12T15:30:52.088Z" expiryTime="2024-06-10T15:30:52.088Z" cancelRequested="false" name="task" id="urn:vcloud:task:2228889e-1c73-442b-a409-f4a7d21589d2" href="https://test.cloud/api/task/2228889e-1c73-442b-a409-f4a7d21589d2" type="application/vnd.vmware.vcloud.task+xml">
    <Link rel="task:cancel" href="https://test.cloud/api/task/2228889e-1c73-442b-a409-f4a7d21589d2/action/cancel"/>
    <Owner href="" id="urn:vcloud:firewallGroup:363d9071-e871-43f6-b837-bf913aaf76c2" type="application/json" name="TEST2"/>
    <User href="https://test.cloud/api/admin/user/7f823373-34f9-442e-b2e9-07e48f3d423b" id="urn:vcloud:user:7f823373-34f9-442e-b2e9-07e48f3d423b" type="application/vnd.vmware.admin.user+xml" name=""/>
    <Organization href="https://test.cloud/api/org/9c096120-d6b6-4396-8ea3-86038144e640" id="urn:vcloud:org:9c096120-d6b6-4396-8ea3-86038144e640" type="application/vnd.vmware.vcloud.org+xml" name=""/>
    <Details></Details>
</Task>

This is also reflected in the task logs in the VCD GUI (the creation of one IP Set is started, and after it finishes there is a wait of multiple seconds before the next one is created): [screenshot: Screenshot 2024-03-12 at 16 50 12]

Maybe you can see a reason for the continually increasing amount of time it takes to create each new IP Set after the first few are created. I cannot send the entire log file because it contains too much sensitive information. What I can tell you is that there are no errors of any kind in the log file.

Looking at these logs, it is clear to me that the way the creation of IP Sets is implemented in this provider causes only one task to be started at a time instead of, let's say, 5 at a time (which VCD can handle pretty well). I also tested the deletion of 560 IP Sets at the same time, using a Python script that makes one API delete call per IP Set, and that also finished fast and without any errors!

Let me know what you think and if it is possible to rewrite the Terraform provider code so that it creates multiple IP Sets at a time (like the Python code did).

Thanks for your time again.

EDIT: I just ran a Terraform apply with 10 IP Sets to show you that the creation time increases to over 2 minutes:

Changes to Outputs:
  ~ ids = {
      + TEST1                                        = (known after apply)
      + TEST10                                       = (known after apply)
      + TEST2                                        = (known after apply)
      + TEST3                                        = (known after apply)
      + TEST4                                        = (known after apply)
      + TEST5                                        = (known after apply)
      + TEST6                                        = (known after apply)
      + TEST7                                        = (known after apply)
      + TEST8                                        = (known after apply)
      + TEST9                                        = (known after apply)
        # (43 unchanged attributes hidden)
    }

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

vcd_nsxt_ip_set.ip_sets["TEST5"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST7"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST2"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST3"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST10"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST6"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST4"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST1"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST9"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST8"]: Creating...
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST10"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST6"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST6"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST10"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST10"]: Creation complete after 20s [id=urn:vcloud:firewallGroup:abc0efe5-88c7-4596-932b-30c9d1f14e9a]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST6"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST2"]: Creation complete after 36s [id=urn:vcloud:firewallGroup:1759973f-c1f2-46d3-a227-8ede170653ce]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST6"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST6"]: Creation complete after 46s [id=urn:vcloud:firewallGroup:e80d00f4-4df5-490f-832e-8d22ba717d8f]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [1m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST5"]: Creation complete after 1m2s [id=urn:vcloud:firewallGroup:47434cec-2886-4ba3-a670-187093e4f307]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [1m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST1"]: Creation complete after 1m18s [id=urn:vcloud:firewallGroup:816a91c7-7079-4908-8f82-0c73b1a0c0ff]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [1m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [1m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [1m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [1m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [1m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [1m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [1m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [1m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [1m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Still creating... [1m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST8"]: Creation complete after 1m38s [id=urn:vcloud:firewallGroup:ee829527-8ce5-46ca-9c3d-bdd872e07a9c]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [1m40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [1m40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [1m40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [1m40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [1m50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [1m50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [1m50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Still creating... [1m50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST4"]: Creation complete after 1m55s [id=urn:vcloud:firewallGroup:a280f386-5e7a-4b6a-a092-83e8ea4a11b9]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [2m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [2m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [2m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Still creating... [2m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [2m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [2m10s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST7"]: Creation complete after 2m18s [id=urn:vcloud:firewallGroup:e6183bfe-0730-40be-9ed7-3868cb30c44e]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [2m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [2m20s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [2m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [2m30s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [2m40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Still creating... [2m40s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST3"]: Creation complete after 2m40s [id=urn:vcloud:firewallGroup:2ed3d7ad-9e2e-48dd-92d1-b384152b1872]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [2m50s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Still creating... [3m0s elapsed]
vcd_nsxt_ip_set.ip_sets["TEST9"]: Creation complete after 3m3s [id=urn:vcloud:firewallGroup:46b8df7a-06f8-4ba7-9429-a2d39775f037]

Apply complete! Resources: 10 added, 0 changed, 0 destroyed.

Didainius commented 4 months ago

I see what you mean. Yes, at the moment it does create them one by one (although Terraform tries to run more operations in parallel, 10 by default, I think), but there is a complication.

The task that you see must be tracked until it is finished. Only after the task has finished is the entity actually created.
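
As an illustration of what tracking a task until it is finished involves, here is a minimal polling sketch. It is not the provider's Go code. The function name, the injected fetch_status callable and the set of terminal statuses are assumptions; only "queued" and "running" are taken from the logs earlier in this thread, and the 3 second default mirrors the spacing of the two task GETs shown there.

```python
import time
from typing import Callable

# Assumption: statuses that mean the task has stopped. "queued" and "running"
# appear in the logs above; the terminal names listed here are illustrative.
TERMINAL = {"success", "error", "aborted", "canceled"}


def wait_for_task(
    fetch_status: Callable[[], str],
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout}s")
        sleep(interval)


if __name__ == "__main__":
    # Simulated task that is queued, then running, then done.
    replies = iter(["queued", "running", "success"])
    print(wait_for_task(lambda: next(replies), sleep=lambda _: None))
```

Because each resource instance sits in a loop like this until its own task ends, the time Terraform reports per resource includes any time spent waiting before the task could even start.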

Some brief background on the provider code: when developing the provider, we only have information about one particular resource at a time. If you create 100 IP Sets, the code that handles them runs as 100 instances with no exchange mechanism between them, so each parallel thread knows about exactly one item (itself). The same goes for any other resources: each only has information about its own instance (e.g. if there is a firewall resource and an IP Set resource, the code has no access to that combination).

I will double-check once I have a spare env (in a day or so), but the problem was this: it used to be that the Edge Gateway (or the VDC Group, if the Edge Gateway belongs to one) could not accept any other operation while it had a running task. The lock on the parent Edge Gateway is needed so that a user creating a huge list of entries doesn't hit "entity busy" errors (I agree it is not optimal, but that is how VCD behaves). And it wasn't only about IP Sets: it applied to pretty much any operation that belongs to an Edge Gateway.

For reference, here is the code that handles the locks: https://github.com/vmware/terraform-provider-vcd/blob/main/vcd/resource_vcd_nsxt_ip_set.go#L81-L89
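
To illustrate the idea of a lock on the parent (this is a Python sketch, not the provider's Go implementation, and all names in it are hypothetical): every operation takes a lock keyed by the parent's ID, so operations on the same Edge Gateway or VDC Group run one at a time, while operations on different parents do not block each other.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager

_registry_guard = threading.Lock()
_locks = defaultdict(threading.Lock)  # one lock per parent ID


@contextmanager
def parent_lock(parent_id: str):
    """Hold the lock of one parent (Edge Gateway or VDC Group) for the duration of the block."""
    with _registry_guard:  # protects the dictionary itself
        lock = _locks[parent_id]
    with lock:
        yield


def peak_concurrency(parent_ids: list) -> int:
    """Run one worker per entry and report the most workers inside a lock at once."""
    state = {"active": 0, "peak": 0}
    counter_guard = threading.Lock()

    def worker(parent_id: str) -> None:
        with parent_lock(parent_id):
            with counter_guard:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.05)  # stands in for the POST plus waiting on the task
            with counter_guard:
                state["active"] -= 1

    threads = [threading.Thread(target=worker, args=(pid,)) for pid in parent_ids]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"]


if __name__ == "__main__":
    # Four workers, same parent: never more than one inside the lock.
    print(peak_concurrency(["vdc-group-1"] * 4))
    # Different parents can overlap, so this is typically greater than 1.
    print(peak_concurrency(["gw-1", "gw-2", "gw-3"]))
```

With all IP Sets in this thread sharing one VDC Group as parent, a scheme like this would reduce the default Terraform parallelism of 10 to one creation at a time, which matches the apply output above.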

BTW, are you creating IP Sets in a VDC Group or a "standalone" Edge Gateway?

Nielszy commented 4 months ago

@Didainius Thanks for the explanation. I understand that Terraform only has information about each instance of a specific resource. It looks like the following is no longer the case: "it used to be that the Edge Gateway (or a VDC Group if Edge Gateway belongs to one) cannot accept any other operations while it has a running task."

Let me know what you find once you have access to the spare env! I'm curious if we can speed up deploying things by running things in a more parallel way. Thanks again for looking.

EDIT: I am creating the IP Sets on an Edge Gateway that belongs to a VDC Group.

Nielszy commented 3 months ago

@Didainius Hi! Could you maybe share some findings from your tests in the spare environment? Kind regards, Niels

Didainius commented 3 months ago

> @Didainius Hi! Could you maybe share some findings about your tests in the spare environment? Kind regards Niels

Hello, sorry it took quite some time. I had to check with the internal VCD team to be sure it doesn't cause issues. It looks like this locking can be lifted, so we should be able to remove these locks.

Nielszy commented 3 months ago

Hi @Didainius, no problem, thanks for checking and clarifying! It would be a great improvement for speed and scalability if the locking mechanism were lifted in a future release of the VCD provider!