vmware / terraform-provider-vcd

Terraform VMware Cloud Director provider
https://www.terraform.io/docs/providers/vcd/
Mozilla Public License 2.0

vcd_vapp: Wait for completion also if no guest_properties set. #649

Open NilsBusche opened 3 years ago

NilsBusche commented 3 years ago

Terraform Version

Terraform v0.13.5

Affected Resource(s)

vcd_vapp

Terraform Configuration Files

[...]

resource "vcd_vapp" "vapp" {
  name = var.hostname
}

resource "vcd_vapp_access_control" "vappacl" {
  vapp_id               = vcd_vapp.vapp.id
  shared_with_everyone  = true
  everyone_access_level = "ReadOnly"
}

resource "vcd_vapp_org_network" "vappnetattach" {
  vapp_name        = vcd_vapp.vapp.id == "always-not-equal" ? null : var.hostname
  org_network_name = var.network
}

resource "vcd_vapp_org_network" "vappbackupnetattach" {
  vapp_name        = vcd_vapp.vapp.id == "always-not-equal" ? null : var.hostname
  org_network_name = var.backup_network
}

[...]

Debug Output

I have added some debug lines into the provider in resource_vcd_vapp.go:

        vappstatus, err := vapp.GetStatus()
        log.Printf("[TRACE] State before if: %s", vappstatus)
        if _, ok := d.GetOk("guest_properties"); ok {

                vappstatus, err = vapp.GetStatus()
                log.Printf("[TRACE] State before blockwhilestatus: %s", vappstatus)

                // Even though vApp has a task and waits for its completion it happens that it is not ready
                // for operation just after provisioning therefore we wait for it to exit UNRESOLVED state
                err = vapp.BlockWhileStatus("UNRESOLVED", vcdClient.MaxRetryTimeout)
                if err != nil {
                        return fmt.Errorf("timed out waiting for vApp to exit UNRESOLVED state: %s", err)
                }

                vappstatus, err = vapp.GetStatus()
                log.Printf("[TRACE] State after blockwhilestatus: %s", vappstatus)

                guestProperties, err := getGuestProperties(d)
                if err != nil {
                        return fmt.Errorf("unable to convert guest properties to data structure")
                }

                log.Printf("[TRACE] Setting vApp guest properties")
                _, err = vapp.SetProductSectionList(guestProperties)
                if err != nil {
                        return fmt.Errorf("error setting guest properties: %s", err)
                }
        }

Terraform Trace Output without guest_properties set:

2021-03-02T12:23:34.911+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:23:34 [TRACE] State before if: UNRESOLVED
2021-03-02T12:23:35.766+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:23:35 [TRACE] Setting empty properties into statefile because no properties were specified
2021-03-02T12:23:35.939+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:23:35 [DEBUG] Unlocking "xxxxxxxxxxxxxxxxxxxxx"
2021-03-02T12:23:35.939+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:23:35 [DEBUG] Unlocked "xxxxxxxxxxxxxxxxxxxxx"
2021/03/02 12:23:35 [WARN] Provider "registry.terraform.io/vmware/vcd" produced an unexpected new value for vcd_vapp.vapp, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .description: was null, but now cty.StringVal("")
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalMaybeTainted
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalWriteState
2021/03/02 12:23:35 [TRACE] EvalWriteState: recording 0 dependencies for vcd_vapp.vapp
2021/03/02 12:23:35 [TRACE] EvalWriteState: writing current state object for vcd_vapp.vapp
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalApplyProvisioners
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalMaybeTainted
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalWriteState
2021/03/02 12:23:35 [TRACE] EvalWriteState: recording 0 dependencies for vcd_vapp.vapp
2021/03/02 12:23:35 [TRACE] EvalWriteState: writing current state object for vcd_vapp.vapp
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalIf
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalIf
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalWriteDiff
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalApplyPost
2021/03/02 12:23:35 [TRACE] eval: *terraform.EvalUpdateStateHook
vcd_vapp.vapp: Creation complete after 2s [id=urn:vcloud:vapp:xxxxxxxxxxxxxxxxxxxxxxxx]

Terraform Trace Output with guest_properties set:

2021-03-02T12:17:27.601+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:27 [TRACE] State before if: RESOLVED
2021-03-02T12:17:27.661+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:27 [TRACE] State before blockwhilestatus: RESOLVED
2021-03-02T12:17:28.062+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:28 [TRACE] State after blockwhilestatus: RESOLVED
2021-03-02T12:17:28.062+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:28 [TRACE] Adding guest property: key=key1, value=value1 to object
2021-03-02T12:17:28.062+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:28 [TRACE] Setting vApp guest properties
2021/03/02 12:17:31 [TRACE] dag/walk: vertex "root" is waiting for "meta.count-boundary (EachMode fixup)"
2021/03/02 12:17:31 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/vmware/vcd\"] (close)" is waiting for "vcd_vapp_org_network.vappbackupnetattach"
2021/03/02 12:17:31 [TRACE] dag/walk: vertex "vcd_vapp_org_network.vappbackupnetattach" is waiting for "vcd_vapp.vapp"
2021/03/02 12:17:31 [TRACE] dag/walk: vertex "vcd_vapp_org_network.vappnetattach" is waiting for "vcd_vapp.vapp"
2021/03/02 12:17:31 [TRACE] dag/walk: vertex "meta.count-boundary (EachMode fixup)" is waiting for "vcd_vapp_access_control.vappacl"
2021/03/02 12:17:31 [TRACE] dag/walk: vertex "vcd_vapp_access_control.vappacl" is waiting for "vcd_vapp.vapp"
2021-03-02T12:17:31.971+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:31 [TRACE] Adding guest property: key=key1, value=value1 to object
2021-03-02T12:17:31.971+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:31 [TRACE] Updating vApp guest properties
2021-03-02T12:17:35.859+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:35 [TRACE] Setting empty properties into statefile because no properties were specified
2021-03-02T12:17:35.859+0100 [DEBUG] plugin.terraform-provider-vcd_v3.1.0: 2021/03/02 12:17:35 [TRACE] Setting properties into statefile
vcd_vapp.vapp: Creation complete after 9s [id=urn:vcloud:xxxxxxxxxxxxxxxxxxxxxxxx]

Expected Behavior

Terraform should always wait until vApp creation has completed.

Actual Behavior

If the optional guest_properties are not set, Terraform does not wait for vApp creation to complete. In some cases, the next steps that depend on a successfully created vApp therefore start while creation is still in progress and fail with errors like this:

Error: error creating vApp org network. &errors.errorString{s:"error updating vApp Network: API Error: 400: [ xxxxxxxxxxxxxxxxxxxxxx ] The entity Ref: com.vmware.vcloud.entity.vapp:xxxxxxxxxxxxxxxx is busy completing an operation VDC_COMPOSE_VAPP. VDC_COMPOSE_VAPP(com.vmware.vcloud.entity.task:xxxxxxxxxxxxxxxxxxxxxxxx)"} 

Steps to Reproduce

  1. Create configuration as described above
  2. terraform apply
  3. Add guest_properties section to the configuration
  4. terraform destroy
  5. terraform apply

Possible solution

Move the "wait" block out of the if-condition for guest_properties here: https://github.com/vmware/terraform-provider-vcd/blob/75876b5986d45908a78c1c2acfde29a48c82618a/vcd/resource_vcd_vapp.go#L118
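
A minimal sketch of the proposed change, not the provider's actual code. It assumes a hypothetical `vAppClient` interface standing in for the provider's vApp handle: `BlockWhileStatus` follows the call in the debug snippet above with a simplified signature, while `SetGuestProperties`, `finishVAppCreate`, and `fakeVApp` are names invented for illustration. The wait runs unconditionally; only the guest-property call remains conditional.

```go
package main

import (
	"fmt"
)

// vAppClient is a hypothetical stand-in for the provider's vApp handle.
// The signatures are simplified assumptions, not the real library API.
type vAppClient interface {
	BlockWhileStatus(unwantedStatus string, timeoutSeconds int) error
	SetGuestProperties(props map[string]string) error
}

// finishVAppCreate always waits for the vApp to leave UNRESOLVED and only
// then applies guest properties, if any were configured.
func finishVAppCreate(vapp vAppClient, props map[string]string, timeoutSeconds int) error {
	if err := vapp.BlockWhileStatus("UNRESOLVED", timeoutSeconds); err != nil {
		return fmt.Errorf("timed out waiting for vApp to exit UNRESOLVED state: %w", err)
	}
	if len(props) == 0 {
		return nil // nothing else to do, but the wait has already happened
	}
	if err := vapp.SetGuestProperties(props); err != nil {
		return fmt.Errorf("error setting guest properties: %w", err)
	}
	return nil
}

// fakeVApp records calls so the ordering can be inspected without a VCD instance.
type fakeVApp struct{ calls []string }

func (f *fakeVApp) BlockWhileStatus(unwantedStatus string, timeoutSeconds int) error {
	f.calls = append(f.calls, "wait:"+unwantedStatus)
	return nil
}

func (f *fakeVApp) SetGuestProperties(props map[string]string) error {
	f.calls = append(f.calls, "set-properties")
	return nil
}

func main() {
	noProps := &fakeVApp{}
	_ = finishVAppCreate(noProps, nil, 60)
	fmt.Println(noProps.calls) // the wait is recorded even without guest properties

	withProps := &fakeVApp{}
	_ = finishVAppCreate(withProps, map[string]string{"key1": "value1"}, 60)
	fmt.Println(withProps.calls)
}
```

With this shape, a vApp created without guest_properties would still block until it leaves UNRESOLVED before dependent resources are started.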

jpbuecken commented 3 years ago

Any news on this one? This is still an issue with VCD 10.3.

This breaks our automated VM deployments.

vcd_vapp.vapp: Creation complete after 2s [id=urn:vcloud:vapp:fee69b93-6fa7-457c-87b4-fd3d36358081]
vcd_vapp_org_network.vappnetattach: Creating...
vcd_vapp_access_control.vappacl: Creating...
vcd_vapp_org_network.vappbackupnetattach[0]: Creating...
vcd_vapp_org_network.vappnetattach: Creation complete after 5s [id=urn:vcloud:network:f11b5312-a231-4ebb-ad85-e887049f1198]
vcd_vapp_access_control.vappacl: Creation complete after 6s [id=urn:vcloud:vapp:fee69b93-6fa7-457c-87b4-fd3d36358081]

Error: error creating vApp org network. &errors.errorString{s:"error updating vApp Network: API Error: 400: [ 4ea08867-49e1-4b74-a76b-676a199f6ab2 ] The entity Ref: com.vmware.vcloud.entity.vapp:fee69b93-6fa7-457c-87b4-fd3d36358081 is busy completing an operation VDC_COMPOSE_VAPP. VDC_COMPOSE_VAPP(com.vmware.vcloud.entity.task:b329b049-8133-4b22-aa17-ff4616964584)"}

Edit: It looks like the issue occurs more often if you attach two vApp networks. My case uses vcd_vapp_org_network twice and a vcd_vapp_access_control.

dataclouder commented 3 years ago

Reproducing this issue is quite difficult. I was able to do it by adding yet another network to the configuration, and running apply/destroy in a loop. After 18 iterations, it failed. We'll look into it.

dataclouder commented 3 years ago

@NilsBusche does the fix you recommend (moving the wait block outside the guest property if) solve the issue for you?

jpbuecken commented 3 years ago

Hello @dataclouder: Since you were able to reproduce the issue with more vcd_vapp_org_network resources, I have another theory:

We discussed in the past that "vCd allows only one action per vApp" [1]. Maybe this applies to vcd_vapp_org_network as well? Maybe a vcd_vapp_org_network operation makes the vApp busy? Does the provider make sure that only one vcd_vapp_org_network action is performed at a time rather than several in parallel (similar to vapp_vm)?

[1] https://github.com/vmware/terraform-provider-vcd/issues/507#issuecomment-629935300

dataclouder commented 3 years ago

@jpbuecken Thanks for your analysis. This could be one of the contributing causes. However, all functions to create or modify vApp networks are properly locking access to the parent vApp.

There is a line of investigation that I can pursue, however. In vApp, there is a lock on Creation (which invokes Update), but not on Update itself. This is something that we must check carefully, because it can create issues, though not in this specific test case, where the vApp must be properly created before other resources can start being processed. It's more material to look into, with plenty of hot spots that could be the cause of this issue. We are looking into it.
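
For readers unfamiliar with the idea of locking access to a parent vApp, the following is a generic sketch of a per-key mutex. The provider's actual locking code is not shown in this thread, so `keyedMutex`, `lock`, and `maxConcurrent` are hypothetical names used purely for illustration. Operations that share a vApp identifier are serialized, while operations on different vApps do not block each other.

```go
package main

import (
	"fmt"
	"sync"
)

// keyedMutex hands out one mutex per key (here: a vApp identifier).
// Hypothetical helper for illustration; not the provider's lock type.
type keyedMutex struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyedMutex() *keyedMutex {
	return &keyedMutex{locks: make(map[string]*sync.Mutex)}
}

// lock blocks until the mutex for key is held and returns its unlock function.
func (k *keyedMutex) lock(key string) (unlock func()) {
	k.mu.Lock()
	m, ok := k.locks[key]
	if !ok {
		m = &sync.Mutex{}
		k.locks[key] = m
	}
	k.mu.Unlock()
	m.Lock()
	return m.Unlock
}

// maxConcurrent starts n simulated operations against one vApp in parallel
// and reports the highest number that held the lock at the same time.
func maxConcurrent(km *keyedMutex, vappID string, n int) int {
	var wg sync.WaitGroup
	var counter sync.Mutex
	active, peak := 0, 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			unlock := km.lock(vappID)
			defer unlock()

			counter.Lock()
			active++
			if active > peak {
				peak = active
			}
			counter.Unlock()

			// A real operation (for example attaching a network) would run here.

			counter.Lock()
			active--
			counter.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	km := newKeyedMutex()
	fmt.Println("peak concurrent operations on one vApp:", maxConcurrent(km, "vapp-1", 50))
}
```

A lock of this kind only orders operations issued by the provider itself. It does not by itself wait for a server-side task such as VDC_COMPOSE_VAPP to finish, which is why an explicit wait on the vApp status may still be needed.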

MarioAlexis commented 2 years ago

Any update on this? We are facing the exact same issue on vCD version 10.3.3.19610595. It happens after attaching 2 vcd_vapp_org_network resources inside a Terraform for_each loop.

FYI: This issue wasn't present on version 9.7.0 (9.7.0.14534864).

jpbuecken commented 2 years ago

Hello, I was monitoring https://github.com/hashicorp/terraform-plugin-sdk/issues/67 for a workaround. The idea was that I could set a parallelism of 1 for the resources that modify the vApp (vcd_vapp_access_control, vcd_vapp_org_network).

They mentioned a solution, but it has to be implemented in the provider, not in terraform core: https://github.com/hashicorp/terraform-plugin-sdk/issues/67#issuecomment-1170071965

But I'm sure you are already aware of this?