hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Operation of dependencies between resources in different workspaces #17021

Open ionosphere80 opened 6 years ago

ionosphere80 commented 6 years ago

Terraform Version

Terraform v0.11.1
+ provider.vsphere v1.1.1

Terraform Configuration Files

resource "vsphere_virtual_machine" "tf-test1" {
  resource_pool_id = "${data.vsphere_resource_pool.pool.id}"
  datastore_id = "${data.vsphere_datastore.datastore.id}"
  folder = "${vsphere_folder.tf-test.path}"
  name = "${format("tf-test1-%01d", count.index)}"
  count = "${terraform.workspace == "test1" ? 2 : 0}"
  num_cpus  = "${var.vm_medium["num_cpus"]}"
  memory = "${var.vm_medium["memory"]}"
  guest_id = "ubuntu64Guest"
  scsi_type = "pvscsi"
  disk {
    name = "${format("tf-test1-%01d.vmdk", count.index)}"
    size = "${data.vsphere_virtual_machine.template-ubuntu-1604.disks.0.size}"
  }
  network_interface {
    network_id   = "${data.vsphere_network.dev.id}"
  }
  clone {
    template_uuid = "${data.vsphere_virtual_machine.template-ubuntu-1604.id}"
    customize {
      linux_options {
        host_name = "${format("tf-test1-%01d", count.index)}"
        domain    = "local"
      }
      network_interface {
      }
    }
  }
}

resource "vsphere_virtual_machine" "tf-test2" {
  resource_pool_id = "${data.vsphere_resource_pool.pool.id}"
  datastore_id = "${data.vsphere_datastore.datastore.id}"
  folder = "${vsphere_folder.tf-test.path}"
  name = "${format("tf-test2-%01d", count.index)}"
  count = "${terraform.workspace == "test2" ? 2 : 0}"
  num_cpus  = "${var.vm_medium["num_cpus"]}"
  memory = "${var.vm_medium["memory"]}"
  guest_id = "ubuntu64Guest"
  scsi_type = "pvscsi"
  disk {
    name = "${format("tf-test2-%01d.vmdk", count.index)}"
    size = "${data.vsphere_virtual_machine.template-ubuntu-1604.disks.0.size}"
  }
  network_interface {
    network_id   = "${data.vsphere_network.dev.id}"
  }
  depends_on = [ "vsphere_virtual_machine.tf-test1" ]
  clone {
    template_uuid = "${data.vsphere_virtual_machine.template-ubuntu-1604.id}"
    customize {
      linux_options {
        host_name = "${format("tf-test2-%01d", count.index)}"
        domain    = "local"
      }
      network_interface {
      }
    }
  }
}

Debug Output

https://gist.github.com/ionosphere80/a8e1d2d2a2585d1a40c47e3ff53a18f5

Expected Behavior

I'm experimenting with workspaces to reduce the impact of Terraform operations in a single VMware vSphere environment. With all resources in the default workspace, I currently use depends_on with a vsphere_virtual_machine resource to control the order in which Terraform creates VMs containing services that depend on each other. For example, an application VM using a database depends on a database VM. Using depends_on with the application VM, I can tell Terraform to create the database VM before it creates the application VM. If the dependent resource resides in a different workspace and does not exist, or if dependencies cannot span workspaces, I think Terraform should at least return an error during the plan operation.

Actual Behavior

Using the same example, if I place the application VM and database VM in separate workspaces, Terraform appears to ignore the dependency and creates the application VM even if the database VM does not exist. I understand that each workspace uses a separate state file, but I think Terraform should use the VMware API to determine whether the dependent resource actually exists.

Steps to Reproduce

  1. Configure the vSphere provider.
  2. Run terraform init.
  3. Create two workspaces.
  4. In a single configuration file, define one VM resource whose count is 1 or greater in the first workspace and another VM resource whose count is 1 or greater in the second workspace, with the latter declaring a depends_on on the former.
  5. Select the second workspace.
  6. Run terraform plan and observe that Terraform plans to create only the VM resource in the second workspace.

Important Factoids

If I change the value of depends_on to a bogus resource (not in any workspace), Terraform returns an error during the plan operation saying the new resource depends on a nonexistent resource.
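
For example, swapping the depends_on value in tf-test2 for a bogus address (a sketch; the address below is deliberately declared nowhere):

resource "vsphere_virtual_machine" "tf-test2" {
  # ... other arguments as above ...
  # This address matches no resource in the configuration, so terraform plan
  # fails with a dependency error. A resource that exists only in another
  # workspace's state produces no such error.
  depends_on = [ "vsphere_virtual_machine.bogus" ]
}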

I'm not sure if this issue pertains to workspace operation in general or, more specifically, the vSphere provider.

jbardin commented 6 years ago

Hi @ionosphere80,

Thanks for providing the use case here. A workspace is an independent, isolated state, while a dependency is declared only via the configuration. The depends_on field is strictly scoped to the configuration and can't make any external calls to try to resolve the name. Once you switch workspaces, you are operating in a new state, which has no knowledge of the other workspace's state.

I'm not sure I understand how this configuration would generate a dependency for a resource in another workspace, but the basic premise is that there should be no dependencies between workspaces at all. If you need some dependency between the two states, then workspaces aren't the right tool for the job.

One option is to combine these into a single configuration and reuse the duplicated pieces by placing them in modules so you can parameterize them as needed. If there is no actual configuration dependency between the resources, you can use a null_resource as an intermediary to place in the depends_on field (though it's preferable not to use depends_on when possible).
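
As a rough sketch of the null_resource idea, using the 0.11 syntax from the configuration above (the resource names here are just placeholders, and both resources must live in the same configuration):

resource "null_resource" "database-ready" {
  # Marker resource: anything that depends on it waits for the database VMs.
  depends_on = [ "vsphere_virtual_machine.tf-test1" ]
}

resource "vsphere_virtual_machine" "app" {
  # ... other arguments ...
  depends_on = [ "null_resource.database-ready" ]
}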

ionosphere80 commented 6 years ago

Hi @jbardin,

Thanks for the concise explanation of workspaces. My actual use case is a bit more complex than my example.

I primarily use Terraform with a public cloud that magically provides underlying network services such as DNS and DHCP. I don't need to consider deployment of these services, just how to interface with them if necessary. I also use Terraform to manage VMware VMs in a private cloud that lacks underlying network services. In this environment, I need to consider deployment of these services and the dependencies among them. For example, DHCP depends on authoritative DNS to manage dynamic DNS records, authoritative DNS depends on a database, and all of the underlying network services depend on recursive DNS. Using depends_on, I can control the order (rather grotesquely) in which Terraform deploys VMs providing these services in the case of a clean slate.

Using the default workspace causes some interesting problems, at least with the vSphere provider. For example, recent changes to the provider code triggered Terraform to rebuild all of the VMs. After considerable munging of configuration files, the rebuilds became less-invasive reboots. Either way, most of the VMs contain services that depend on VMs providing underlying network services, and rebooting all of them simultaneously would likely cause unpredictable problems.

By implementing workspaces, I can effectively limit the type and quantity of VMs that Terraform wants to reboot or rebuild simultaneously. While determining the best way to split the default workspace into multiple workspaces, I came across my use of depends_on in a few places. I didn't expect depends_on to work between workspaces, but the lack of an error message (or any output at all) from Terraform during plan or apply operations left me without a definitive answer. If depends_on references a resource in another workspace, I think Terraform should treat it as a resource that does not exist and return an error message rather than proceeding with the operation.

On a side note, possibly worthy of creating another issue for my actual use case...

Assuming two VMs provide each underlying network service, I want Terraform to reboot or rebuild only one of the two VMs rather than all of them simultaneously. I can't find a way to limit how Terraform reboots or rebuilds VMs other than placing all of the "A-side" and "B-side" VMs in separate workspaces.

jbardin commented 6 years ago

Thanks, that helps clarify the problem a bit.

One of the recommended ways to reduce the "blast radius" of disruptive changes is to split infrastructure into entirely separate configurations and reference dependencies through a remote state data source. This is a little more manual than what you were attempting, but functionally very similar, since workspaces aren't much more than a named state. Breaking up infrastructure can also help the configuration scale in many cases, preventing single configuration trees from becoming too unwieldy.
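
As a rough sketch of that pattern in 0.11 syntax (the backend settings and the output name are only placeholders; the other configuration would need to declare a matching output):

data "terraform_remote_state" "network-services" {
  backend = "s3"
  config {
    bucket = "example-terraform-state"
    key    = "network-services/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "vsphere_virtual_machine" "app" {
  # ... other arguments as in the examples above ...
  # Reading an output from the other configuration's state makes this resource
  # depend on that configuration having been applied first.
  network_interface {
    network_id = "${data.terraform_remote_state.network-services.dev_network_id}"
  }
}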

Another tactic is to assume that these mass changes are a very rare occurrence, and to handle them individually with targeted applies. You can target each vsphere_virtual_machine name in separate apply operations, manually rolling out the changes.
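
For example (a sketch using the resource names from the configuration above; with count, an index such as [0] narrows the target to a single instance):

terraform plan -target=vsphere_virtual_machine.tf-test1[0]
terraform apply -target=vsphere_virtual_machine.tf-test1[0]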

We've also had requests for a "rolling update" type of lifecycle (#16200 and #16378), which may help depending on your actual requirements.