hashicorp / terraform


Execution order inside modules #18143

Closed vedantrathore closed 5 years ago

vedantrathore commented 6 years ago

Terraform Version

Terraform v0.11.7
+ provider.aws v1.20.0
+ provider.consul v1.0.0
+ provider.vault v1.1.0

Terraform Configuration Files

Here is the main.tf in the root module, which calls the other modules to generate the infrastructure. Each module contains code to read and write configuration from a Consul server, and the configuration stored in Consul can then be used by other modules.

locals {
  project  = "Innovaccer"
  owner    = "vedant"
  vpc_name = "vedant_vpc"
}

module "vpc" {
  source     = "./modules/vpc"
  cidr_block = "10.1.0.0/16"
  vpc_name   = "${local.vpc_name}"
  az_code    = "a"
  owner      = "${local.owner}"
  project    = "${local.project}"
}

module "public_subnet" {
  source     = "./modules/subnet_create"
  cidr_block = "10.1.1.0/24"
  public     = "true"
  vpc_name   = "${local.vpc_name}"
  name       = "vedant_public_subnet"
  project    = "${local.project}"
  owner      = "${local.owner}"
  vpc_id     = "${module.vpc.vpc_id}"
}

module "private_subnet" {
  source     = "./modules/subnet_create"
  cidr_block = "10.1.3.0/24"
  vpc_name   = "${local.vpc_name}"
  name       = "vedant_private_subnet"
  project    = "${local.project}"
  owner      = "${local.owner}"
  vpc_id     = "${module.vpc.vpc_id}"
}

module "sec_gp" {
  source   = "./modules/sg_create"
  name     = "vedant_sec_group"
  project  = "${local.project}"
  owner    = "${local.owner}"
  vpc_name = "${local.vpc_name}"
  vpc_id   = "${module.vpc.vpc_id}"

  ingress = [
    {
      to_port    = 22
      from_port  = 22
      protocol   = "tcp"
      cidr_block = "10.1.1.0/24"
    },
    {
      to_port    = 80
      from_port  = 80
      protocol   = "tcp"
      cidr_block = "10.1.1.0/24"
    },
  ]

  egress = [
    {
      to_port    = 22
      from_port  = 22
      protocol   = "tcp"
      cidr_block = "10.1.1.0/24"
    },
  ]
}

Debug Output

The execution plan of that script is: https://gist.github.com/vedantrathore/6842232894cd0a47cc69bbcaf64307c9

Expected Behavior

The VPC should be created first and its configuration written to Consul; then that configuration should be queried from Consul and used by the other modules (subnet, security group, etc.).

Actual Behavior

According to the execution plan, resources like the subnets are created first and query their configuration from Consul before resources like the VPC have written their data to Consul. If there were a way to specify an execution order for modules, this issue could be resolved: I want one module to be executed only after all the reads and writes from the previous module have finished.

I've tried the depends_on parameter inside the modules, but it gives an error, as described in issue #10462. I've also tried using the output of one module as an unused input variable to another module, but still no luck. Any help here is highly appreciated.

Steps to Reproduce


  1. terraform init
  2. terraform apply

References

#10462

bjollans commented 6 years ago

Could you add your code for the "vpc" module?

vedantrathore commented 6 years ago

Sure, it's actually spread across the following files:

variables.tf

variable "cidr_block" {
    type = "string"
    description = "CIDR block to given to the VPC"
}

variable "az_code" {
  type = "string"
  description = "Code for the availability zone e.g 'a','b','c' "
}

variable "vpc_name" {
    type = "string"
    description = "Name of VPC"
}

variable "owner" {
  type = "string"
  description = "Name of the owner"
}

variable "project" {
  type = "string"
  description = "Name of the project"
}

output.tf

output "vpc_id" {
  value = "${aws_vpc.vpc.id}"
}

main.tf

data "aws_region" "current" {}

# Resource creation section
resource "aws_vpc" "vpc" {
  cidr_block = "${var.cidr_block}"

  tags {
    Name    = "${var.vpc_name}"
    Owner   = "${var.owner}"
    Project = "${var.project}"
  }
}

# Writing back information to consul

resource "consul_keys" "write_kvs" {
  key {
    path  = "${format("vpcs/%s/id","${var.vpc_name}")}"
    value = "${aws_vpc.vpc.id}"
  }

  key {
    path  = "${format("vpcs/%s/cidr_block", "${var.vpc_name}")}"
    value = "${var.cidr_block}"
  }

  key {
    path  = "${format("vpcs/%s/az", "${var.vpc_name}" )}"
    value = "${format("%s%s",data.aws_region.current.name,var.az_code)}"
  }
}

I want consul_keys.write_kvs to be executed before any other module in the root main.tf.

jbardin commented 6 years ago

Hi @vedantrathore,

Sorry this is tripping you up. The only order attributed to a Terraform configuration is the dependency graph of the individual resources. The module abstraction only really applies to the organization of the configuration; it's the configuration in its entirety that is evaluated to determine the execution order.

In your example here, consul_keys.write_kvs depends on aws_vpc.vpc and will be created after it. Since nothing else depends on the consul_keys resource, its evaluation could happen at any time after that.

You need something that depends on the consul_keys resource, and this is where a null_resource can be used to force a dependency chain. Until we can apply a depends_on to a module as a whole, individual resources that require consul_keys to already exist will have to somehow reference the resource, either directly or by depending on a null_resource that does.
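
A minimal sketch of that null_resource pattern in 0.11 syntax, with hypothetical names (consul_keys_id is an input you would add to the consuming module and feed from a new output of the vpc module):

# Inside a consuming module such as subnet_create (hypothetical).
variable "consul_keys_id" {
  type        = "string"
  description = "Forces a dependency on the vpc module's consul_keys resource"
}

resource "null_resource" "consul_written" {
  triggers = {
    # This value is only known once consul_keys.write_kvs exists.
    consul_keys_id = "${var.consul_keys_id}"
  }
}

resource "aws_subnet" "subnet" {
  vpc_id     = "${var.vpc_id}"
  cidr_block = "${var.cidr_block}"

  # Waits for the null_resource, and therefore for the Consul writes.
  depends_on = ["null_resource.consul_written"]
}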

Another alternative is to break this up into multiple configurations, and reference the vpc from a remote state.

vedantrathore commented 6 years ago

Hello @jbardin

Thanks for your reply. I read the documentation for null_resource, but I'm not sure how it fits into my use case. Can you explain a little more about how I would link consul_keys into the dependency chain using a null_resource? I am thinking of adding aws_vpc.vpc to the triggers option, but I'm not sure how that would force consul_keys to be executed first.

Thanks a lot!

jbardin commented 6 years ago

The consul_keys.write_kvs is evaluated after the aws_vpc, so if you want to ensure that consul_keys.write_kvs is evaluated before other resources, the other resources need to depend on consul_keys.write_kvs.

You would need to output a value from the consul_keys.write_kvs resource; in this case the id is probably the only reasonable value. That output value can then be used as an input in each of the other modules. Resources in those modules can either reference that input directly, or depend on a null_resource that references it.
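
A hedged sketch of that wiring, again with hypothetical names, exposing the id from the vpc module and feeding it to a module that must wait:

# modules/vpc/output.tf
output "consul_keys_id" {
  value = "${consul_keys.write_kvs.id}"
}

# root main.tf
module "public_subnet" {
  source         = "./modules/subnet_create"
  # ... existing arguments as above ...
  consul_keys_id = "${module.vpc.consul_keys_id}"
}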

vedantrathore commented 6 years ago

Yes, I understand now. One more thing: this consul_keys.write_kvs resource is in all the modules, because I want the details from every module to be written to the Consul server. Should I add that id as an output of every module that another module needs to wait on?

jbardin commented 6 years ago

Yes, the only way to enforce an order for resource evaluation in Terraform is to create some sort of dependency between resources.

You should ask yourself though, if nothing naturally references the consul values in the config, is there truly a dependency on those values? Reading values from consul would normally be an application layer concern, so as long as they are written eventually during the apply operation there's usually no reason to force the order of operations.

vedantrathore commented 6 years ago

Thanks a lot for your help @jbardin I'll try that solution and will comment here if I run into an error. :smile:

vedantrathore commented 6 years ago

Hello @jbardin, I've tried what you described earlier. I added this to the outputs.tf in my vpc module:

output "vpc_id" {
  value = "${aws_vpc.vpc.id}"
}

output "datacenter" {
  value = "${consul_keys.write_kvs.datacenter}"
}

and added the corresponding variables to the subnet module too. I have also created a null_resource in the subnet module's main.tf to link those outputs.

subnet module main.tf:

locals {
  is_public_subnet = "${var.public == "false" ? "false" : "true"}"
  subnet_type      = "${var.public == "false" ? "private" : "public"}"
}

resource "aws_subnet" "subnet" {
  vpc_id            = "${data.consul_keys.app.var.vpc_id}"
  cidr_block        = "${var.cidr_block}"
  availability_zone = "${data.consul_keys.app.var.az}"

  map_public_ip_on_launch = "${local.is_public_subnet}"

  tags {
    Name    = "${var.name}"
    Owner   = "${var.owner}"
    Project = "${var.project}"
  }
}

resource "consul_keys" "write_kvs" {
  key {
    path  = "${format("vpcs/%s/subnets/%s/%s/id", "${var.vpc_name}", "${local.subnet_type}", "${var.name}")}"
    value = "${aws_subnet.subnet.id}"
  }

  key {
    path  = "${format("vpcs/%s/subnets/%s/%s/cidr_block", var.vpc_name, local.subnet_type, var.name)}"
    value = "${var.cidr_block}"
  }
}

resource "null_resource" "subnet_null" {
  triggers = {
    datacenters = "${var.data_center}"
    id          = "${var.vpc_id}"
  }
}

I have also changed the main.tf in the root module to force execution order accordingly.

root main.tf

locals {
  project  = "Innovaccer"
  owner    = "vedant"
  vpc_name = "vedant_vpc"
}

module "vpc" {
  source     = "./modules/vpc"
  cidr_block = "10.1.0.0/16"
  vpc_name   = "${local.vpc_name}"
  az_code    = "a"
  owner      = "${local.owner}"
  project    = "${local.project}"
}

module "public_subnet" {
  source      = "./modules/subnet_create"
  cidr_block  = "10.1.1.0/24"
  public      = "true"
  vpc_name    = "${local.vpc_name}"
  name        = "vedant_public_subnet"
  project     = "${local.project}"
  owner       = "${local.owner}"
  vpc_id      = "${module.vpc.vpc_id}"
  data_center = "${module.vpc.datacenter}"
}

module "private_subnet" {
  source      = "./modules/subnet_create"
  cidr_block  = "10.1.3.0/24"
  vpc_name    = "${local.vpc_name}"
  name        = "vedant_private_subnet"
  project     = "${local.project}"
  owner       = "${local.owner}"
  vpc_id      = "${module.vpc.vpc_id}"
  data_center = "${module.vpc.datacenter}"
}

Still, the execution plan creates the subnets first and writes the VPC config to Consul at the end. The plan: https://gist.github.com/vedantrathore/4e1b648e6e5d80272ff6819a2ea75efe

Any help in this would be highly appreciated. Thanks!

apparentlymart commented 6 years ago

Hi @vedantrathore!

It sounds like you're trying to use Consul to pass data between your modules here. That is, you use module.vpc.consul_keys.write_kvs to write some data into Consul and then something inside your public_subnet module uses data "consul_keys" ... to read it out again. Is that right?

If so, this is subverting Terraform's normal way to understand the relationships between resources, and so the dependency graph is incomplete. The usual way to implement this would be to export the necessary values from the vpc module as outputs and then pass them in via the input variables of public_subnet. That way the data flows through Terraform itself, rather than going indirectly through Consul, and Terraform can then understand properly what order the steps must be executed in.
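
As a sketch of that direct flow (availability_zone is a hypothetical input name), the vpc module could export the AZ it already computes instead of round-tripping it through Consul:

# modules/vpc/output.tf
output "az" {
  value = "${format("%s%s", data.aws_region.current.name, var.az_code)}"
}

# root main.tf
module "public_subnet" {
  source            = "./modules/subnet_create"
  # ... existing arguments as above ...
  availability_zone = "${module.vpc.az}"
}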


With that said, I think the reason what you tried there didn't work is a missing dependency edge. The key thing to know about dependencies between modules is that each output and variable is a separate node in the dependency graph, so your configuration shared above has the following dependencies (the relevant subset):


 module.vpc.consul_keys.write_kvs
                 |
                 V
   module.vpc.output.datacenter
                 |
                 V
module.public_subnet.var.datacenter

To complete this, you then need an expression in module.public_subnet.data.consul_keys.read_kvs (an assumed name, since I don't have your public_subnet module source code) that refers to module.public_subnet.var.datacenter (which, from inside that module, is just var.datacenter).
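
For example, a hedged sketch of such a reference inside the subnet module (read_kvs and the key layout are assumptions based on the write side shown earlier):

data "consul_keys" "read_kvs" {
  # Referencing var.datacenter adds the missing dependency edge, so this
  # read is only evaluated after module.vpc's consul_keys.write_kvs.
  datacenter = "${var.datacenter}"

  key {
    name = "vpc_id"
    path = "${format("vpcs/%s/id", var.vpc_name)}"
  }
}

# elsewhere in the module: "${data.consul_keys.read_kvs.var.vpc_id}"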


The approach you've taken here of writing results into Consul is not conventional for passing data between modules in a single configuration, but it is a common pattern for passing data between separate configurations. A different way to write this would be for each of your modules to be a separate top-level configuration and then run terraform apply for each of them in turn.

I don't think that makes a lot of sense in this case since these modules are simple and tightly related to one another, but I share it just for completeness. Usually this approach would be used, for example, to get information about the VPC for another configuration that provisions something into the VPC. That way the second configuration can be applied separately from the first one, which is convenient when (as is commonly the case) the infrastructure in the second configuration changes more often than its containing network.
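
For completeness, a minimal 0.11-style sketch of that cross-configuration pattern, assuming a consul backend and a vpc_id output in the VPC configuration (the state path is hypothetical):

# In the consuming configuration
data "terraform_remote_state" "vpc" {
  backend = "consul"

  config {
    path = "terraform/vpc" # hypothetical key in Consul's KV store
  }
}

resource "aws_subnet" "subnet" {
  vpc_id     = "${data.terraform_remote_state.vpc.vpc_id}"
  cidr_block = "10.1.1.0/24"
}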

vedantrathore commented 6 years ago

Hello @apparentlymart!

I apologise for the delayed response. I understand what you're saying, but I'd like to point out that I am not using Consul to pass data. Instead I am using Consul as a config management tool, saving all the information in a hierarchical structure, because we want the Terraform scripts to automatically fetch any config details from the Consul server.

For example, if we specify the instance launch path as test_vpc.public.test_public_subnet, the Terraform script will parse this string and fetch the corresponding subnet_id from the Consul server.

This is the reason I want Terraform modules to write into Consul before another module runs. I have found a workaround, though: appending the modules one at a time and running terraform init && terraform apply after every append. This forces the modules to execute in sequential order. I understand this is highly inefficient, and it would be great if you could suggest a way around it.

ghost commented 5 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.