gruntwork-io / terragrunt

Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.
https://terragrunt.gruntwork.io/
MIT License
7.94k stars 965 forks source link

Don't convert inputs to env vars. Instead, generate terragrunt.tfvars.json and use as -var-file #752

Open ekini opened 5 years ago

ekini commented 5 years ago

I've been reading documentation for terraform 0.12 in regards to handling env vars, and https://www.terraform.io/docs/configuration/variables.html states that:

Some special rules apply to the -var command line option and to environment variables. For convenience, Terraform defaults to interpreting -var and environment variable values as literal strings, which do not need to be quoted

and

However, if a root module variable uses a type constraint to require a complex value (list, set, map, object, or tuple), Terraform will instead attempt to parse its value using the same syntax used within variable definitions files, which requires careful attention to the string escaping rules in your shell

And it's so error-prone and counter-intuitive! Just take a look at https://github.com/gruntwork-io/terragrunt/issues/740 and https://github.com/gruntwork-io/terragrunt/issues/738

Instead, terraform directly supports loading variables from json files: https://www.terraform.io/docs/configuration/variables.html#variable-definitions-tfvars-files

Terraform 0.11 supports it as well, although it's not properly documented:

Variables files use HCL or JSON syntax to define variable values

The main point here is that types are not lost and terraform doesn't have to "guess" them. Also, the whole logic of converting inputs to env vars can be simplified to one json.Marshal.

Benefits:

  1. Simplified logic
  2. Supported by terraform 0.12 and properly documented
  3. Types are not lost in translation, no need to set constraints in variable {} blocks
  4. Works with terraform 0.11 nicely, for compatibility sake!

I can make a PR should it be needed.

barryib commented 5 years ago

Oh yes ! This is definitely the way to go for input variables.

Furthermore, as discussed https://github.com/gruntwork-io/terragrunt/issues/737#issuecomment-500895952 with @brikis98 we should also stick to Terraform, by defining variables explicitly.

# In root terragrunt.hcl
terraform {
  extra_arguments "terragrunt_generated_vars" {
    commands = "${get_terraform_commands_that_need_vars()}"

    required_inputs_files = [
      "${get_terragrunt_dir()}/${find_in_parent_folders("common.hcl")}",
    ]
  }
}
# stage/frontend-app/terragrunt.hcl
terraform {
  source = "..."
}

# Include all settings from the root terragrunt.hcl file
include {
  path = "${find_in_parent_folders()}"
}

inputs = {
  aws_region = get_input("aws_region")
  remote_state_bucket = get_input("remote_state_bucket")

  instance_type = "t2.micro"
  instance_count = 10
}

This will generate terragrunt-12334.tfvars.json and use terraform plan -var-file terragrunt-12334.tfvars.json or terragrunt.auto.tfvars.json loaded automatically by Terraform.

Inputs should follow Terraform variable merging behavior also.

Important: In Terraform 0.12 and later, variables with map and object values behave the same way as other variables: the last value found overrides the previous values. This is a change from previous versions of Terraform, which would merge map values instead of overriding them.

    required_inputs_files = [
      "${get_terragrunt_dir()}/${find_in_parent_folders("common.hcl")}",
      "${get_terragrunt_dir()}/${find_in_parent_folders("region.hcl")}",
    ]

Any inputs found in region.hcl will override those in common.hcl. Since we can now user Terraform interpolations in Terragrunt, people who wants to merge maps, will use merge interpolation.

brikis98 commented 5 years ago

I agree that Terraform's parsing of env vars is not ideal... But generating files on disk is always a bit messy:

  1. What file name to use? What if the file exists already? Do we generate a randomly named temp file? What about on subsequent re-runs of terragrunt commands?
  2. Do we delete the file after?
  3. What if the values in the inputs block are secrets? Writing them to disk may be a bad idea. E.g., consider:
    inputs = {
      # Intentionally read value from env vars so it's not written to disk
      db_password = get_env("DB_PASSWORD", "") 
    }
  4. What if the user has -var-file arguments already in extra_arguments? Does the generated file go before or after those?
  5. What if the user has -var-file arguments already when running terragrunt (e.g., terragrunt apply -var-file=foo? Does the generated file go before or after those?
barryib commented 5 years ago

@brikis98 I understand your point, but with all the benefit we can get, I think it worth that Terragrunt generates tfvars.json files.

What file name to use? What if the file exists already ? Do we generate a randomly named temp file? What about on subsequent re-runs of terragrunt commands?

Maybe in .terragrunt-cache we can generate single file (terragrunt.auto.tfvars.json) with all needed inputs, and set Terragrunt working directory to the .terragrunt-cache. Terraform will load automatically this tfvars like before.

Do we delete the file after? What if the values in the inputs block are secrets?

Yes. For consistency and to mitigate secrets exposure. Even clean before every run.

What if the user has -var-file arguments already in extra_arguments? Does the generated file go before or after those? What if the user has -var-file arguments already when running terragrunt (e.g., terragrunt apply -var-file=foo? Does the generated file go before or after those?

Same as today. It'll follow Terraform variable precedence

PS: Even Terraform Entreprise is using this for workspace variables.

ekini commented 5 years ago

I agree that Terraform's parsing of env vars is not ideal... But generating files on disk is always a bit messy:

  1. What file name to use? What if the file exists already? Do we generate a randomly named temp file? What about on subsequent re-runs of terragrunt commands?

Let's make it terragrunt.tfvars.json? It's unlikely that it can be taken. I would explicitly use -var-file instead of .auto.tfvars.json, firstly because the old terraform doesn't load it automatically, and secondly it's better to keet it explicit!

  1. Do we delete the file after?

Not sure about this one. Why do we need to delete it? It will be regenerated on every terragrunt run.

  1. What if the values in the inputs block are secrets? Writing them to disk may be a bad idea. E.g., consider:
    inputs = {
     # Intentionally read value from env vars so it's not written to disk
     db_password = get_env("DB_PASSWORD", "") 
    }

Well, if somebody had secrets in terraform.tfvars before, then it doesn't change the state at all. Otherwise, TF_VAR_xxx env var will just work as before.

  1. What if the user has -var-file arguments already in extra_arguments? Does the generated file go before or after those?

To keep the previous behaviour it needs to go as the last argument.

  1. What if the user has -var-file arguments already when running terragrunt (e.g., terragrunt apply -var-file=foo? Does the generated file go before or after those?

I think same as above? Variables from terragrunt.tfvars used to overrided everything.

Hmm, I answered almost the same as @barryib :)

ekini commented 5 years ago

Just another point: the JSON format is meant to be machine-generated, and it's a perfect use case for terragrunt.

This syntax is useful when generating portions of a configuration programmatically, since existing JSON libraries can be used to prepare the generated configuration files.

brikis98 commented 5 years ago

Well, if somebody had secrets in terraform.tfvars before, then it doesn't change the state at all. Otherwise, TF_VAR_xxx env var will just work as before.

This isn't a secret in the file. It's a secret read using a Terragrunt helper that would only exist in memory... Unless we start writing var files to disk.

One other thought:

As of Terraform 0.12, you can no longer set a variable in .tfvars file if that variable isn't defined in your Terraform (.tf) code. Is that true of .tfvars.json? If so, doesn't that significantly limit reuse of common variable files across different modules?

ekini commented 5 years ago

Not sure about secrets.

The same rules apply to both .tfvars and tfvars.json, so yes, it will limit reuse of common variables in the current state of terraform.

brikis98 commented 5 years ago

The same rules apply to both .tfvars and tfvars.json, so yes, it will limit reuse of common variables in the current state of terraform.

I suspect that will be a serious problem for many users... So we'd be trading one mild problem (variables without a type being parsed incorrectly) to a bigger problem (reuse of values becoming very difficult).

AFriemann commented 5 years ago

How about parsing the resulting terraform files and only setting variables that are defined there?

barryib commented 5 years ago

This isn't a secret in the file. It's a secret read using a Terragrunt helper that would only exist in memory... Unless we start writing var files to disk.

What if we can define env vars and var files ? Something like:

env_inputs = {
  password = ${get_env("MY_SECRET_PASSWORD")}
}

inputs = {
  instance  = "t2.nano"
  region    = "eu-west-1"
}

env_inputs will be treated like Terraform env vars TF_VARS_password and inputs will be converted into json.

I suspect that will be a serious problem for many users... So we'd be trading one mild problem (variables without a type being parsed incorrectly) to a bigger problem (reuse of values becoming very difficult).

I think that reuse of values will be difficult in Terraform 0.12, because they take the decision to move to explicite variable definition. This is neither good nor bad move, it's juste a direction they take.

So should we follow them or not ? I'll say yes, because Terragrunt is a wrapper. It should add value/feature but it's not meant to change Terraform behavior. This could be confusing for users.

Saying that, how can Terragrunt become explicit without loosing DRY concept ? I think get_input() or a map with all merged inputs is a good direction. This map or locals can be used to explicitly define inputs in terragrunt.hcl :

inputs = {
  aws_region = local.region
  # or
  aws_region = get_input("region")
}
barryib commented 5 years ago

@ekini @brikis98 How can we move forward on this discussion ?

ekini commented 5 years ago

The upstream issue seems to be ignored by Terraform team, which is a blocker.

brikis98 commented 5 years ago

@barryib I have not had time to dig into a solution yet. Hoping to have more thoughts to share in the next couple weeks.

ozbillwang commented 4 years ago

Since locals (or local) has been supported in Terragrunt, how do we reference the data source?

data "aws_ami" "example" {
  executable_users = ["self"]
  most_recent      = true
  name_regex       = "^myami-\\d{3}"
  owners           = ["self"]

  filter {
    name   = "name"
    values = ["myami-*"]
  }
}

how do we reference data.ami_ami.example.id in terragrunt.hcl? Any hints for me?

Reference resource's attriabutes (output) will be harder, maybe add feature to reference data sources will be easier to be implemented?

ghost commented 4 years ago

@ozbillwang I run into the same issue few days ago. Problem is that Terragrunt only allows terragrunt.hcl file in the directory where it is run. I tried leaving a .tf file with the data sources but it didn't work. Think this will never get implemented as, lookin at what we wanted to do, Terragrunt would have to copy those data source definitions on the side and run Terraform to get the outputs, since we wanted to instruct it to assign those outputs to it's inputs (at least that was what I was trying to achieve).

In the end we went with a new module, with only data sources defined inside of it :( ugly as hell, since you need to still pass values to the module (e.g. for google_kms_secret data source - a key and a ciphertext), but I don't think there's a better way of doing this currently.

yorinasub17 commented 4 years ago

I don't think we will ever implement a way to reference data or resources in terragrunt directly, as that requires implementing terraform directly within terragrunt, making it more than a wrapper.

With that said, I am having a hard time understanding the use case, so could help if I had a concrete example with ideal terragrunt.hcl and terraform code snippet. What I am having a hard time understanding is the benefit of doing the processing of inputs via data sources within terragrunt, as opposed to doing it in the module in terraform.

lorengordon commented 4 years ago

You can almost do this today, with the new generate block. The problem is that terragrunt currently injects a comment into the file with no way to disable that feature, and json does not allow comments. My workaround was to write a templated *.auto.tvars file instead. See https://github.com/gruntwork-io/terragrunt/issues/1121 for the approach.

lorengordon commented 4 years ago

Of course you'll still run into the undeclared var problem, but you can use a .tfvars file instead of ENVs... ;)

ghost commented 4 years ago

@yorinasub17 Sorry for neglecting to respond :(

I was (and still am) opposed to putting data source definitions in modules. I think of them rather as something that provides input values to a module. Our usage was to provide a list of IPs to a firewall configuration in out Terraform module. The list of IPs was being fetched using the http data source. Putting the data source in the module felt more like hardcoding, which is what we went for in the end.

Looking at the new generate block, I suppose this can be handled more or less the way I wanted it initially, but I haven't tested it yet.

yorinasub17 commented 4 years ago

@kcatro Since you are using the http data source, could you use run_cmd to accomplish the same thing?

ghost commented 4 years ago

@yorinasub17 Seems like a valid idea! :)

yorinasub17 commented 2 years ago

Relabeled as a feature enhancement that needs design. The main concerns around this are documented in the discussion on this PR: https://github.com/gruntwork-io/terragrunt/pull/1267

marsskop commented 1 year ago

I'd like to link an issue that could be solved by generating .tfvars with inputs: https://github.com/gruntwork-io/terragrunt/issues/2132

helobinvn commented 1 year ago

I guess using Terragrunt config to store variables instead of .tfvars, then load it into the Terragrunt config in local block by read_terragrunt_config(get_env("PATH_TO_TG_VARS_FILE")) is the workaround solution for this, just need to set some ENV var when running the terragrunt commands