Disable -var and -var-file flags when running terragrunt apply with a plan file

brikis98 commented 6 years ago

If you run terragrunt apply <PLAN_FILE> and you use extra_arguments anywhere to pass -var-file or -var flags, Terraform will give you the error:

You can't set variables with the '-var' or '-var-file' flag
when you're applying a plan file. The variables used when
the plan was created will be used. If you wish to use different
variable values, create a new plan file.

We either need to build in a special exception for the apply command with plan files, or have some way to optionally turn off extra_arguments (--terragrunt-no-extra-arguments).

ldormoy commented 6 years ago

This prevents me from using Terragrunt with Atlantis.

I use extra_arguments to retrieve AWS account ID and region (multi-account AWS setup):

  terraform {
    extra_arguments "bucket" {
      commands = ["${get_terraform_commands_that_need_vars()}"]

      optional_var_files = [
        "${get_tfvars_dir()}/${find_in_parent_folders("account.tfvars", "ignore")}",
        "${get_tfvars_dir()}/${find_in_parent_folders("region.tfvars", "ignore")}",
      ]
    }
  }

Is there any workaround until terragrunt handles this use case? Also, how can I help? :-)

brikis98 commented 6 years ago

PRs are always welcome 😃

FWIW, I'm not sure of the best way to fix this. --terragrunt-no-extra-arguments would be the easiest, but it's not clear it's the right solution (e.g., what if you want to disable -var-file arguments, but not other types?). A special case for apply seems ugly and has the same problems. Perhaps the commands list can recognize apply and apply-with-plan as separate commands? And by default, get_terraform_commands_that_need_vars() only returns apply?

ldormoy commented 6 years ago

Perhaps the commands list can recognize apply and apply-with-plan as separate commands? And by default, get_terraform_commands_that_need_vars() only returns apply?

Would be an option I suppose, although I understand it's not ideal because it means adding a special argument that deviates from the terraform CLI syntax.

I see that there are 2 possibilities for terragrunt apply positional arguments: PLAN or DIR. What about parsing the argument and dropping -var and -var-file if it's not a directory?

terraform does it this way: https://github.com/hashicorp/terraform/blob/d4ac68423c4998279f33404db46809d27a5c2362/command/meta_new.go#L152

Another solution (for my specific use case): When running terragrunt apply PLAN, get_terraform_commands_that_need_vars() should exclude apply. Again, it could be done by checking if the positional argument is a directory.

UPDATE: I realize my proposals are probably much more complex than having an apply-with-plan command. I'd personally be more than happy with this solution. If you agree with the solution, would you like a PR submitted for this? I'm not a Go expert myself but I can give it a try and also have some colleagues that can help.

brikis98 commented 6 years ago

When running terragrunt apply PLAN, get_terraform_commands_that_need_vars() should exclude apply. Again, it could be done by checking if the positional argument is a directory.

While this one actually feels like the most "correct" solution—after all, apply PLAN does NOT need vars, so this is a bug in the method—it doesn't cover the case where someone manually sets commands = ["apply"].

What about parsing the argument and dropping -var and -var-file if it's not a directory?

Doing this check for all apply <ARG> calls seems like it would handle all use cases. I'm not a fan of special cases, but in this case, it's a special case in Terraform itself, so Terragrunt will need to do the same. And I think doing this automatically for users will be less error prone than having separate apply and apply-with-plan commands and having to educate users about them.
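A minimal sketch of that check, written in Python purely for illustration (Terragrunt itself is written in Go, and this is not its actual implementation). The function names `strip_var_flags` and `extra_args_for_apply` are hypothetical, and the sketch assumes the caller has already identified the positional argument passed to apply:

```python
import os


def strip_var_flags(args):
    """Return args without -var / -var-file flags.

    Handles both the `-var=x` form and the `-var x` form, where the
    value is a separate token.
    """
    kept = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This token is the value of a preceding bare -var / -var-file.
            skip_next = False
            continue
        if arg in ("-var", "-var-file"):
            skip_next = True
            continue
        if arg.startswith(("-var=", "-var-file=")):
            continue
        kept.append(arg)
    return kept


def extra_args_for_apply(extra_args, positional):
    """Drop variable flags only when the positional argument is a file.

    Assumption: an existing regular file is a saved plan, while a
    directory (or no positional argument) means a normal apply.
    """
    if positional is not None and os.path.isfile(positional):
        return strip_var_flags(extra_args)
    return list(extra_args)
```

With this, a directory or missing positional argument leaves the extra arguments untouched, so a manually configured commands = ["apply"] would be covered as well.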

would you like a PR submitted for this?

Please and thank you! 🍺

pawel-t commented 6 years ago

@ldormoy Any update from your side? I'm willing to try this out :)

ldormoy commented 6 years ago

Hi Pawel, unfortunately no. From my side you are very welcome to give it a try! Thanks for this :-)

marcoreni commented 6 years ago

I stumbled upon this issue while working on a CI script with Terragrunt.

Since we were in a hurry I quickly pulled together a PR to cover this scenario as discussed in previous posts (at least I think :D )

Hope this helps.

ldormoy commented 6 years ago

Thanks @marcoreni !

brikis98 commented 6 years ago

This is hopefully now fixed thanks to @marcoreni! Give https://github.com/gruntwork-io/terragrunt/releases/tag/v0.16.13 a shot.

marcoreni commented 6 years ago

@brikis98 After some checks with our systems, I realized that I made a wrong assumption in the check. In fact, the plan file is the last parameter added to the command, not the second.

This fix will therefore work if no parameters are added (eg. terragrunt apply plan.out), but does not work if you have parameters provided in the CLI command (eg. terragrunt apply -no-color plan.out).

I'll work on this and create a new PR shortly. Sorry about that.
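To illustrate the difference, here is a rough Python sketch (not the Go code from the PR). Both function names are hypothetical, and `argv` is assumed to be the command followed by its arguments, e.g. ["apply", "-no-color", "plan.out"]:

```python
import os


def is_plan_apply_second_arg(argv):
    """The wrong assumption: the plan file directly follows the command."""
    return len(argv) >= 2 and argv[0] == "apply" and os.path.isfile(argv[1])


def is_plan_apply_last_arg(argv):
    """Adjusted check: look at the final argument instead."""
    return len(argv) >= 2 and argv[0] == "apply" and os.path.isfile(argv[-1])
```

For terragrunt apply plan.out both checks agree, since the second argument is also the last one. For terragrunt apply -no-color plan.out, only the last-argument check sees the plan file.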

brikis98 commented 6 years ago

@marcoreni Ah, good point, I missed that too!

lorengordon commented 6 years ago

@marcoreni @brikis98 In terragrunt v0.17.0, I'm seeing extra_arguments passing -var-file to terraform apply even when a plan file is provided. Is there a toggle or something to enable the behavior that was fixed in the patches for this issue?

brikis98 commented 6 years ago

Shouldn't require anything special, so it might be a bug. Can you paste the full command and log output so we can see what's happening?

lorengordon commented 6 years ago

Eh, my config is a little complicated. I'll try to distill it down and see if I can reproduce it in a minimal config. About to head out of town for a few days and gotta wrap up some other things first, so probably won't get to it until next week.

unixninja92 commented 5 years ago

We are also seeing this issue with extra_arguments.

The command:

terragrunt apply plan.out

Relevant log:

Running command: terraform apply -var-file=/var/lib/jenkins/jobs/JENKINS-JOB-NAME/workspace/aws/PATH/TO/TFVARS/FILE/../../../../remote-state.tfvars plan.out
You can't set variables with the '-var' or '-var-file' flag when you're applying a plan file. The variables used when the plan was created will be used. If you wish to use different variable values, create a new plan file.

Our extra arguments config:

extra_arguments "remote-data" {
  commands = ["${get_terraform_commands_that_need_vars()}"]

  optional_var_files = [
    "${get_tfvars_dir()}/${find_in_parent_folders("remote-state.tfvars", "ignore")}"
  ]
}

Maybe get_terraform_commands_that_need_vars() shouldn't include apply when the apply command is given a plan file such as plan.out?

unixninja92 commented 5 years ago

The problem is that terragrunt plan -out=plan.out saves the plan.out file to the download dir, .terragrunt-cache. Then, when you run terragrunt apply plan.out, it checks whether the file exists in the current directory. You have to specify the full path to plan.out when running terragrunt apply to get it working.

ivasilyev-servicetitan-com commented 4 years ago

Hello. Having the same issue. But I'm running terragrunt plan-all -out=plan --terragrunt-non-interactive and then terragrunt apply-all plan --terragrunt-non-interactive, so 'plan' is the file name in every folder, and I can't specify the full path for plan. Is there a solution for terragrunt apply-all scenario?

To work around this, I specified

extra_arguments "common_vars" {
    # do not use `get_terraform_commands_that_need_vars()` here
    # as there is an issue with `apply-all` command
    # https://github.com/gruntwork-io/terragrunt/issues/493
    # https://github.com/gruntwork-io/terragrunt/issues/457
    commands = [
      "plan"
    ]

in the root terragrunt.hcl file.

yorinasub17 commented 4 years ago

But I'm running terragrunt plan-all -out=plan --terragrunt-non-interactive and then terragrunt apply-all plan --terragrunt-non-interactive, so 'plan' is the file name in every folder, and I can't specify the full path for plan. Is there a solution for terragrunt apply-all scenario?

This is actually an antipattern and is not recommended. There are various scenarios where this doesn't work and you can easily shoot yourself in the foot. To understand this, consider the following scenario:

Suppose you had two modules: vpc and app. The vpc module provisions a VPC, while the app module provisions an EC2 instance and takes as input the VPC to deploy into. The app module relies on the outputs of the vpc module.

It is obvious why the plan-all => apply-all setup doesn't work in the initial state, when you have nothing deployed. You can't get a valid plan for the app module because the VPC doesn't exist.

What about after everything has been deployed and you are making changes? Suppose that you are now adding a new subnet to the VPC and want to observe how that will affect things. If you do a plan-all after the code has been modified, you will see the plan for the VPC being modified. However, you will not see any changes in the app module even if you depend on an output that will change. This is because the output doesn't change until the VPC changes are applied. So the plan is being made based on the previous state of the vpc module.

With all that said, one use case for this is if you have mechanisms for a "steady state" type of workflow. In this scenario, you want to continuously loop between plan-all and apply-all until no changes are being made. Locking plan files ensures that downstream changes won't be applied in the first go-around, since no changes will be recorded when the plan is made against the previous state. You do have to keep looping between plan-all and apply-all, though, because the plan will change once the upstream dependencies have been applied.
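A toy Python model of that loop may make the behavior concrete. This is only a simulation under simplifying assumptions, not how Terragrunt or Terraform work internally: each module's target is computed from the state that is currently applied, and the subnet counts are made-up illustration data.

```python
def plan_all(desired, applied):
    """Plan every module against the state that is applied right now.

    desired maps a module name to a function that computes the module's
    target configuration from the currently applied state of all modules.
    Returns only the modules whose target differs from what is applied.
    """
    plans = {}
    for name, target_from in desired.items():
        target = target_from(applied)
        if applied.get(name) != target:
            plans[name] = target
    return plans


def apply_all(plans, applied):
    """Apply the saved plans exactly as they were recorded."""
    applied.update(plans)


def converge(desired, applied, max_rounds=10):
    """Alternate plan-all and apply-all until a plan comes back empty.

    Returns the number of apply rounds that were needed.
    """
    for rounds in range(max_rounds):
        plans = plan_all(desired, applied)
        if not plans:
            return rounds
        apply_all(plans, applied)
    raise RuntimeError("did not reach a steady state")


# Toy data: the vpc gains a third subnet; app consumes the vpc's applied output.
desired = {
    "vpc": lambda applied: {"subnets": 3},
    "app": lambda applied: {"subnet_count": applied["vpc"]["subnets"]},
}
applied = {"vpc": {"subnets": 2}, "app": {"subnet_count": 2}}

first_plan = plan_all(desired, applied)  # planned against the old vpc output
rounds = converge(desired, applied)
```

Here first_plan contains only the vpc change, because app is still planned against the old output. The app change only shows up in the second round, so converge returns 2.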

ivasilyev-servicetitan-com commented 4 years ago

Thank you @yorinasub17 for stepping in and for the detailed answer.

By "This is actually an antipattern", do you mean "using plan-all and apply-all"? But then what is the reason to have at least the apply-all command in TG?

Yes, we're trying to build a workflow where we can put TF changes into a Pull Request, and then the CICD pipeline will plan these changes and apply them after a human approves. And yes, we have a lot of TF folders; that's why we need Terragrunt with its *-all functionality.

I understand your point about the issue with plan-all on new infra (I read about mocks, so I didn't think it was an antipattern if TG has a solution for this). But we don't actually run TF code against new infra frequently; more often we modify existing functionality. And for the case where a dependency is not applied yet, I think we can deal with it by splitting the change into 2 PRs and deploying them in sequence.

But generally, do you have any recommendations on how to build the workflow differently when you want to apply TF changes located in different folders, not by having a human work out what should be applied and in which order, but in a more automated and guaranteed way? My assumption was that TF+TG+CICD gives you this result. And with CICD you still need the *-all commands; what's the other way?

Thank you.

yorinasub17 commented 4 years ago

By "This is actually an antipattern", do you mean "using plan-all and apply-all"?

xxx-all was a pattern that we started implementing for the use case of managing the full stack, optimized for the use case of standing up the full stack from scratch or tearing the whole stack down for testing purposes. We use apply-all and destroy-all on a day to day basis for this reason, but plan-all is not something that we use or have invested the time in to make it work.

Now taking a step back, the whole pattern around terragrunt came about because of the antipatterns of a giant terraform module. There are many reasons why a single state file for your entire infrastructure is generally discouraged (blast radius, scalability, being able to understand it better, etc.). OTOH, managing many modules can get difficult because you have duplicate configurations for common setups (e.g., state config). Terragrunt helps you avoid repeating yourself while allowing you to break up your terraform modules into multiple state files.

Using plan-all and apply-all on a day to day basis suggests that possibly those modules should be a single unit, at which point it probably makes more sense to use terraform directly. There are edge cases where you can't do that (e.g., you want different state buckets in a multi-tenant scenario), but in general it feels like a design smell if you are using xxx-all in day to day operations.

Yes, we're trying to build a workflow where we can put TF changes into a Pull Request, and then the CICD pipeline will plan these changes and apply them after a human approves. And yes, we have a lot of TF folders; that's why we need Terragrunt with its *-all functionality.

Unfortunately, I don't think the community has really "solved" this yet in that there is no single generic solution that will work for all cases. I think there are specific scenarios where a plan-all => apply-all sequence works, but you need to understand the implications and drawbacks to know if it will.

Your best bet is to consider the above and design your workflows to mitigate these issues. E.g., the looping logic is one approach; investing in tooling to only run plan and apply on the infrastructure that has been touched is another; updating your pipeline to plan and apply in dependency order is also valid; or contributing to improve plan-all output to fix issues like the ones you are seeing, plus a forward-propagating plan, would be killer!
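As a rough illustration of two of those options combined (only the touched infrastructure, in dependency order), here is a Python sketch using the standard library's graphlib (Python 3.9+). It is a hypothetical pipeline helper, not Terragrunt functionality: `affected_in_order` and the `deps` mapping (module name to the set of modules it depends on) are made up for this example.

```python
from graphlib import TopologicalSorter


def affected_in_order(deps, touched):
    """Return touched modules plus everything downstream, dependencies first.

    deps maps each module to the set of modules it depends on.
    touched is the set of modules whose code changed.
    """
    # Invert the edges so we can walk from a module to its dependents.
    dependents = {module: set() for module in deps}
    for module, upstream in deps.items():
        for dependency in upstream:
            dependents.setdefault(dependency, set()).add(module)

    # Collect the touched modules and everything that depends on them.
    affected = set()
    stack = list(touched)
    while stack:
        module = stack.pop()
        if module in affected:
            continue
        affected.add(module)
        stack.extend(dependents.get(module, ()))

    # static_order() yields every module after all of its dependencies.
    order = TopologicalSorter(deps).static_order()
    return [module for module in order if module in affected]
```

A pipeline could then run plan and apply one module at a time in the returned order, so that each plan is made after its upstream dependencies have been applied.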

ivasilyev-servicetitan-com commented 4 years ago

"but you need to understand the implications and drawbacks"

I completely agree with that. It is similar to the monolith vs. multi-service trade-off. With a monolith, it's easier to control dependencies: you just compile your project. But if you need some granularity and "dynamics", you pay for it with more complex dependencies that you have to control more precisely (like controlling your API contracts/endpoints). It's clear that if you drop a VNet in one TF folder, it will affect another TF folder whose resources rely on that VNet.

Using plan-all and apply-all on a day to day basis suggests that possibly those modules should be a single unit

That was an option for us as well. But we decided to keep many folders so as not to have a single giant tfstate file, which is, at the very least, scary to lose or corrupt.

For now, we'll keep using the *-all commands in our CICD pipeline. If we find issues in the workflow, I will try to share them.

Thanks for your input again, @yorinasub17

kschu91 commented 4 years ago

Hi folks,

Unfortunately, the merged fix #577 seems to have problems with relative directories. It uses skipVars := cmd == "apply" && util.IsFile(secondArg) to check whether vars need to be skipped. But when you specify the path to the plan file relatively, util.IsFile seems to fail, and terragrunt still passes the -var arguments, which leads to the above-mentioned issue again.

How to reproduce?

terraform {
  extra_arguments "defaults" {
    commands = get_terraform_commands_that_need_vars()
    arguments = [
      "-var-file=${get_terragrunt_dir()}/../defaults.tfvars"
    ]
  }
}

terragrunt plan --terragrunt-working-dir my_module/ -out plan.out
terragrunt apply --terragrunt-working-dir my_module/ plan.out

which results in: Error: Can't set variables when applying a saved plan

Temporary Solution

Specifying the plan.out file as an absolute path for the apply command works. E.g.:

terragrunt plan --terragrunt-working-dir my_module/ -out plan.out
terragrunt apply --terragrunt-working-dir my_module/ /my/path/to/plan.out

Expected behaviour

I expect the special treatment for apply with a plan to also work with a correct relative path.

marcoreni commented 4 years ago

Hey @kschu91 , this behavior had already been pointed out in https://github.com/gruntwork-io/terragrunt/issues/493#issuecomment-452086522 .

The issue may be resolved by running Abs on the last parameter in order to make it an absolute path no matter what. I'm just not sure whether that could cause some unintended behavior.
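A small Python sketch of the idea, for illustration only (the real code is Go). One caveat: a plain "make it absolute" call resolves against the process's current directory, which may not be where the plan file lives. So this sketch takes the base directory explicitly. Which directory is the right base is an assumption here; this thread mentions both the --terragrunt-working-dir directory and the .terragrunt-cache download dir. The function names are hypothetical.

```python
import os


def resolve_plan_path(plan_arg, base_dir):
    """Make plan_arg absolute, resolving relative paths against base_dir.

    base_dir is used instead of the process's current directory.
    """
    if os.path.isabs(plan_arg):
        return plan_arg
    return os.path.abspath(os.path.join(base_dir, plan_arg))


def is_plan_file(plan_arg, base_dir):
    """True if plan_arg, resolved against base_dir, is an existing file."""
    return os.path.isfile(resolve_plan_path(plan_arg, base_dir))
```

With this, a check for plan.out against my_module/ would succeed even when the command is started from the parent directory, as long as the file really is in that base directory.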

brikis98 commented 4 years ago

Using relative paths with Terragrunt is problematic, as the source parameter results in Terragrunt checking out code to a separate dir (in .terragrunt-cache) and switching to that as the working dir. We recommend using absolute paths when possible.
