gruntwork-io / terragrunt

Terragrunt is a flexible orchestration tool that allows Infrastructure as Code written in OpenTofu/Terraform to scale.
https://terragrunt.gruntwork.io/
MIT License

apply-all doesn't know about dependencies? #923

Open rgarrigue opened 4 years ago

rgarrigue commented 4 years ago

Hi

I've been wondering about this log

~/terraform-live/qa/aws/eu-west-1/db (master ✘)✹ ᐅ terragrunt apply-all
[terragrunt] 2019/10/22 14:03:28 Error processing module at '/home/me/terraform-live/qa/aws/eu-west-1/db/app1/terragrunt.hcl'. How this module was found: Terragrunt config file found in a subdirectory of /home/me/terraform-live/qa/aws/eu-west-1/db. Underlying error: /home/me/terraform-live/qa/aws/eu-west-1/db/app1/terragrunt.hcl:11,61-71: Unknown variable; There is no variable named "dependency"., and 3 other diagnostic(s)
[terragrunt] 2019/10/22 14:03:28 Unable to determine underlying exit code, so Terragrunt will exit with error code 1

~/terraform-live/qa/aws/eu-west-1/db (master ✘)✹ ᐅ cd app1 

~/terraform-live/qa/aws/eu-west-1/db/app1 (master ✘)✹ ᐅ terragrunt apply-all
[terragrunt] 2019/10/22 14:03:33 Error processing module at '/home/me/terraform-live/qa/aws/eu-west-1/db/app1/terragrunt.hcl'. How this module was found: Terragrunt config file found in a subdirectory of /home/me/terraform-live/qa/aws/eu-west-1/db/app1. Underlying error: /home/me/terraform-live/qa/aws/eu-west-1/db/app1/terragrunt.hcl:11,61-71: Unknown variable; There is no variable named "dependency"., and 3 other diagnostic(s)
[terragrunt] 2019/10/22 14:03:33 Unable to determine underlying exit code, so Terragrunt will exit with error code 1

~/terraform-live/qa/aws/eu-west-1/db/app1 (master ✘)✹ ᐅ terragrunt apply -auto-approve
[terragrunt] 2019/10/22 14:03:40 Reading Terragrunt config file at /home/me/terraform-live/qa/aws/eu-west-1/db/app1/terragrunt.hcl
[terragrunt] [/home/me/terraform-live/qa/aws/eu-west-1/bastion] 2019/10/22 14:03:40 Reading Terragrunt config file at /home/me/terraform-live/qa/aws/eu-west-1/bastion/terragrunt.hcl
[terragrunt] [/home/me/terraform-live/qa/aws/eu-west-1/key_pair] 2019/10/22 14:03:40 Reading Terragrunt config file at /home/me/terraform-live/qa/aws/eu-west-1/key_pair/terragrunt.hcl
[terragrunt] [/home/me/terraform-live/qa/aws/eu-west-1/key_pair] 2019/10/22 14:03:40 Running command: terraform --version
...

The position 11,61-71 mentioned above is the first dependency reference in the execute line here:

include {
  path = find_in_parent_folders()
}

terraform {
  source = "git::git@github.com:org/repo.git//."

  before_hook "open_tunnel_through_bastion" {
    commands = ["plan", "apply", "show", "destroy"]
    execute  = ["screen", "-d", "-m", "ssh", "-L", "11111:${dependency.instance.outputs.this_db_instance_address}:${dependency.instance.outputs.this_db_instance_port}", dependency.bastion.outputs.hostname, "-p", "11111", "sleep", "60"]
  }
}

dependency "bastion" {
  config_path = "../../../bastion/"
  mock_outputs = {
    hostname = "localhost"
  }
}

dependency "instance" {
  config_path = "../../instance/"
  mock_outputs = {
    this_db_instance_address  = "localhost"
    this_db_instance_port     = 11111
    this_db_instance_username = "mockup_user"
  }
}

inputs = {
  service = "REDACTED"

  host = "localhost"
  port = "11111"

  postgres_user = dependency.instance.outputs.this_db_instance_username

  db_name       = "REDACTED"
  db_password   = "REDACTED"
  db_extensions = ["uuid-ossp", "hstore"]
}

So it seems apply-all doesn't know about dependencies, and I think I saw the same glitch with destroy-all. Is there a workaround or a good practice to avoid this pitfall?

Best regards,

yorinasub17 commented 4 years ago

The main issue here relates to the way dependency is used. In order to come up with the dependency graph, terragrunt needs to be able to parse the terraform block in full. However, in your config, it can't parse the terraform block until it has parsed the dependency block, which requires fetching the output, and that fails.

You can avoid this by using mock_outputs on the dependency block so that it will fall back to the mock outputs when the dependency hasn't been applied yet. This works because terragrunt will actually reparse the full config before running the apply when it gets there in the dependency tree, so when it parses the config the second time, it will have the actual dependency outputs instead of the mock outputs.
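
A minimal sketch of that suggestion, assuming a hypothetical sibling module at ../vpc that exports vpc_id (neither the path nor the output name comes from this thread). The mock_outputs_allowed_terraform_commands attribute is optional; it restricts the commands for which the mocks may be substituted, so that an apply without real outputs fails instead of silently using a placeholder:

```hcl
# Hypothetical dependency on a sibling module; path and output name are illustrative.
dependency "vpc" {
  config_path = "../vpc"

  # Placeholder values returned while the dependency has no outputs yet.
  mock_outputs = {
    vpc_id = "vpc-00000000"
  }

  # Optional: only fall back to the mocks for these commands.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  # Resolves to the real output once the dependency has been applied.
  vpc_id = dependency.vpc.outputs.vpc_id
}
```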

rgarrigue commented 4 years ago

@yorinasub17 I already have mock outputs. Sorry for the lack of detail; I have updated my original post with the whole terragrunt.hcl.

yorinasub17 commented 4 years ago

Thanks for sharing the full config; now I see what is going on. This issue is caused by the parsing logic, where we deliberately avoid retrieving outputs from dependency blocks in the first pass that builds the graph. This means that the dependency references all fail because we don't pass in any fillers to the parser.

I am not 100% sure what the best fix would be here yet. This requires a little bit of thought.

On one hand, we could attempt to read the outputs normally, but then that would fail in the case when apply-all is run for the first time.

On the other hand, we can run in skip_outputs mode where we return the mock_outputs if available. But that means you can't have real dependency outputs ever in the first pass. And it can get confusing.

Yet another potential solution is to change the parsing logic for the terraform block during the first pass so that it only gets the source property, which is the only thing it needs in the dependency graph building routine. It is very unlikely that someone uses dependency outputs there, so that would make sense.

Right now I am leaning towards the skip_outputs solution because that is the easiest to implement and maintain, but I think the partial parsing of terraform blocks is what I ultimately want.
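
In the meantime, one possible way to sidestep the first-pass failure is to keep dependency references out of the terraform block entirely. The sketch below assumes a hypothetical wrapper script, open_tunnel.sh, that looks up the bastion and database outputs itself at run time (for example by running terragrunt output in each dependency directory). The script and its arguments are illustrative and are not part of Terragrunt or of the original config:

```hcl
terraform {
  source = "git::git@github.com:org/repo.git//."

  before_hook "open_tunnel_through_bastion" {
    commands = ["plan", "apply", "show", "destroy"]

    # No dependency.* references here, so the block should parse in the first pass.
    # open_tunnel.sh is a hypothetical script that resolves the outputs it needs
    # from the two dependency directories passed as arguments.
    execute = [
      "${get_terragrunt_dir()}/open_tunnel.sh",
      "${get_terragrunt_dir()}/../../../bastion",
      "${get_terragrunt_dir()}/../../instance",
    ]
  }
}
```

The trade-off is that the tunnel logic moves out of the config and into a script that has to be maintained alongside it.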

PauloCabral-internations commented 4 years ago

I have the same problem. Any solution?

yorinasub17 commented 4 years ago

Sorry, unfortunately we don't have a solution for this yet.

ArchiFleKs commented 3 years ago

Hi, I have the same kind of issue here, when doing a run-all apply. @yorinasub17 you said:

This works because terragrunt will actually reparse the full config before running the apply when it gets there in the dependency tree, so when it parses the config the second time, it will have the actual dependency outputs instead of the mock outputs.

Apparently Terragrunt does not try to re-read the output once the first apply has been done. My code is available here

│ Error: Error creating Security Group: InvalidVpcID.NotFound: The vpc ID 'vpc-00000000' does not exist             
│       status code: 400, request id: 06bed91f-8d21-4a03-bbc8-b114e77856f0                                                                                                                                                               
│                                                                                                                   
│   with aws_security_group.cluster[0],                                                                             
│   on cluster.tf line 55, in resource "aws_security_group" "cluster":                                                                                                                                                                   
│   55: resource "aws_security_group" "cluster" {    

When trying to provision an EKS cluster after the VPC module, it seems to use my mock output (vpc-00000000). I'll retry with explicit dependencies, maybe I missed something.

I'm running Terragrunt 0.31.4 and Terraform 1.0.4

ArchiFleKs commented 3 years ago

I just retried from scratch based on the repo above.

It seems EKS starts before the VPC is finished:

aws_iam_role.cluster[0]: Creating...                                                                                                                                                                                                     
aws_security_group.workers[0]: Creating...                                                                                                                                                                                               
aws_security_group.cluster[0]: Creating...                                                                                                                                                                                               
aws_iam_policy.cluster_elb_sl_role_creation[0]: Creating...                                                                                                                                                                              
aws_cloudwatch_log_group.this[0]: Creation complete after 1s [id=/aws/eks/pio-teks-tg-production-arm/cluster]                                                                                                                            
aws_iam_policy.cluster_elb_sl_role_creation[0]: Creation complete after 2s [id=arn:aws:iam::161285725140:policy/pio-teks-tg-production-arm-elb-sl-role-creation20210815194818341400000004]                                               
aws_iam_role.cluster[0]: Creation complete after 2s [id=pio-teks-tg-production-arm20210815194818332100000001]                                                                                                                            
aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy[0]: Creating...                                                                                                                                                            
aws_iam_role_policy_attachment.cluster_AmazonEKSVPCResourceControllerPolicy[0]: Creating...                                                                                                                                              
aws_iam_role_policy_attachment.cluster_AmazonEKSServicePolicy[0]: Creating...                                                                                                                                                            
aws_iam_role_policy_attachment.cluster_elb_sl_role_creation[0]: Creating...                                                                                                                                                              
aws_iam_role_policy_attachment.cluster_AmazonEKSServicePolicy[0]: Creation complete after 1s [id=pio-teks-tg-production-arm20210815194818332100000001-20210815194820539500000005]                                                        
aws_iam_role_policy_attachment.cluster_AmazonEKSVPCResourceControllerPolicy[0]: Creation complete after 1s [id=pio-teks-tg-production-arm20210815194818332100000001-20210815194820543600000007]                                          
aws_iam_role_policy_attachment.cluster_elb_sl_role_creation[0]: Creation complete after 1s [id=pio-teks-tg-production-arm20210815194818332100000001-20210815194820541100000006]                                                          
aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy[0]: Creation complete after 1s [id=pio-teks-tg-production-arm20210815194818332100000001-20210815194820553700000008]                                                        
╷                                                                                                                                                                                                                                        
│ Error: Error creating Security Group: InvalidVpcID.NotFound: The vpc ID 'vpc-00000000' does not exist                                                                                                                                  
│       status code: 400, request id: f0295061-f9d1-4f08-8da7-49cb41207a31                                                                                                                                                               
│                                                                                                                                                                                                                                        
│   with aws_security_group.cluster[0],                                                                                                                                                                                                  
│   on cluster.tf line 55, in resource "aws_security_group" "cluster":                                                                                                                                                                   
│   55: resource "aws_security_group" "cluster" {                                                                                                                                                                                        
│                                                                                                                                                                                                                                        
╵                                                                                                                                                                                                                                        
╷                                                                                                                                                                                                                                        
│ Error: Error creating Security Group: InvalidVpcID.NotFound: The vpc ID 'vpc-00000000' does not exist                                                                                                                                  
│       status code: 400, request id: 1f34bf9e-8950-4d51-9a4d-04bff22c6ca9                                                                                                                                                               
│                                                                                                                                                                                                                                        
│   with aws_security_group.workers[0],                                                                                                                                                                                                  
│   on workers.tf line 360, in resource "aws_security_group" "workers":                                                                                                                                                                  
│  360: resource "aws_security_group" "workers" {                                                                                                                                                                                        
│                                                          

This happens even though the dependencies are declared both in dependency blocks (with mock outputs) and in an explicit dependencies block:

dependencies {
  paths = ["../vpc", "../encryption-config"]
}

bd-spl commented 2 years ago

It seems EKS starts before the VPC is finished

Any updates on this? I'm trying it with tEKS and hitting the same issue. I tried to mitigate it by adding a similar dependencies block to the EKS module's terragrunt.hcl:

dependencies {
  paths = ["../encryption-config", "../vpc", "../vpc-endpoints"]
}

Setting mock_outputs_merge_strategy_with_state = "no_merge", without skipping outputs, makes the race disappear!

But it might fail when running Terragrunt commands from scratch. Perhaps skipping outputs should be configurable, with Terragrunt being smart about it and rerunning with adjusted skip values as a fallback?

For my tEKS, the following patch (just as an example) allowed me to bypass that EKS to VPC race condition:

diff --git a/terragrunt/dependency-blocks/eks.hcl b/terragrunt/dependency-blocks/eks.hcl
index 42137a5..221f69b 100644
--- a/terragrunt/dependency-blocks/eks.hcl
+++ b/terragrunt/dependency-blocks/eks.hcl
@@ -1,4 +1,4 @@
-skip = true
+#skip = true

 dependency "eks" {
   config_path = "${get_original_terragrunt_dir()}/../eks"
@@ -9,6 +9,6 @@ dependency "eks" {
     node_groups             = {}
     aws_auth_configmap_yaml = yamlencode("")
   }
-  mock_outputs_merge_strategy_with_state = true
-  skip_outputs                           = true
+  mock_outputs_merge_strategy_with_state = "no_merge"
+  #skip_outputs                           = true
 }
diff --git a/terragrunt/dependency-blocks/encryption-config.hcl b/terragrunt/dependency-blocks/encryption-config.hcl
index 3fe1080..b475d1c 100644
--- a/terragrunt/dependency-blocks/encryption-config.hcl
+++ b/terragrunt/dependency-blocks/encryption-config.hcl
@@ -1,4 +1,4 @@
-skip = true
+#skip = true

 dependency "encryption_config" {
   config_path = "${get_original_terragrunt_dir()}/../encryption-config"
@@ -6,6 +6,6 @@ dependency "encryption_config" {
   mock_outputs = {
     arn = "arn:aws:kms:xx-mock-0:123456789012:key/12345678-1234-1234-1234-123456789012"
   }
-  mock_outputs_merge_strategy_with_state = true
-  skip_outputs                           = true
+  mock_outputs_merge_strategy_with_state = "no_merge"
+  #skip_outputs                           = true
 }
diff --git a/terragrunt/dependency-blocks/vpc.hcl b/terragrunt/dependency-blocks/vpc.hcl
index edce7b5..ff4bbdc 100644
--- a/terragrunt/dependency-blocks/vpc.hcl
+++ b/terragrunt/dependency-blocks/vpc.hcl
@@ -1,4 +1,4 @@
-skip = true
+#skip = true

 dependency "vpc" {
   config_path = "${get_original_terragrunt_dir()}/../vpc"
@@ -25,6 +25,6 @@ dependency "vpc" {
     public_route_table_ids    = []
     default_security_group_id = "sg-deadb3af0"
   }
-  mock_outputs_merge_strategy_with_state = true
-  skip_outputs                           = true
+  mock_outputs_merge_strategy_with_state = "no_merge"
+  #skip_outputs                           = true
 }
diff --git a/terragrunt/live/dev/eu-west-1/clusters/demo/eks/terragrunt.hcl b/terragrunt/live/dev/eu-west-1/clusters/demo/eks/terragrunt.hcl
index df404be..2f0fda1 100644
--- a/terragrunt/live/dev/eu-west-1/clusters/demo/eks/terragrunt.hcl
+++ b/terragrunt/live/dev/eu-west-1/clusters/demo/eks/terragrunt.hcl
@@ -1,3 +1,7 @@
+dependencies {
+  paths = ["../encryption-config", "../vpc", "../vpc-endpoints"]
+}
+
 include "root" {
   path           = find_in_parent_folders()
   expose         = true
diff --git a/terragrunt/live/dev/eu-west-1/clusters/demo/vpc-endpoints/terragrunt.hcl b/terragrunt/live/dev/eu-west-1/clusters/demo/vpc-endpoints/terragrunt.hcl
index 377b8fe..c62a65f 100644
--- a/terragrunt/live/dev/eu-west-1/clusters/demo/vpc-endpoints/terragrunt.hcl
+++ b/terragrunt/live/dev/eu-west-1/clusters/demo/vpc-endpoints/terragrunt.hcl
@@ -1,3 +1,7 @@
+dependencies {
+  paths = ["../vpc"]
+}
+
 include "root" {
   path           = find_in_parent_folders()
   expose         = true
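
Applied to encryption-config.hcl, the patch above leaves a dependency block along these lines. This is reconstructed from the two hunks shown, assuming nothing else sits between them; the comment on no_merge reflects how the setting is documented rather than anything stated in this thread:

```hcl
#skip = true

dependency "encryption_config" {
  config_path = "${get_original_terragrunt_dir()}/../encryption-config"

  mock_outputs = {
    arn = "arn:aws:kms:xx-mock-0:123456789012:key/12345678-1234-1234-1234-123456789012"
  }

  # With no_merge, existing state outputs are used as-is; the mocks are
  # returned only when the dependency has no state yet.
  mock_outputs_merge_strategy_with_state = "no_merge"
  #skip_outputs                           = true
}
```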

ArchiFleKs commented 2 years ago

@bd-spl I haven't tested in a while, but I'm pretty sure the latest tEKS examples were working with apply-all the last time I checked.

gercograndia commented 1 year ago

Hi guys, I already commented on the related merge, but I guess this is a better place to do it.

@yorinasub17 you mentioned the following:

Yet another potential solution is to change the parsing logic for the terraform block during the first pass so that it only gets the source property, which is the only thing it needs in the dependency graph building routine. It is very unlikely that someone uses dependency outputs there, so that would make sense.

However, I was trying to use a dependency output in the source, just to be able to define the version of the module I am using in one central place and refer to it via this dependency. The alternative would be for each terragrunt.hcl to define locals, read the shared config, and determine the version that way, which means much more code in each config file.
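
For reference, the locals-based alternative looks roughly like the sketch below. The file name versions.hcl, the local app_module_version, and the tag value are all hypothetical, and the repository URL is the placeholder used earlier in this thread. First, the shared file in a parent folder:

```hcl
# versions.hcl (hypothetical shared file in a parent folder)
locals {
  app_module_version = "v1.2.3" # illustrative tag
}
```

Then each module's terragrunt.hcl reads it and pins the source:

```hcl
# terragrunt.hcl of each module
locals {
  # Load the shared file; its locals are exposed under .locals
  versions = read_terragrunt_config(find_in_parent_folders("versions.hcl"))
}

terraform {
  source = "git::git@github.com:org/repo.git//.?ref=${local.versions.locals.app_module_version}"
}
```

This works without dependency outputs because locals are evaluated before the terraform block, but it does repeat the locals block in every config file.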

My question is: is there still a hard limitation that prevents allowing this in the source as well, or is this realistically not feasible?

Thanks for your help guys!

claudiosf commented 8 months ago

This is still an issue: I am unable to retrieve the output of a module when running "run-all apply".