hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Support for dynamic blocks and meta-arguments #24188

Open ausfestivus opened 4 years ago

ausfestivus commented 4 years ago

Afternoon,

Feature request: allow the dynamic blocks capability to work with resource meta-arguments.

Current Terraform Version

0.12.20+

Use-cases

The use case I'm trying to implement is a simple one. I would like to add a lifecycle meta-argument to a resource when our var.ENVIRONMENT == "prod", i.e., stop the pipeline from destroying prod resources.

Attempted Solutions

  # Here we set a lifecycle block to include `prevent_destroy=true` when `var.ENVIRONMENT=prod`.
  dynamic "lifecycle" {
    for_each = var.ENVIRONMENT == "prod" ? [{}] : []

    content {
      prevent_destroy = true
    }
  }

Result of the above is:

Error: Unsupported block type

  on main.tf line 25, in resource "azurerm_resource_group" "rgnamegoeshere":
  25:   dynamic "lifecycle" {

Blocks of type "lifecycle" are not expected here.

Proposal

Support meta-arguments for use with dynamic blocks. I'm sure it's really easy to do. jk.

References

Similar request in Terraform Core discussion: https://discuss.hashicorp.com/t/dynamic-lifecycle-ignore-changes/4579/4

tomasbackman commented 4 years ago

Another use case would be if we sometimes want to ignore a field, like master_password or similar.

Another way of solving both these use cases (I guess?) could be to allow variables in lifecycle blocks, something like:

locals {
  destroy = var.ENVIRONMENT == "prod"
}

lifecycle {
  ignore_changes  = var.list_with_changes_to_ignore
  prevent_destroy = local.destroy
}

It would be very useful in any case.

janosmiko commented 4 years ago

+1 for any of these. It would be really useful if we could manipulate lifecycle rules via variables or dynamic blocks.

tomaszsek commented 4 years ago

+1 for this enhancement. In my case I want to support two different major provider versions. In the old one there is a field which is required, but in the newest it doesn't exist.

davi5e commented 4 years ago

In my case, I'm creating a GKE module that may or may not use a release channel, so in one scenario I need to ignore both min_master_version and node_version, while when release_channel == "UNSPECIFIED" I do not want to ignore them...

It would look something like this:

data "google_container_engine_versions" "location" {
  location       = "southamerica-east1"
  project        = "leviatan-prod"
  version_prefix = var.kubernetes_channel != "UNSPECIFIED" ? var.kubernetes_version : ""
}

resource "google_container_cluster" "cluster" {
  provider = google-beta

  # [...]

  min_master_version = var.kubernetes_channel != "UNSPECIFIED" ? var.kubernetes_version : data.google_container_engine_versions.location.latest_master_version
  node_version       = var.kubernetes_channel != "UNSPECIFIED" ? var.kubernetes_version : data.google_container_engine_versions.location.latest_master_version

  release_channel {
    channel = var.kubernetes_channel
  }

  lifecycle {
    ignore_changes = [
      var.kubernetes_channel != "UNSPECIFIED" ? min_master_version : null,
      var.kubernetes_channel != "UNSPECIFIED" ? node_version : null,
    ]
  }
}

By the way, it seems that using both the channel and the image versions yields some computed nulls in the resource's code, but this is neither a problem (as long as the current channel versions are used) nor part of the scope of this discussion...

Anyhow, these are the errors:

Error: Invalid expression

  on main.tf line XXX, in resource "google_container_cluster" "cluster":
 XXX:       var.kubernetes_channel != "UNSPECIFIED" ? min_master_version : null,

A single static variable reference is required: only attribute access and
indexing with constant keys. No calculations, function calls, template
expressions, etc are allowed here.

Error: Invalid expression

  on main.tf line YYY, in resource "google_container_cluster" "cluster":
 YYY:       var.kubernetes_channel != "UNSPECIFIED" ? node_version : null,

A single static variable reference is required: only attribute access and
indexing with constant keys. No calculations, function calls, template
expressions, etc are allowed here.

timorkal commented 4 years ago

+1

thorvats commented 4 years ago

+1

jwlogemann commented 4 years ago

+1

jkrivas commented 4 years ago

+1

stencore-repo commented 4 years ago

+1

ArtemTrofimushkin commented 4 years ago

+1

alex-sitnikov commented 4 years ago

Any chance this will be implemented? We want to introduce something like a blame step in CI/CD to re-tag only changed resources with various info from the build, and the boilerplate that needs to be included in each and every resource's lifecycle.ignore_changes is obnoxious.

Nuru commented 4 years ago

@danieldreier It would be a significant added benefit even if the block were limited to evaluating expressions that are not dependent on any state or resources, such as directly set variables and functions of them. This would still allow the expression to be evaluated very early in the processing, but at the same time allow option flags.

Note to other people reading this: please do not add "+1" comments. Instead, click on the thumbs up icon at the bottom of the issue statement.

tmccombs commented 4 years ago

Another use case is I have a module for a lambda function. Most of the time, I want to ignore changes to the actual code of the lambda, because that is managed outside of terraform. But in a few cases, terraform should manage the code as well, so I don't want to ignore changes.

I also tried doing something like:

  dynamic "lifecycle" {
    for_each = var.manage_code ? [] : [1]

    content {
      ignore_changes = [
        filename,
        runtime,
        handler,
        source_code_hash,
      ]
    }
  }

But then I get an error that Blocks of type "lifecycle" are not expected here. And of course modules don't support lifecycle blocks either....

The only way I can find to do this is to repeat all of the configuration, with the only change being the lifecycle, as sketched below.
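
For illustration, a minimal sketch of that duplication under the same var.manage_code flag (all other arguments here are made up): two copies of the resource toggled by count, differing only in the lifecycle block.

resource "aws_lambda_function" "managed" {
  count = var.manage_code ? 1 : 0

  function_name    = var.function_name
  role             = var.role_arn
  runtime          = "python3.9"
  handler          = "index.handler"
  filename         = var.filename
  source_code_hash = filebase64sha256(var.filename)
}

resource "aws_lambda_function" "unmanaged" {
  count = var.manage_code ? 0 : 1

  # Identical configuration repeated verbatim...
  function_name    = var.function_name
  role             = var.role_arn
  runtime          = "python3.9"
  handler          = "index.handler"
  filename         = var.filename
  source_code_hash = filebase64sha256(var.filename)

  # ...except for this block, which is the only difference.
  lifecycle {
    ignore_changes = [filename, runtime, handler, source_code_hash]
  }
}

Every reference elsewhere then needs an index or a one(concat(...)) expression to pick whichever copy exists.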

Skyfallz commented 4 years ago

Hi,

Seems like this is just the fresh version of this issue (either dynamic blocks or variable interpolation would do the trick for most of us, I guess).

@apparentlymart The lack of this functionality is a real problem: we can't easily secure our production resources (we obviously do not want some of them to be destroyed) while keeping flexibility for our non-production environments (if I tell Terraform not to delete anything, how is the CI/CD supposed to clean up old environments?). This is a real production issue, definitely not an improvement or feature request. It makes the whole lifecycle system flawed, and it should be considered a bug in it. I just can't understand why HashiCorp can't even give us a clear answer on this. It's been 5 years.

Terraform should definitely allow this, and we need to know when this could land.

And PLEASE people, do not add "+1" and noise on this issue. That's what closed the previous ticket, and that's why we never got any response.

Dmitry1987 commented 4 years ago

@Skyfallz some feature requests might never be implemented; we have to accept that and move on with the workarounds, I believe. One suggestion for everyone who struggles with this: put an easy-to-sed placeholder at that location (like LCYCLE_REPLACEME_LIST) and run a 's/find/replace/g' every time before running Terraform (if it's a CI/CD pipeline and the modified .tf files get discarded in that build job anyway). A sketch follows below.
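
To make that concrete, a sketch of the placeholder idea (the marker name and the resource are made up, and the sed line is something a CI step would run, not anything Terraform itself understands):

# main.tf as committed: the marker sits where the real arguments will go.
resource "aws_s3_bucket" "example" {
  bucket = "example-bucket"

  lifecycle {
    # LCYCLE_REPLACEME_LIST
  }
}

# Before terraform plan/apply, the CI job rewrites the marker in place:
#   sed -i 's|# LCYCLE_REPLACEME_LIST|ignore_changes = [tags]|' main.tf
# and the modified file is discarded when the job ends.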

tmccombs commented 4 years ago

@Dmitry1987, besides the fact that that is an incredibly awkward workaround, it doesn't solve the problem if the lifecycle is in a module that is used multiple times in the same workspace with different lifecycle requirements. The only workarounds I know of are to duplicate all of the config, give up on HCL and use some other tool to generate Terraform JSON files (which I would probably have to build myself, since I don't know of a well-established tool for this), or use something other than Terraform altogether.

Skyfallz commented 4 years ago

@Dmitry1987 @tmccombs this workaround is not that awkward (no more awkward than the fact that we can't do it natively in TF anyway), especially if you only do this in your 'destroy' step in CI/CD (this way lifecycle blocks are still present on apply). But for sure, this is not pretty, and we should definitely have a clean solution instead. I'm working on a Terraform wrapper to handle this use case (and some others, like https://github.com/hashicorp/terraform/issues/17599), I'll share it when it's done.

tmccombs commented 4 years ago

@Skyfallz how would that workaround work for the ignore_changes example I gave above?

alex-sitnikov commented 4 years ago

@tmccombs we are currently investigating the use of Pulumi instead of Terraform, since it seems not to have these awkward issues and is much more succinct in terms of representation. Basically, instead of writing wrappers around Terraform, you can use JS/C#/Python to describe your infra.

Dmitry1987 commented 4 years ago

I won't argue if it seems awkward to some :) but that's one possible way to do it that I can think of (rendering all TF in JSON might be better or worse, depending on the size of the infra and how frequently changes are made). Never saw Pulumi, thanks @ReVolly, will check this out.

Dmitry1987 commented 4 years ago

Oh well, the first Pulumi example reminds me of using the vanilla SDK of a cloud provider, so it's probably a better comparison vs. SDKs and not so much vs. Terraform (which is easier to use than an SDK because it's declarative and keeps state).

import pulumi
from pulumi_azure import core, storage

# Create an Azure Resource Group
resource_group = core.ResourceGroup("resource_group")

# Create an Azure resource (Storage Account)
account = storage.Account("storage",
    resource_group_name=resource_group.name,
    account_tier='Standard',
    account_replication_type='LRS',
    tags={"Environment": "Dev"})

# Export the connection string for the storage account
pulumi.export('connection_string', account.primary_connection_string)

I wonder what a large infra looks like; probably similar to raw SDK code (like boto3 in Python, if someone does AWS in boto).

alex-sitnikov commented 4 years ago

@Dmitry1987 Pulumi really feels like a next-level Terraform (they even use its modules and have a utility to convert tf => pulumi). It keeps state as well, and yeah, it's a bit deceptive: it seems like Pulumi is defining actions to be performed rather than describing desired state, but that is actually not the case. It does the same thing as tf, but in an imperative way; underneath it all it's very similar to the Terraform concept, which is to build a resource graph and then create it via the underlying provider. You can actually take a look at the comparison to tf at https://www.pulumi.com/terraform/.

Dmitry1987 commented 4 years ago

Sounds good, thanks for sharing :)

janschumann commented 4 years ago

@Dmitry1987 @ReVolly To say that Pulumi is next level to Terraform is just wrong by definition. As the Pulumi comparison page says: Terraform is declarative (HCL) and Pulumi is programmatic ("any language"). So these two approaches are completely different and therefore cannot be compared at all (just like apples and pears).

That said, you probably could compare Terraform with Puppet, and Pulumi with Chef, all of which I have used in various projects. And my experience with the programmatic approach is that the resulting code needs much more maintenance in the long run, as code evolves. Especially in the DevOps age, where all the developers care for infrastructure as well. So what Pulumi and the like promote as an advantage (being able to do everything you want) quickly turns into a maintenance nightmare.

What I often perceived, when I found myself stuck using the declarative approach - saying "I would like to code this thing" - was that there was a flaw in the overall architecture that I created. So that's the maintenance effort in the declarative world: to keep the architecture up to date, which means constantly improving it while keeping the code readable to everyone!

sithmal commented 3 years ago

+1

aditki commented 3 years ago

Another use case is the load balancer target group changes that CodeDeploy makes, which we usually ignore. Having this support would let us ignore only those, without also ignoring changes made to the load balancer itself or to the target groups (see the sketch below).
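
What that tends to look like today, sketched with made-up names (aws_lb.app, aws_lb_target_group.blue, and var.certificate_arn are assumed to be defined elsewhere):

resource "aws_lb_listener" "app" {
  load_balancer_arn = aws_lb.app.arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.blue.arn
  }

  lifecycle {
    # Ignore the whole default_action because CodeDeploy swaps target
    # groups during blue/green deployments. The feature requested here
    # would let this be conditional instead of always-on.
    ignore_changes = [default_action]
  }
}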

BastienLouvel commented 3 years ago

+1

alvaro-fdez commented 3 years ago

+1

alvarolinarescabre commented 3 years ago

+1

thetoolsmith commented 3 years ago

@Skyfallz some feature requests might never be implemented; we have to accept that and move on with the workarounds, I believe. One suggestion for everyone who struggles with this: put an easy-to-sed placeholder at that location (like LCYCLE_REPLACEME_LIST) and run a 's/find/replace/g' every time before running Terraform (if it's a CI/CD pipeline and the modified .tf files get discarded in that build job anyway).

We used this method as well to work around module configuration limitations in the azurerm frontdoor resource. The sed replacement is basically used as on-the-fly templating, to generate the file based on user vars when the underlying tech doesn't allow this flexibility.

drewsb commented 3 years ago

Any update on this? Being able to specify a variable in a lifecycle block is an extremely useful and common use case for me. Example: lifecycle { ignore_changes = var.ignore_changes }

abihf commented 3 years ago

My suggestion: instead of a dynamic ignore_changes, what if we could use the previous value in the field, e.g.:

resource "some" "thing" {
  maybe_prev = previous("default value if previous value not exist")
}

So we can dynamically ignore changes like this:

resource "some" "thing" {
  maybe_prev = var.force_new_value ? var.new_value : previous(var.new_value)
}

paolostancato commented 3 years ago

+1

simonireilly commented 3 years ago

As pointed out by @medmolryan, this actually doesn't work and could be potentially dangerous for the non-prod environment.

Original comment (caution 🚨):

I stumbled on an easy way to get this working, so sharing it for everyone's benefit. This will enable you to have dynamic lifecycles, but it will mean you have to use indexing when referring to the resources.

// dynamo_db_override.tf
resource "aws_dynamodb_table" "table_name" {
  count = var.environment_name == "prod" ? 1 : 0

  lifecycle {
    prevent_destroy = true
  }
}

// dynamo_db.tf
resource "aws_dynamodb_table" "table_name" {
  count        = 1
  name         = "${var.environment_name}-table-name"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "hash_key"

  lifecycle {
    ignore_changes = [read_capacity, write_capacity]
  }
}

The overrides will only apply when the condition evaluates to count = 1. Then the overrides apply to only the defined blocks and are merged as per: https://www.terraform.io/docs/language/files/override.html

medmolryan commented 3 years ago

It seems the above doesn't work on 0.14.5, because the count argument (0) in the override file takes precedence over the count in the main file (1), so the planned count is 0 and the resource will be destroyed.

damon-atkins commented 3 years ago

I am surprised you cannot do:

locals {
  something = {
    prevent_destroy = true
  }
}

dynamic "lifecycle" {
  condition = can(var.ENVIRONMENT)
  content   = local.something # as long as the map is what the resource expects in this block
}

And also

resource "abc" "xyz" {
     somesetting = var.ENVIRONMENT
}

As

resource "abc" "xyz" {
    dynamic "somesetting" {
      condition = can(var.ENVIRONMENT)
      content = var.ENVIRONMENT
}
livingstonjames45 commented 3 years ago

I was hoping to use this in setting up multiple clusters via for_each in a module (see the sketch below). I'd have the database parameters defined in a map in my tfvars file and just iterate through. Unfortunately, not being able to use configuration for ignore_changes (and prevent_destroy too) means that if I ever need to recreate a specific cluster from a snapshot, I can't do it just by changing a few config parameters. I suppose I could comment out the configuration for the cluster in question and redefine it statically.
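
The pattern being described, sketched with made-up names: the per-cluster parameters come from a map in tfvars and fan out through for_each; the one thing that cannot vary per cluster is anything inside the module's lifecycle blocks.

variable "clusters" {
  type = map(object({
    instance_class = string
    engine_version = string
  }))
}

module "db_cluster" {
  source   = "./modules/db-cluster"
  for_each = var.clusters

  name           = each.key
  instance_class = each.value.instance_class
  engine_version = each.value.engine_version
}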

ImIOImI commented 3 years ago

I came here looking for a way to dynamically ignore deployed databases like RDS instances. I would love to be able to deploy an instance, then switch a feature flag to ignore future changes.

Dmitry1987 commented 3 years ago

I came here looking for a way to dynamically ignore deployed databases like RDS instances. I would love to be able to deploy an instance, then switch a feature flag to ignore future changes.

you can use "lifecycle_rule" feature to ignore changes, and also may decide to remove an object from terraform state after it was created, to make terraform completely ignore that resource from now on, but you'll also need to comment out the section which creates it, i think, in this case.

damon-atkins commented 3 years ago

I have opened a new feature request for the item I mentioned above, an alternative to count and for_each: #29314 (Meta-Argument condition or include_if & content). Please give the top description a thumbs up.

schollii commented 2 years ago

My use case is that I have an EC2 instance defined in Terraform that I want to create EITHER from a Windows Server AMI determined via a data source, OR from an AMI ID chosen by the user. In the latter case, any changes to the ami attribute of the aws_instance should be honored; but if the Windows Server AMI has changed since the last apply, just ignore it (and have an output that flags the AMI as upgradable because there is a new Windows Server AMI that matches the data source query). So I wanted a lifecycle block with ignore_changes = var.ami_id == null ? [ami] : [], but that is not allowed (a sketch of the rest of the setup follows below).
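
For context, the rest of that setup is expressible today (names made up); only the conditional ignore_changes line is rejected:

data "aws_ami" "windows" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["Windows_Server-*-English-Full-Base-*"]
  }
}

resource "aws_instance" "this" {
  # A user-chosen AMI wins; otherwise fall back to the data source.
  ami           = var.ami_id != null ? var.ami_id : data.aws_ami.windows.id
  instance_type = "t3.large"

  # Not allowed today; this is the line the comment above asks for:
  # lifecycle {
  #   ignore_changes = var.ami_id == null ? [ami] : []
  # }
}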

schollii commented 2 years ago

@janschumann

in my experience with the programmatic approach is that the resulting code needs much more maintenance in the long run, as code evolves. Especially in the DevOps age, where all the developers care for infrastructure as well. So what Pulumi and those promote as an advantage (to be able to everything you want), quickly turns to a maintenance nightmare.

Problem is, the Terraform HCL2 language has evolved into something almost Turing-complete! And debugging it is harder than debugging Python or Go due to the lack of a debugger (e.g. having to go into the terraform console is a pain). If you have to convert lists of lists to maps of lists, etc., those are a pain to write and debug, more so than in Python. As long as one keeps the Terraform HCL2 code simple, all is good; the "declarative" nature of HCL2 dominates.

But you can no longer say that HCL2 is purely declarative just because it doesn't allow you to define functions. Just because HCL does not use a traditional "for" loop but rather a list or map comprehension does not make it declarative; you are still telling HCL exactly how to create one structure from another. If instead I could give it two structures and call convert(some_object, structure, new_structure) and it would do it, yeah, that would be declarative.

Procedural in the most generic sense means that you have to tell the language interpreter the sequence of steps, and there definitely is that in HCL. E.g. for the conversion task mentioned above, often you have to break it down or it is too difficult to reason about. So you break it down into several steps, assigning each one to a local, with each subsequent local building on the previous ones.

This is procedural, although it is couched as declarative. In Python you will have to first define variable "a", then "b" which depends on "a", then "c" which depends on "b", whereas in HCL you could write the definition for c first, then b, then a, and you know the HCL dependency engine will figure out that it must first evaluate a, then b, then c. But really, who would write it like that? Ultimately people have to read the code, and in the Western world we read from top to bottom, so it only makes sense to define a, then b, then c (see the toy example below).
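
A toy example of that step-by-step locals chain (the data shape is made up): turning a list of subnet objects into a flat list of CIDRs for zones that have more than one subnet.

locals {
  # a: group subnet CIDRs by availability zone
  a = {
    for az in distinct(var.subnets[*].az) :
    az => [for s in var.subnets : s.cidr if s.az == az]
  }

  # b: keep only the zones with more than one subnet (builds on a)
  b = { for az, cidrs in local.a : az => cidrs if length(cidrs) > 1 }

  # c: flatten back into a single list (builds on b)
  c = flatten(values(local.b))
}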

It's only a matter of time before HCL has to support functions, because as modules get bigger, keeping code DRY is becoming more and more of a pain.

That being said, I look forward to trying cdktf.

What I often perceived, when I found myself stuck using the declarative approach - saying "I would like to code this thing" - was that there was a flaw in the overall architecture that I created.

Often, perhaps, but you can be sloppy or just make honest mistakes in any language; sometimes you have to provide extra tools that make incorrect design possible but put in some sanity checks (e.g. assertions in HCL would be nice).

Mearman commented 2 years ago

Not a perfect solution but I'm about to test this: https://github.com/hashicorp/terraform/issues/22544#issuecomment-981575058

tmccombs commented 2 years ago

That works, but really isn't a great solution when you have large/complicated resources, or have many resources that all need to have variable meta-arguments.

Mearman commented 2 years ago

Oh yeah not at all. When I implement my Terragrunt, I'm going to just generate it.

diannading commented 2 years ago

+1, I really need this feature.

mrsiejas commented 2 years ago

Not a perfect solution but I'm about to test this: #22544 (comment)

A big potential problem with this workaround is that changing the variable will always cause resource destruction. For those who are brave enough to give it a try, here's also how to consolidate the output using locals:

variable "enable_delete_protection" {
  type        = bool
  default     = true
  description = "Set resource protection of important non-recoverable resources"
}

resource "aws_prometheus_workspace" "amp_workspace" {
  count = var.enable_delete_protection ? 0 : 1
  alias = "${var.names_prefix}-workspace"
}

resource "aws_prometheus_workspace" "amp_workspace_protected" {
  count = var.enable_delete_protection ? 1 : 0
  alias = "${var.names_prefix}-workspace"

  lifecycle {
    prevent_destroy = true
  }
}

locals {
  prometheus_endpoint = (var.enable_delete_protection ? aws_prometheus_workspace.amp_workspace_protected : aws_prometheus_workspace.amp_workspace)[0].prometheus_endpoint
  ...
  remoteWrite = [{
  url = "${local.prometheus_endpoint}api/v1/remote_write"
  ...

sjudkins commented 2 years ago

I am trying to get this workaround to work, but I get an error:

An argument named "alias" is not expected here.

Also, introducing the count attribute means other references have to be changed, to avoid the following:

│ Because azurerm_resource_group.main has "count" set, its attributes must be accessed on specific instances.
│
│ For example, to correlate with indices of a referring resource, use:
│     azurerm_resource_group.main[count.index]

Is there a workaround for the "alias" and "count" issues when using this workaround to support "dynamically" destroying or not? (Or, is there another "workaround" until this feature is implemented?)

Thank you

mrsiejas commented 2 years ago

alias is an attribute of the aws_prometheus_workspace resource; it was used only as an example. According to the Terraform docs (https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/resource_group), azurerm_resource_group doesn't have an alias attribute, so you have to remove it.

I think the second error is self-explanatory: since you introduced count in your resource, you have to access it at a specific index when referencing it. You can use azurerm_resource_group.main[0] (see the sketch below). I suggest reading the docs on the Terraform ternary operator.
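
A common way to consolidate such a count-gated pair back into a single value, similar to the locals trick further up (names are hypothetical; one() needs Terraform 0.15+):

locals {
  # Exactly one of the two lists is non-empty, because the counts are
  # complementary, so one() picks out whichever instance exists.
  resource_group_name = one(concat(
    azurerm_resource_group.main[*].name,
    azurerm_resource_group.main_protected[*].name
  ))
}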

IrmantasMarozas commented 2 years ago

Another use case is handling the logic "if aws_appautoscaling_scheduled_action is enabled, then ignore changes to min_capacity/max_capacity in aws_appautoscaling_target":

  # We ignore capacity changes since they're managed by scheduled scaling
  dynamic "lifecycle" {
    for_each = try(var.task_scaling_config.min_capacity.night, 0) > 0 ? [""] : []
    content {
      ignore_changes = [
        min_capacity,
        max_capacity,
      ]
    }
  }