Open skeggse opened 2 years ago
~Oops, sorry - didn't finish filling out the template. Updating.~
Hi @skeggse! Thanks for sharing this use-case.
The compromise of allowing only static move rules arose from having to work within some other constraints, so I think in order to make this more dynamic we'd need to either find a new design that meets the constraints or find a way to change the constraints. With that in mind, I'm going to record here some context about how moved
blocks work today and why they work that way, and then I suppose this issue will serve as a place for us to think about and discuss ways we can either work within or bend the constraints to meet the use-case.
The current (as of Terraform v1.1, but there weren't any relevant changes in v1.2) sequences of events for handling moved
blocks is the following:
moved
blocks in the configuration, which takes into account situations where e.g. the to
of one block refers to the from
of another, which we call "chained moves", and where the from
of one block is contained within the to
of another.Walk the move graph and rewrite the addresses in the in-memory state to follow what the moved blocks describe, where necessary and where possible. (If there isn't any object at the from
address, or if there's already an object at the to
address, a moved
block has no effect.)
At the end of this step, the in-memory state is now exclusively using the new addresses, and has no record of the old addresses. In other words, it's as if Terraform secretly ran terraform state mv
a number of times but is keeping the result only in memory and hasn't committed it anywhere yet.
moved
blocks in the configuration and confirm that they are valid. Leaving validation to the end like this is a little weird, but I'll say more below about why it works this way.There are some important ordering constraints that informed the above ordering:
moved
blocks until we know all of the instances of all resources in the configuration, because moving is an instance-oriented operation and it's invalid to specify in to
an address that is also currently declared in the configuration via a resource
or data
block.count
or for_each
expressions for the resources.I think that's full coverage of all of the constraints that led to the current design, though I'm happy to answer questions if something isn't clear.
To support dynamic expressions in moved
block from
or to
would therefore require somehow avoiding or bending the rules above, so that either we handle moved blocks during the planning graph walk or we can somehow safely evaluate expressions in the moved block preprocessing phase.
When thinking about that it's important to realize that the only kind of reference that can never potentially depend on a resource action in Terraform today is a root module input variable. A child module's input variables can be derived from any data the caller has access to, and so evaluation of those variables is subject to graph-based ordering just like any other expression in the configuration.
When we were designing config-driven move we discussed a slightly different use-case to the one you presented here which runs into the same constraints: if a particular shared module has a systematic rule for deriving for_each
keys from an input variable, it cannot possibly predict and hard-code those instance keys in moved
blocks, because the actual keys are chosen (indirectly) by the module's caller. It would be ideal if the shared module could similarly define a systematic rule for how the instance keys are derived so that a moved
block could work regardless of the input variable values, but it isn't clear how to achieve that within the above constraints or how to bend the constraints to support it.
I must admit I had not directly considered the situation you are describing, but on reflection I suppose it is a variation of the same idea: creating multiple workspaces that all share the same configuration is more or less the same as multiple different calls to the same shared module for the purposes of this discussion, with the main difference being that it's Terraform Cloud's control plane that is deciding the input variable values, instead of a dynamic expression in a calling module.
One notable difference though is that we are guaranteed today that root module input variables are always dependency leaves: they will never depend on anything else in the planning graph. That does therefore give us some room for rules-bending, but it would come at the expense of making a special exception that applies only to root module input variables in particular, and that would be unfortunate since we've been aiming to reduce rather than increase the differences in capabilities between root vs. shared modules. :thinking:
Another possible way to think about the problem is to imagine Terraform Cloud's control plane as being the topmost "calling module" and pretend as if it somehow had moved
blocks declared inside it that reach into the root module from the outside, in a similar sense as how moved
blocks can specify source and/or addresses in child modules. That would require adding some new capabilities to Terraform Cloud so it can itself have a record of historical moves within a workspace, and require us to give Terraform Cloud a way to tell Terraform Core about the moves it knows about when it creates a plan. While that does seem like a relatively heavy lift given that it crosses an architectural boundary, one interesting benefit of that is the possibility that it could in future become a building block for moving objects between workspaces, by exploiting Terraform Cloud's wider view of the world as compared to Terraform Core.
For the moment I think this remains in a "more design required" state, and so isn't ready for any particular action yet. I hope the above information is a useful starting point for more design work.
Thanks for the context, this is roughly what I expected.
While that does seem like a relatively heavy lift given that it crosses an architectural boundary, one interesting benefit of that is the possibility that it could in future become a building block for moving objects between workspaces, by exploiting Terraform Cloud's wider view of the world as compared to Terraform Core.
Cross-workspace resource moves are currently extremely painful, and having this implemented in Terraform Cloud would solve a lot of pain for us. In the interim, it's valuable enough to us that we'll likely implement custom move tooling, though it won't be as as nice as the moved
blocks.
@skeggse Hi! Would you be open to chatting more about this issue, and maybe discuss some pains with moving objects between Terraform Cloud workspaces? If so, you can book time directly at my Calendly here, but no pressure whatsoever.
This would be a great addition. For context, we use workspaces to deploy multi-region (or multi-account) infrastructure using a single TF configuration (multi-statefile monolith). Our TF deployment pipeline will execute plan|apply
against multiple workspaces, but backed by the very same configuration. We use "dynamic" locals for region-specific customization and we heavily rely on data
sources. As a result, we cannot predict exactly the name of our resources in the statefile (especially under a for_each
loop). Because moved
blocks are static, we cannot use them in such a way that will cater for multiple workspaces (under the same config). Having support for non-static values will definitely be a plus for us (and other users of this approach).
@korinne I am also open for chatting about our use-case and other possible improvements of the moved
statement (e.g. cross-statefile support, remove support, imports?) if the offer is extendible.
@tanasegabriel Yes, the offer is certainly extended! Feel free to grab any time you see on my Calendly here. Thank you in advance, and excited to chat.
I have a use-case where i'd like to change the value used in a for_each key without destroying and recreating any resources. In reality the for_each value is sourced dynamically
locals {
foo = [
{
a = 1
b = 2
}
]
}
# Before
resource "null_resource" "foo" {
triggers = {
baz = each.value
}
for_each = {
for bar in local.foo
: "${bar.a}" => bar
}
}
# After
resource "null_resource" "foo" {
triggers = {
baz = each.value
}
for_each = {
for bar in local.foo
: "${bar.b}" => bar
}
}
It'd be great to be able to express this renaming in a moved block, something like:
moved {
from = null_resource.foo[each.value.a]
to = null_resource.foo[each.value.b]
for_each = {
for bar in local.foo
: "${bar.a}${bar.b}" => bar
}
}
Hi @rifelpet! Thanks for sharing this example.
In your real system, when you say "sourced dynamically", can you say a little more about what exactly that means? For example, is it coming from a data source, and if so can you say which one? This is just to help us more fully understand the specific situation in case there's some other way to get it done that might be easier to achieve.
Sure thing, the real use-case is a reusable module with variables containing sets of AWS availability zones and Virtual Private Gateway IDs. The module creates one aws_route_table
per zone and one aws_vpn_gateway_route_propagation
per zone per VGW ID.
Originally this module used count
and when refactoring to for_each
I had used an attribute of a dependent resource in the for_each key (the route table ID) rather than the key of that dependent resource's for_each (availability zone ID). This lead to the "for_each map includes keys derived from resource attributes that cannot be determined until apply" limitation when applying new invocations of this module (during my initial refactoring, I migrated existing invocations of the module with terraform state mv
).
I want to change the aws_vpn_gateway_route_propagation
's for_each
key from referencing the route table ID to referencing the availability zone ID without destroying + recreating any existing resources. I was hoping moved
would support such a case.
variable "zones" {
default = toset(["..."])
}
variable "vgw_ids" {
default = toset(["..."])
}
resource "aws_route_table" "foo" {
for_each = var.zones
}
locals {
az_vgws = {
for rt_vgw in [for product in setproduct(var.vgw_ids, var.zones):
{
vgw : product[0],
rt : aws_route_table.foo[product[1]].id,
zone : product[1],
}
]
: "${rt_vgw.rt}-${rt_vgw.vgw}" => rt_vgw # I want to change to "${rt_vgw.zone}-${rt_vgw.vgw}"
}
}
resource "aws_vpn_gateway_route_propagation" "foo" {
for_each = local.az_vgws
}
moved {
from = aws_vpn_gateway_route_propagation.foo["${each.value.rt}-${each.value.vgw}"]
to = aws_vpn_gateway_route_propagation.foo["${each.value.zone}-${each.value.vgw}"]
for_each = local.az_vgws
}
I just ran into something similar. We were creating resources with for_each
and found a situation where the keys could be duplicated, so we changed the keys by adding a second value, basically format("%s-%s", local.config.old_key, local.config.disambiguator)
. So I want something like
moved {
for_each = local.config_map
from = aws_lb.default[each.value.old_key]
to = aws_lb.default[format("%s-%s", each.value.old_key, each.value.disambiguator)]
}
Here's a use-case that I have:
Before:
resource "aws_route" "private_a" {
route_table_id = var.aws_route_table_private_a_id
destination_cidr_block = data.aws_vpc_peering_connection.connection.cidr_block
vpc_peering_connection_id = data.aws_vpc_peering_connection.connection.id
}
resource "aws_route" "private_b" {
route_table_id = var.aws_route_table_private_b_id
destination_cidr_block = data.aws_vpc_peering_connection.connection.cidr_block
vpc_peering_connection_id = data.aws_vpc_peering_connection.connection.id
}
After:
resource "aws_route" "route" {
for_each = var.route_tables
route_table_id = each.key
destination_cidr_block = data.aws_vpc_peering_connection.connection.cidr_block
vpc_peering_connection_id = data.aws_vpc_peering_connection.connection.id
}
moved {
from = aws_route.private_a
to = aws_route.route[aws_route.private_a.route_table_id]
}
moved {
from = aws_route.private_b
to = aws_route.route[aws_route.private_b.route_table_id]
}
Trying to run this gives me this error:
There are some problems with the configuration, described below.
The Terraform configuration must be valid before initialization so that
Terraform can determine which modules and providers need to be installed.
╷
│ Error: Invalid expression
│
│ on peer.tf line 23, in moved:
│ 23: to = aws_route.route[aws_route.private_a.route_table_id]
│
│ A single static variable reference is required: only attribute access and
│ indexing with constant keys. No calculations, function calls, template
│ expressions, etc are allowed here.
╵
╷
│ Error: Invalid expression
│
│ on peer.tf line 28, in moved:
│ 28: to = aws_route.route[aws_route.private_b.route_table_id]
│
│ A single static variable reference is required: only attribute access and
│ indexing with constant keys. No calculations, function calls, template
│ expressions, etc are allowed here.
╵
Note how unhelpful/misleading this error is. After all, I am doing only attribute access
. (Although upon looking at it further I guess I'm not indexing with constant keys
.)
We have the exact same problem. I am trying to convert my aws_eks_node_group
resource, of which I create 3 of using a count, to for_each, named off the subnet_id. So I'm trying to do this:
moved {
from = aws_eks_node_group.spot_node_group[0]
to = aws_eks_node_group.spot_node_group[one(aws_eks_node_group.spot_node_group[0].subnet_ids)]
}
moved {
from = aws_eks_node_group.spot_node_group[1]
to = aws_eks_node_group.spot_node_group[one(aws_eks_node_group.spot_node_group[1].subnet_ids)]
}
moved {
from = aws_eks_node_group.spot_node_group[2]
to = aws_eks_node_group.spot_node_group[one(aws_eks_node_group.spot_node_group[2].subnet_ids)]
}
Doesn't work, same error. So now I have no recourse but to manually run 3 terraform mv state
commands to try to migrate 200+ clusters :(
So , this is also not supported for import and removed blocks as well, because it didnt work for me. Correct it if i am wrong
Current Terraform Version
Use-cases
Let's say I want to restructure my Terraform configuration, such that my
aws_subnet
resources are keyed by CIDR block. In other words, I might desire a structure that looks like:This makes it fairly convenient to review plans, and to address the right resources when targeting applies.
If I have hundreds of workspaces using the same structure, and that share the same underlying code, I might vary some of this based on the workspace. For example,
10.X.Y.0/21
whereX
depends on which workspace is being planned. This might be sourced via Terraform Cloud fromvar.cidr_offset
or something:However, my prior state might not be using this format. Instead, it might be doing something like
us-east-1a-1
instead of10.3.0.0/21
. It'd be nice to be able to move the resources in these hundreds of workspaces to their appropriate destinations by writing a single set ofmoved
blocks.Attempted Solutions
Try to use a dynamic (variable-derived) value for the resource address:
Manually update hundreds of workspaces.
Proposal
Support dynamic values derived from runtime state (e.g. just allow
var.xxyy
, but maybe notresource.abc
expressions).