Open radeksimko opened 7 years ago
Thanks for starting this discussion, @radeksimko!
I have a question about your initial example, which I'll quote some parts of and then comment inline:
$ terraform plan
$ terraform apply
# The new ECS TD doesn't work (╯°□°)╯︵ ┻━┻, let's roll back
$ terraform list-versions aws_ecs_task_definition.myapp
myapp-td:4 # CURRENT
myapp-td:3
myapp-td:2
myapp-td:1
$ terraform set-version aws_ecs_task_definition.myapp myapp-td:3
If I understood the rest of the proposal correctly, I think your intent is that at this point we've recorded in the state that version 3 is "current", but we've not actually changed anything in the real system.
$ terraform plan
# here we see plan w/ diff in dependent resources (e.g. aws_ecs_service)
In which case, what exactly is plan
doing here?
Normally we expect plan
to generate a diff from the current state to what's in the configuration, but if we did that here then we'd presumably flip back to being on version 4 during refresh (because it wants to make the state match the "real world") and then get no diff at all because the config will (presumably) be describing the world as of version 4.
It feels to me like set-version
must actually take action on the real system because otherwise whatever it does to the state would be undone by a subsequent refresh
.
One way we could conceptualize "rolling back" is as a funny sort of plan. Normally a plan produces a diff from the state to the config, and "rolling back" could be thought of as a diff from the current state to some "synthetic configuration" we obtain by reading the old version from the real system, in situations where the backend itself has a versioning concept.
So building on the idea of terraform plan -destroy
(which acts as a modifier for what we're diffing), perhaps we'd say:
$ terraform plan -set-version="aws_ecs_task_definition.myapp=myapp-td:3" -out=tfplan
aws_ecs_task_definition.myapp: Refreshing state... (ID: abc123)
aws_ecs_task_definition.myapp: Reading version myapp-td:3... (ID: abc123)
...
~ aws_ecs_task_definition.myapp
version_id: "myapp-td:4" -> "myapp-td:3"
container_definitions: "abc" -> "123"
...
Here's what I assumed plan
did here:
ReadVersion
that is like Read
but takes a particular version number, populating the ResourceData
with the configuration as of version "myapp-td:3", retaining this in memory somewhere.Now when we later apply
this plan, the Update
implementation for this resource can see that version_id
changed from "myapp-td:4" to "myapp-td:3" and so implement this update by taking whatever API action maps to rolling back to this earlier version, rather than by creating a new version as it presumably would've done in the normal case. The other attribute changes shown in the diff are purely for user information and would not actually be used in this action, because the provider could assume that the rollback would implicitly make those changes.
I think we need to think carefully about what happens to dependent resources in this situation. If we had another resource consuming the revision
attribute of that resource, I think we'd expect the terraform plan -set-version="..."
command to produce a plan that also includes a diff for the downstream resource that is using the revision
attribute, so the rollback can ripple through any downstream resources that might need to be repointed at the earlier version in some way.
we've not actually changed anything in the real system.
This obviously depends on the implementation of Resource.SetVersion
, but I thought we could have a Passthrough helper, similar to the one we have for imports - basically all it would do is d.SetId(givenId)
where givenId
is the [version-id]
from terraform set-version resource [version-id]
.
Whether we should call Read()
straight after resetting the ID is another question. I think we should follow taint
here - so I shouldn't call it.
It feels to me like set-version must actually take action on the real system because otherwise whatever it does to the state would be undone by a subsequent refresh.
It would not be undone, in fact if ID is taken is real unique ID (version-unique) - which in most cases is - then refresh would just pull a different version of the resource, if implemented well.
Normally a plan produces a diff from the state to the config, and "rolling back" could be thought of as a diff from the current state to some "synthetic configuration" we obtain by reading the old version from the real system, in situations where the backend itself has a versioning concept.
By the time we get to plan generation, we'll already have the new versioned resource pulled in place (either in memory in case of plan
or in tfstate in case of refresh
) so we should not need to make that distinction.
I think we need to think carefully about what happens to dependent resources in this situation. If we had another resource consuming the revision attribute of that resource, I think we'd expect the terraform plan -set-version="..." command to produce a plan that also includes a diff for the downstream resource that is using the revision attribute
Yes, at the moment dependent resources are planned for potential update only if Computed
fields are marked for "recomputation", which in reality is just taint
or change of any ForceNew: true
field.
There's another good reason why expand the number of cases where dependent resources should be scheduled for update, specifically https://github.com/hashicorp/terraform/pull/4846
So basically set-version
would also mark Computed
resources for "re-computation" besides changing the ID of the resource - which should in theory make it all work together.
btw. yes - the above mechanism is kinda broken under apply -refresh=false
💭 😞
Ahh, I see where I misunderstood. You imagine the resource id as identifying the specific version, where I was expecting it works be identifying the versioned object itself, regardless of current version.
Seems like we need both ids in any case, since the version-agnostic id will be needed to enumerate the available versions.
I am still a little confused about the interaction between set-version and plan; how does the subsequent plan know that you want to make the world match the old version, as opposed to matching what is in the config as it would do by default?
Seems like we need both ids in any case, since the version-agnostic id will be needed to enumerate the available versions.
Agreed.
I am still a little confused about the interaction between set-version and plan; how does the subsequent plan know that you want to make the world match the old version, as opposed to matching what is in the config as it would do by default?
The config (in HCL) is expected to be version agnostic, so there's no conflict there (I think). The only difference would appear to be in 1 or more computed fields - e.g. arn
. I'm not sure how the diff generation works for ID in particular - I assume it's not treated as Computed field, so maybe after all we would need to set ID + 1 computed field.
Something I didn't realise in the initial proposal and what you were probably asking about: We would basically suppress the diff of the versioned resource itself. It may sound like a bad idea, but after all we're not changing anything, we're just referencing a different resource which already exists, so I think it's fine.
All you would see in the plan
is other resources that reference the versioned resource.
The config (in HCL) is expected to be version agnostic, so there's no conflict there (I think).
Actually I think I now understand what do you mean. That ^ is not true. 😞
But I still think we can get around that, if we flag versioned resources specifically in the tfstate and then have 2 sets of fields - 1st primary (basically matching HCL), 2nd pinned version then we may be able to compare things accordingly -> we can detect changes if HCL != Primary, but then do the actual comparison with pinned version. After apply
we'd basically remove the pinned version.
All references would be 1st compared against pinned version if one exists.
Any plan to address this anytime soon?
New commands
terraform list-versions RESOURCE_TYPE.NAME
terraform set-version RESOURCE_TYPE.NAME [version-id]
Expected use case
Schema changes
I'd expect
set-version
to have a very similar behaviour totaint
- i.e. no API calls, just ID change. The difference would then appear duringrefresh
/plan
/apply
as these would be working with the new version of the resource.Resources that would benefit from this
aws_api_gateway_deployment
aws_ecs_task_definition
aws_lambda_function
w/publish = true
aws_s3_bucket_object
in a bucket that has versioning enabledaws_elastic_beanstalk_application_version
docker_image
fastly_service
(as @apparentlymart mentioned)Breaking changes
The way we think about versioned resources today (in majority of cases) is that a given resource manages the latest version and
refresh
pulls down the latest version if it doesn't match with local config. Some resources destroy old versions (ECS TD), some don't (S3 object, Docker Image, Lambda), some don't create new versions at all (API Gateway Deployment).This was treated as a nice way to detect changes outside of Terraform which we'd now lost.
Impact
destroy
We would only destroy a given version of the resource rather than all versions which could potentially cause dependency issues as we might not be able to destroy some parent resources - e.g.
aws_api_gateway_rest_api
.We could accept this as a feature/default behaviour and make people explicitly do something like
which would under the hood result in something like this:
refresh
We would refresh data about a given version of that resource, rather than pull the latest version, but here's how the user could emulate the latter behaviour:
apply
Each
apply
would (potentially) generate a new version (new resource) which in turn means it would leave some orphans behind.Some resources may need a new field as the cause of version generation comes from outside of the resource. We already have this in
docker_image
and we'd probably need to add similar one toaws_api_gateway_deployment
too.taint
This would behave as
apply
with changedtrigger
.