hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

New lifecycle tag "block changes" to prevent changes from happening #33437

Open gtmtech opened 1 year ago

gtmtech commented 1 year ago

Terraform Version

Any, e.g. 1.5

Use Cases

I am looking at using Google workforce federation against Azure AD. In this setup, a GCP resource IAM policy is created that references a principalSet:// principal, which encodes the objectID of an Azure AD group (a UUID). This basically says "anyone who is a member of this Azure AD group" will have the permission I assign on this GCP resource.

UUIDs are a little cumbersome to work with in code and configuration, especially when working with thousands of them. A YAML or JSON file full of UUIDs is pretty impenetrable. However, I understand why Google workforce federation works this way: Azure AD groups can be renamed, so the principalSet needs to refer to the objectID rather than the displayName to prevent someone in Azure from getting permissions they shouldn't have just by renaming a group.

So I was thinking of using a data "azuread_group" data source to look up the objectID based on the group name. The config that says who should have permission on this or that GCP resource could then reference group names instead, which is much more user-friendly for devs to engineer against.

This, however, suffers from the security issue mentioned above. If I start looking up group names using this data source, grabbing the objectID, and using that in the principalSet:// principal string in a GCP resource IAM policy, someone could just rename a group.
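For illustration, the lookup pattern described above might look roughly like the sketch below. The group name, folder ID, role, and workforce pool ID are all hypothetical placeholders, and the sketch assumes the pool's attribute mapping exposes Azure AD group object IDs as the group identifier.

```hcl
# Look up the Azure AD group by its display name (hypothetical group name).
data "azuread_group" "platform_devs" {
  display_name = "platform-devs"
}

# Grant a role on a folder to every member of that group via workforce federation.
resource "google_folder_iam_member" "platform_devs_viewer" {
  folder = "folders/1234567890" # placeholder folder ID
  role   = "roles/viewer"       # placeholder role

  # "example-pool" is a placeholder workforce pool ID. The trailing segment is
  # whatever object ID the display name resolves to at plan time, which is
  # exactly why a rename in Azure AD can silently change this value.
  member = "principalSet://iam.googleapis.com/locations/global/workforcePools/example-pool/group/${data.azuread_group.platform_devs.object_id}"
}
```

Because the object ID is resolved from the name on every plan, a different group taking over that name would show up as a change to member.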

I was thinking that in this case Terraform would notice, though, because the "member" field of e.g. a "google_folder_iam_member" resource would change, which would result in either a "replace" or an "update" operation.

It then struck me that I would like to have this field "fixed", so that Terraform would block these operations (basically error out) if the field was updated. Something like:

lifecycle {
    block_changes = [ member ]
}

It struck me that this doesn't exist right now, but it would be quite useful.

I think we could then ensure that the permissions are resilient to someone trying to escalate privileges via group naming, whilst keeping the ability for teams to understand their code better by referring to permission principals by name and not ID.

Attempted Solutions

None in Terraform. I am investigating writing tooling that can help developers automatically translate between names and UUIDs in their code, but this isn't really a great approach.

Proposal

lifecycle {
    block_changes = [ member ]
}

References

No response

apparentlymart commented 1 year ago

Thanks for sharing this use-case, @gtmtech!

In today's Terraform this is an example of something we'd consider to be a kind of "policy check", which are usually enforced like this:

Although this approach does require some extra steps, it also allows implementing arbitrary rules about what changes are acceptable without every one needing special support in Terraform itself.

There is one existing feature in Terraform that works like you've described: prevent_destroy makes planning fail if anything marked with it is planning for destruction. However, we consider that to have been a design error because:

All of those considerations led to the current posture of making policy checks something independent of Terraform itself. That approach allows everyone to tailor to exactly the rules they need, allows both hard failure and extra-approval-required conditions (as long as Terraform is running in automations that can support waiting for that extra approval), and makes the policy independent of what it is constraining. On the other hand, it does require some additional effort on the part of the person setting up the automation around Terraform; Terraform Cloud has this built in, but other automation methods may not.

With all of that said, I suspect that this feature would end up in the same regret bucket as prevent_destroy if it were to be implemented as a built-in, for many of the same reasons. Therefore my instinct is to ask you to implement this as a policy rule like I described above, but I'd be interested to hear if that seems infeasible for reasons I've not considered yet.

Thanks again!

jbardin commented 1 year ago

You can also create this type of policy check based on configuration using check blocks, and more specifically a data source within a check block. Using a data source to read the current state of the resource will allow you to compare it against the desired value in the config and assert that they are equal.
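As a rough sketch of that idea applied to the use case in this issue, a check block with a nested data source could compare the object ID a group name currently resolves to against a value recorded at review time. The variable name, check name, and group name here are hypothetical, and check blocks need Terraform 1.5 or later.

```hcl
# Object ID the group had when the binding was last reviewed (hypothetical input).
variable "expected_group_object_id" {
  type = string
}

check "platform_devs_group_unchanged" {
  # Scoped data source: read only in order to evaluate this check.
  data "azuread_group" "current" {
    display_name = "platform-devs"
  }

  assert {
    condition     = data.azuread_group.current.object_id == var.expected_group_object_id
    error_message = "Azure AD group \"platform-devs\" now resolves to a different object ID than the one that was reviewed."
  }
}
```

One caveat: as far as I know, a failed check assertion is reported as a warning and does not stop the plan or apply, so on its own this flags the change rather than blocking it.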