Open lukegAtPaxos opened 11 months ago
You might be able to do something like this once #219 is merged. You could create the Workspace
with managementPolicies: [Observe]
which would run terraform plan
and nothing else. We don't currently expose the output of that but it's probably feasible to do in some form. Then when you're happy with the output you could change the managementPolicies
to [Observe, Create]
or ['*']
to let it proceed. See https://docs.crossplane.io/latest/concepts/managed-resources/#managementpolicies
Oh damn, I hadn't realised management policies weren't already implemented.
Hi @lukegAtPaxos, the PR https://github.com/upbound/provider-terraform/pull/219 has been merged.
hi @bobh66 , iam making code changes like exposing the plan in gzip format either via config map or variable in status field, may i know which one will work and feasible
Hi @balu-ce - that's a good question. There are size limitations to both that might become an issue for really large terraform modules, but it would probably work for the majority of the use cases. Another possibility might be to create a Plan
resource that gets created as a child of the Workspace
resource. That doesn't solve any of the content size issues, it's just another way of representing the data.
I think I would lean toward using status
for simplicity.
@bobh66 Added changes, can you please review it
I've been using the provider a bit more recently and had some thoughts.
potentially the introduction of a PolicyConfig
resource, much like the provider config it can be referenced by a workspace. In the terraform execution stage.
in https://github.com/upbound/provider-terraform/blob/main/internal/terraform/terraform.go the provided policy could be invoked on the plan, against something like conf test.
This would involve outputting a plan file, before the apply - and executing the policy engine. The apply could then be invoked from the stored plan file.
The result of a policy failure would be to cause the workspace to fail, bubbling up the failure in status.
curious what maintainers think, i can knock out a poc.
I'm not convinced that it's a good idea to try to implement a human-approves-plan workflow inside of provider-terraform's reconcile loop. In fact I'm pretty confident it's not.
What's the benefit of having an always-on control plane constantly monitoring your infrastructure if all it's going to do is update a plan that a human needs to review and tell it it's okay to apply?
I understand the desire for a dry run (e.g. https://github.com/crossplane/crossplane/issues/1805) to build confidence before making a change to a configuration. Wouldn't it make more sense to shift that left, though? i.e. Check that the plan isn't going to do anything unexpected before the new configuration hits the control plane. For example by validating the new plan as part of a gitops workflow in CI (or even just by running terraform plan
on a developer's laptop).
Edit: I feel similarly WRT automated tests. Why not run them before the configuration (i.e. CR) is ever applied to the control plane?
I doubt that it will applicable for other native providers of crossplane but we need to reconsider it for terraform.
I doubt that it will applicable for other native providers of crossplane but we need to reconsider it for terraform.
Why?
@negz
Wouldn't it make more sense to shift that left, though? i.e. Check that the plan isn't going to do anything unexpected before the new configuration hits the control plane. For example by validating the new plan as part of a gitops workflow in CI (or even just by running terraform plan on a developer's laptop).
My concern would be that if we're not running terraform plan
in exactly the same setup that provider-terraform runs terraform apply
, we don't have any guarantees that the plan will reflect the changes that will be made. Especially with ProviderConfig
s separating some of the config from the Workspace
, it would be easy for the plan to be run with different providers, credentials, or a different terraform
binary.
I agree that provider-terraform
itself shouldn't be itself handle the approval workflow, but if it supported exposing the plan, you could fairly easily build an approval workflow on top of it using the management policies as Bob suggests above
you could fairly easily build an approval workflow on top of it using the management policies
It sounds like you're talking about an automated workflow, right? So some other system (not a human) validates the change? Do you have an example of what kind of validation such a system would run?
A combination of human and automated validations. One example would just be time based checking - e.g. changes should only be made between 10am and 4pm, Monday to Thursday, or only made during pre-approved maintenance windows.
I'd like to live in a world where we trusted ourselves and our systems to not require human approval of changes, but for production-critical environments we may be required to have a human review step as well.
I'd like to live in a world where we trusted ourselves and our systems to not require human approval of changes, but for production-critical environments we may be required to have a human review step as well.
Is that view compatible with the controller/reconciler "control plane" model? It almost seems like it's turning the control plane into a change notification system - i.e. something to let you know when drift has happened and ask you what to do about it.
Do you see this as something that would be relevant for other Crossplane providers, or only this Terraform one? Do you use/have plans to use other providers?
The way I see it, the control plane is responsible for actuating/enforcing an intent and correcting for drift; what we want to build on top of that is a set of safeguards between the changes being made by developers in source control, and a new version of the intent being "activated" (not that we intend to explicitly version things).
We have a lot of Terraform modules (occasionally several layers deep 😬 ) that are used in various combinations across many environments, and when a developer makes a change to a module, they're probably not going to be the one landing the change which bumps the module version and rolls that change and a bunch of others out to the fleet.
I don't think this is directly relevant to other Crossplane providers - the individual managed resource level is probably too granular for that sort of approval workflow - but I could see something similar being used for changes to compositions and claim parameters (given that a Terraform workspace can provision multiple cloud resources, I generally think of them as loosely equivalent to composite resources and compositions).
We do currently use provider-gcp, but only a tiny amount (probably 20-odd MRs, if that). Unfortunately we have a lot of history with Terraform, and I don't think there's any way we could justify rewriting all of our Terraform modules.
I don't know that I see any more work here for the provider beyond making the plan available for review in #226 . A higher layer can handle scheduling and approval of changes by manipulating claims and managementPolicies
to control what changes get applied and when, and if the change gets applied automatically or requires approval.
Yes, absolutely, that PR is all we need for our purposes.
I'm not convinced that it's a good idea to try to implement a human-approves-plan workflow inside of provider-terraform's reconcile loop. In fact I'm pretty confident it's not.
What's the benefit of having an always-on control plane constantly monitoring your infrastructure if all it's going to do is update a plan that a human needs to review and tell it it's okay to apply?
I understand the desire for a dry run (e.g. crossplane/crossplane#1805) to build confidence before making a change to a configuration. Wouldn't it make more sense to shift that left, though? i.e. Check that the plan isn't going to do anything unexpected before the new configuration hits the control plane. For example by validating the new plan as part of a gitops workflow in CI (or even just by running
terraform plan
on a developer's laptop).Edit: I feel similarly WRT automated tests. Why not run them before the configuration (i.e. CR) is ever applied to the control plane?
FWIW my original statement wasn't introducing a human approved step, but allowing a hook for policy automation to validate a plan file.
Policy in terraform can be written for many things - a policy could be written to ensure no AWS IAM Keys are being created for example - but at its root its validating the plan file, not offering a human approver the ability to approve.
@negz @bobh66 can we get a decision on this/#226 please?
I defer to @bobh66 and @ytsarev. While I created this provider, I haven't touched it in some time.
I understand the need for modeling changes and pre-approvals, but I agree with @negz that this sort of processing should be handled at a higher/different layer and that the provider should simply "do what it's told". I think it's better to handle this as part of crossplane/crossplane#5477 rather than do something provider-specific here. I'll update #266 as well.
What problem are you facing?
When running a terraform apply, validation of the plan file can be useful to ensure nothing unexpected is going to happen.
How could Official Terraform Provider help solve your problem?
Tools like conftest are used in other opensource tooling (Atlantis) and Hashicorp Sentinel in Terraform Cloud.
Would it be possible to block the Update / Create crossplane actions on a policy as code evaluation of the planfile.