hashicorp / hcp-terraform-operator

Kubernetes Operator allows managing HCP Terraform resources via Kubernetes Custom Resources.
https://developer.hashicorp.com/terraform/cloud-docs
Mozilla Public License 2.0
125 stars 32 forks source link

πŸ› Unmanaged Terraform Variables are being unset #221

Open KamalAman opened 1 year ago

KamalAman commented 1 year ago

Operator Version, Kind and Kubernetes Version

YAML Manifest File

# Copy-paste your YAML manifest here

Output Log

2023-07-18T11:55:07Z    INFO    Reconcile Variables {"workspace": {"name":"WORKSPACE","namespace":"tf-workspaces"}, "msg": "deleting 1 terraform variables from the workspace ID ws-iANiBEofJqkNYnEg"}

Output of relevant kubectl commands

Steps To Reproduce

  1. Create a workspace
  2. Go to terraform cloud and manually add a new variable to the workspace. Ensure that the variables name is one that shouldn't be managed by the Workspace definition
  3. Wait and see the variable become deleted

Expected Behavior

What should have happened? Variables that the Workspaces does not manage should not be modified / removed.

Actual Behavior

What actually happened? The additional variable is deleted

Additional Context

This messes up the ability to have advanced configuration outside of the scope of the operator. Hopefully this issue gets resolved ASAP as I think it is a very important workflow to support.

References

Community Note

KamalAman commented 1 year ago

I know that the request might be a little bit difficult due to the following code only having access to its current state and the remote state:

image

Ideally and CRD modification you would calculate a diff to know which variables were managed to clean up, or you might have some sort of internal storage mechanism to get the true diff between the previous state and the current state.

Other simpler solutions that are less ideal that would solve my use case is a variable whitelist/blacklist to tell the operator what to manage, or even if it is just the ability to disable variable deletion all together.

marceloboeira commented 11 months ago

I think that's a feature not a bug, e.g.: if someone goes and messes up the variables you probably want the source of truth to be the CRD and not what someone might've changed in the UI by mistake.

It's a common thing for the IaC world, is there a scenario for this being the case?

There could be a parameter to define the strategy when drift is detected:

onDriftStragegy: reconcile (default/todays behaviour, delete everything out of CRD definition)
onDriftStrategy: tolerate (keep externally managed variables but overrides same variables with CRD definition)
onDriftStrategy: ignore (keep externally managed variables and ignore value changes in the UI) -> probably useless

It should be straightforward to implement the solution, but understanding the requirements and expected/common patterns might be more complex. Maybe variable sets are a better option for such scenarios with different groups of variables.

srlynch1 commented 11 months ago

Generally I would say this is expected behavior.

If a consumer is defining variables in the CRD this is the source of truth.

If there is a need for variables to be enforced or have precedence this is something that was recently added to project based variable sets. This allows overrides to be applied by an administrator. Varsets are also independent of the workspace.

certara-mchamberland commented 10 months ago

I also have a use case here: I'm a Terragrunt user, and my current terragrunt setup provides a number of "standard" variables that people writing modules in my org can use for standardization. I have Terragrunt provision the workspaces then write a .auto.tfvars for the module hosted in that workspace, but it would be a lot more convenient (and transparent) to have it write directly in the workspace module, except the workspace module would presumably have a perpetual diff from TFC deleting the extra vars.

In general, I think this issue should be seen as a feature request for the TFE provider: We can't use the TFE provider to manage a workspace if TFC will go mess with it behind the scene and create drift. Providing a simple switch at the project and workspace level to override this behaviour would help.

KamalAman commented 9 months ago

I do agree, if the Workspace CRD is managing a variable and someone manually changes it in TFC, it should get reverted. You should not be able to mess with a managed variable.

My main idea here is: If the operator in managing a variable, manage it. If it is not managing it, leave it alone.

If a variable is being managed, and then it's then removed from the Workspace CRD, it should be removed from TFC. But after that, that variable so no longer being managed.

This should be opt in functionality.