hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Http backend configuration values are saved into tfplan when set in TF configuration code or init options #29195

Open Pinolo opened 3 years ago

Pinolo commented 3 years ago

Terraform Version

Terraform v1.0.2
on linux_amd64
+ provider registry.terraform.io/hashicorp/null v3.1.0

Terraform Configuration Files

terraform {
  backend "http" {
    address = "https://example.api.com/state"
    lock_address = "https://example.api.com/state/lock"
    unlock_address = "https://example.api.com/state/lock"
    retry_wait_min = "5"
    lock_method = "POST"
    unlock_method = "DELETE"
    username = "someone"
    password = "mypass"
  }
}

resource "null_resource" "stub" {
  triggers = {}
}

Debug Output

https://gist.github.com/Pinolo/2aeb7ab20289a4ccade89014d5eeccd5

Expected Behavior

Backend configuration values (including sensitive info) are not saved to the resulting tfplan.

Actual Behavior

Backend configuration values (including sensitive info) are saved to the resulting tfplan.

Steps to Reproduce

  1. Edit the configuration above to use valid backend addresses and credentials
  2. terraform init
  3. terraform plan -out planenv
  4. unzip planenv
  5. less tfplan

Additional Context

The same behavior occurs when the backend config values are passed as options to the terraform init command.

If I set the backend configuration using only environment variables, the configuration values are not saved into the tfplan file (only the configuration keys are). I don't know what the intended behavior is, but at the very least this is an undocumented inconsistency.

apparentlymart commented 3 years ago

Hi @Pinolo!

What you've described here is the intended behavior -- the plan includes the same information Terraform caches in .terraform/terraform.tfstate to ensure the plan will apply to the same state that it was created from.

I think what you've seen here is an unfortunate consequence of the fact that most of the backends allow setting credentials as arguments as well as out-of-band. Our intention is that the primary way to provide credentials for the backend is via out-of-band mechanisms, which for the http backend means environment variables, and that the configuration therefore focuses only on describing where the state will be stored and not who or what is running Terraform.

It's a bit of a historical design mistake, born out of pragmatism, that the backends also often allow setting credentials as part of the configuration, but so far we've been reluctant to retract that. Doing so would be a breaking change that adds no capability Terraform doesn't already have, so it would inconvenience those whose threat model tolerates credentials stored in the plan without improving anything for those whose threat model does not.

I think my main takeaway here then is that we ought to be clearer in the documentation for each individual backend about how we distinguish the "location-related" settings from the "user-related" settings, and then include a clear recommendation to set the location-related ones in the configuration and the user-related ones via out-of-band methods.

For your case in particular, that would mean removing username and password from the configuration and setting TF_HTTP_USERNAME and TF_HTTP_PASSWORD instead. All of the remaining arguments in your example describe where and how to store and manipulate the state.
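As a rough sketch of that suggestion, the backend block from the original report might look like the following once the two credential arguments are removed. The addresses are the reporter's example values, and the credential placeholder in the comments is hypothetical; how the two environment variables get set (shell export, CI secret, and so on) is left as an assumption.

```hcl
# Sketch only: the reporter's backend block with the credentials removed.
# The http backend then reads them from the environment, so set these
# before running `terraform init` and `terraform plan`:
#   TF_HTTP_USERNAME=someone
#   TF_HTTP_PASSWORD=<secret injected out-of-band>   (placeholder, not a real value)
terraform {
  backend "http" {
    # Location-related settings: where and how the state is stored and locked.
    address        = "https://example.api.com/state"
    lock_address   = "https://example.api.com/state/lock"
    unlock_address = "https://example.api.com/state/lock"
    retry_wait_min = 5
    lock_method    = "POST"
    unlock_method  = "DELETE"

    # User-related settings (username, password) are intentionally omitted.
  }
}
```

With this layout, and going by the observation in the original report, only the configuration keys rather than the credential values should end up in the saved plan file.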

Thanks for raising this!

Pinolo commented 3 years ago

Hi @apparentlymart, thanks for the detailed explanation. I see the point and I agree that better docs will help. Here are a couple of suggestions for the docs enhancements:

  1. One area where this behavior matters (and where I stumbled upon it, though I forgot to mention it in the context section) is CI/CD: in a typical setup with separate plan and apply jobs, pipeline runners with short-lived platform credentials shouldn't save those credentials into the tfplan, so pipeline authors should be encouraged to use environment variables.
  2. Although I now understand why init options are not out-of-band, I think the docs should state explicitly what goes into tfstate (and into tfplan) and what doesn't.

njdart commented 3 years ago

I've also been affected by this, specifically in CI/CD pipelines with GitLab, similar to what you mentioned, @Pinolo. I've proposed a docs change to document this in https://github.com/hashicorp/terraform/pull/29519