gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

Terraform Provider | Provision Token Resource Expiry #42434

Open gwellington opened 9 months ago

gwellington commented 9 months ago

Hi all,

It looks like the teleport_provision_token resource requires that the expiry time is always in the future.

Issue

When the token expires after initial creation, the Terraform plan fails because the token's expiry time is no longer in the future, which violates the requirement above.

Why is this an issue?

Teleport node join tokens are used once for host provisioning and are thrown away after the host has been provisioned. This means we need to continuously re-provision a new token, which adds noise to the Terraform plans and is undesirable from a security perspective.

Reproduction

main.tf

provider "teleport" {}

resource "time_offset" "node_join_expiry" {
  offset_minutes = 1
}

resource "random_password" "node_join" {
  length  = 32
  special = false
}

// This will constantly rotate because this resource
// cannot deal with an expires time in the past. 
// This is incredibly annoying.  This is the way.
resource "teleport_provision_token" "node_join" {
  metadata = {
    expires = time_offset.node_join_expiry.rfc3339
    name    = random_password.node_join.result
  }

  spec = {
    roles = ["Node"]
  }
}

Error

│ Error: Time validation error
│ 
│   with teleport_provision_token.node_join[0],
│   on vm.tf line 16, in resource "teleport_provision_token" "node_join":
│   16: resource "teleport_provision_token" "node_join" {
│ 
│ Attribute metadata.expires value must be in the future

Desired State

Plan should succeed if there is no change detected for the token.

programmerq commented 9 months ago

I tinkered with this a little more today. If you are able to calculate the list of tokens that you need, I think I have a path forward. The configuration below makes sure a token exists for each entry in the list that you give it. If a subsequent run has an empty list, the stale resources are removed from the Terraform state. Using time_rotating avoids problems in the case where you actually do need to recreate a token that has expired but hasn't yet been forgotten in the Terraform backend.

terraform {
  required_providers {
    teleport = {
      version = "~> 13.0"
      source  = "terraform.releases.teleport.dev/gravitational/teleport"
    }
  }
}

provider "teleport" {}

// generate a list of needed tokens.
// I'm just using an input variable for illustrative purposes.
//
// If this can be calculated dynamically, this will ensure that tokens are
// created, and on a subsequent run when a given token is no longer needed, it
// will be cleaned up from the state file.
variable "needed" {
  type = list(string)
  #default = ["one", "two"]
  default = []
  description = "hosts needing a token"
}

// generate a value for each needed token.
resource "random_password" "node_join" {
  for_each = toset(var.needed)
  length  = 32
  special = false
}

// use time_rotating for cases where you might need to recreate a token that
// has expired but hasn't yet been forgotten from the state.
resource "time_rotating" "mytoken" {
  for_each = toset(var.needed)
  rotation_minutes = 1
  triggers = {
    hostname = random_password.node_join[each.value].result
  }
}

// create one token for each
resource "teleport_provision_token" "node_join" {
  for_each = toset(var.needed)
  metadata = {
    expires = time_rotating.mytoken[each.value].rotation_rfc3339
    name    = random_password.node_join[each.value].result
  }

  spec = {
    roles = ["Node"]
  }
}

% TF_VAR_needed='["one", "two"]' terraform apply
...
random_password.node_join["one"]: Creating...
random_password.node_join["two"]: Creating...
random_password.node_join["two"]: Creation complete after 0s [id=none]
random_password.node_join["one"]: Creation complete after 0s [id=none]
time_rotating.mytoken["two"]: Creating...
time_rotating.mytoken["one"]: Creating...
time_rotating.mytoken["two"]: Creation complete after 0s [id=2023-10-06T22:58:23Z]
time_rotating.mytoken["one"]: Creation complete after 0s [id=2023-10-06T22:58:23Z]
teleport_provision_token.node_join["one"]: Creating...
teleport_provision_token.node_join["two"]: Creating...
teleport_provision_token.node_join["two"]: Creation complete after 0s [id=1696633103859192400]
teleport_provision_token.node_join["one"]: Creation complete after 0s [id=1696633103892741105]

Apply complete! Resources: 6 added, 0 changed, 0 destroyed.
% sleep 60
% TF_VAR_needed='[]' terraform apply
...
random_password.node_join["two"]: Destroying... [id=none]
random_password.node_join["one"]: Destroying... [id=none]
random_password.node_join["one"]: Destruction complete after 0s
random_password.node_join["two"]: Destruction complete after 0s

Apply complete! Resources: 0 added, 0 changed, 2 destroyed.

Since time_rotating isn't a real resource, it doesn't need to be destroyed; it is simply removed from the state file. When Terraform does the refresh, it sees that my expired tokens are gone. That's fine because I have specified that I don't need them anyway, so it doesn't need to delete them from Teleport. It only cleans up the random_password resources that don't correspond to my now-empty list.

If you aren't able to calculate which tokens are needed for the current Terraform run, then this approach probably won't address the root of the issue.
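For cases where the list can be derived, here is a minimal sketch of one way to compute it rather than pass it in by hand. The `hosts` variable, its `provisioned` flag, and the host names are all hypothetical and only stand in for whatever inventory you already have.

```hcl
// Hypothetical inventory: the variable name, the "provisioned" flag and the
// host names are assumptions made for illustration only.
variable "hosts" {
  type = map(object({
    provisioned = bool
  }))
  default = {
    "web-1" = { provisioned = true }
    "web-2" = { provisioned = false }
  }
}

locals {
  // Keep only the hosts that have not joined yet; these still need a token.
  needed = [for name, host in var.hosts : name if !host.provisioned]
}
```

With this in place, the resources above would use `for_each = toset(local.needed)` instead of `toset(var.needed)`. Once a host is marked as provisioned, it drops out of the list and its resources are cleaned up on the next run.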

gwellington commented 8 months ago

Hello!

Thanks for the detailed response, but unfortunately all of the resources are configured dynamically. We could try to encourage users to adopt the 2-phase commit style, but people would likely forget to add the provisioning token, which would result in more issues than just a noisy TF plan.

If there were a technical guard rail that we could implement in Terraform to cross-reference the resources and the tokens, then we might consider doing it. However, I don't think either variable validation or Sentinel policies offers a reasonable way to enable this.
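For reference, the kind of cross-reference meant here could look roughly like the sketch below. It assumes Terraform 1.4 or later (for `terraform_data` and `precondition` blocks) and two hypothetical input variables, `pending_hosts` and `needed`. It only helps when both sets are available as plain inputs.

```hcl
// Hypothetical inputs: both variable names are assumptions for illustration.
variable "pending_hosts" {
  type        = set(string)
  description = "hosts that have not joined the cluster yet"
}

variable "needed" {
  type        = set(string)
  description = "hosts needing a token"
}

// A placeholder resource that exists only to carry the check.
resource "terraform_data" "token_guard" {
  lifecycle {
    precondition {
      // Fails the run if any pending host has no matching token entry.
      condition     = length(setsubtract(var.pending_hosts, var.needed)) == 0
      error_message = "Every pending host must have a matching entry in var.needed."
    }
  }
}
```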

Thanks

kmgnd commented 5 months ago

Same issue. Now this is not exactly an elegant solution