Open gwellington opened 9 months ago
I tinkered with this a little more today. If you are able to calculate the list of tokens that you need, I think I have a path forward. Basically, it makes sure tokens exist for the list that you give it. If a subsequent run has an empty list, it removes the stale resources from the Terraform state. The use of time_rotating prevents issues in the case where you actually do need to recreate a token that has expired but hasn't yet been forgotten in the Terraform backend.
terraform {
  required_providers {
    teleport = {
      version = "~> 13.0"
      source  = "terraform.releases.teleport.dev/gravitational/teleport"
    }
  }
}

provider "teleport" {}

// generate a list of needed tokens.
// I'm just using an input variable for illustrative purposes.
//
// If this can be calculated dynamically, this will ensure that tokens are
// created, and on a subsequent run when a given token is no longer needed, it
// will be cleaned up from the state file.
variable "needed" {
  type = list(string)
  #default = ["one", "two"]
  default     = []
  description = "hosts needing a token"
}

// generate a value for each needed token.
resource "random_password" "node_join" {
  for_each = toset(var.needed)
  length   = 32
  special  = false
}

// use time_rotating for cases where you might need to recreate a token that
// has expired but hasn't yet been forgotten from the state.
resource "time_rotating" "mytoken" {
  for_each         = toset(var.needed)
  rotation_minutes = 1
  triggers = {
    hostname = random_password.node_join[each.value].result
  }
}

// create one token for each
resource "teleport_provision_token" "node_join" {
  for_each = toset(var.needed)
  metadata = {
    expires = time_rotating.mytoken[each.value].rotation_rfc3339
    name    = random_password.node_join[each.value].result
  }
  spec = {
    roles = ["Node"]
  }
}
% TF_VAR_needed='["one", "two"]' terraform apply
...
random_password.node_join["one"]: Creating...
random_password.node_join["two"]: Creating...
random_password.node_join["two"]: Creation complete after 0s [id=none]
random_password.node_join["one"]: Creation complete after 0s [id=none]
time_rotating.mytoken["two"]: Creating...
time_rotating.mytoken["one"]: Creating...
time_rotating.mytoken["two"]: Creation complete after 0s [id=2023-10-06T22:58:23Z]
time_rotating.mytoken["one"]: Creation complete after 0s [id=2023-10-06T22:58:23Z]
teleport_provision_token.node_join["one"]: Creating...
teleport_provision_token.node_join["two"]: Creating...
teleport_provision_token.node_join["two"]: Creation complete after 0s [id=1696633103859192400]
teleport_provision_token.node_join["one"]: Creation complete after 0s [id=1696633103892741105]
Apply complete! Resources: 6 added, 0 changed, 0 destroyed.
% sleep 60
% TF_VAR_needed='[]' terraform apply
...
random_password.node_join["two"]: Destroying... [id=none]
random_password.node_join["one"]: Destroying... [id=none]
random_password.node_join["one"]: Destruction complete after 0s
random_password.node_join["two"]: Destruction complete after 0s
Apply complete! Resources: 0 added, 0 changed, 2 destroyed.
Since time_rotating isn't a real resource, it doesn't need to be destroyed; it is simply removed from the state file. When Terraform does the refresh, it sees that my expired tokens are gone. That's fine because I have specified that I don't need them anyway, so it doesn't need to delete them from Teleport. It only cleans up the random_password resources that don't correspond to my now-empty list.
If you aren't able to calculate what list of tokens is needed for the current terraform run, then this approach probably won't satisfy the root of the issue.
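For illustration, here is one way the list might be calculated rather than passed in. This is only a sketch: the `hosts` variable and its `provisioned` flag are hypothetical and not part of the configuration above. They stand in for whatever source of truth describes which hosts still need to join.

```hcl
# Hypothetical input: every known host, with a flag saying whether it has
# already joined the cluster.
variable "hosts" {
  type = map(object({
    provisioned = bool
  }))
  default = {
    one = { provisioned = false }
    two = { provisioned = true }
  }
}

locals {
  # Names of hosts that still need a join token.
  needed = [for name, host in var.hosts : name if !host.provisioned]
}
```

With something like this, the `for_each = toset(var.needed)` lines would use `toset(local.needed)` instead.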
Hello!
Thanks for the detailed response, but unfortunately, all of the resources are configured dynamically. We could try to encourage users to adopt the 2-phase commit style, but people would likely forget to add the provisioning token which would result in more issues than just a noisy TF plan.
If there was a technical guard rail that we could implement in Terraform to cross reference the resources and the tokens then we might consider doing it. However, I don't think there is a reasonable way between variable validation and Sentinel policies that would enable this.
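As a rough sketch of what such a guard rail might look like, assuming Terraform 1.4 or later (for `terraform_data`), a precondition could fail the plan when a host has no matching entry in the token list. Both variables here are hypothetical stand-ins, and it is unclear whether this would fit a fully dynamic configuration.

```hcl
# Hypothetical input: names of the hosts being provisioned in this run.
variable "host_names" {
  type    = set(string)
  default = []
}

# Hypothetical input: names that will receive a join token.
variable "needed" {
  type    = list(string)
  default = []
}

# Carries no real infrastructure; it exists only to hold the check.
resource "terraform_data" "token_guard" {
  lifecycle {
    precondition {
      # True only when no host is left without a token entry.
      condition     = length(setsubtract(var.host_names, toset(var.needed))) == 0
      error_message = "Every entry in var.host_names needs a matching entry in var.needed."
    }
  }
}
```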
Thanks
Same issue. Now this is not exactly an elegant solution
Hi all,
It looks like the teleport_provision_token resource requires that the expiry time is always in the future.
Issue
When the token expires after initial creation, the Terraform plan fails because the token's expiry period is no longer in the future which breaks the above contract.
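A minimal sketch of a configuration that would be expected to hit this, assuming the attribute layout used elsewhere in this thread for provider `~> 13.0`. The token name and the timestamp are placeholders, and this is not the reporter's original main.tf.

```hcl
resource "teleport_provision_token" "example" {
  metadata = {
    # Placeholder name.
    name = "example-join-token"
    # Placeholder: any fixed timestamp that is still in the future at the
    # first apply and already in the past by the time of a later plan.
    expires = "2023-10-06T23:00:00Z"
  }
  spec = {
    roles = ["Node"]
  }
}
```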
Why is this an issue?
Teleport node join tokens are used once for host provisioning and are thrown away after a host has been provisioned. This means we need to continuously re-provision a new token, which leads to noise in the Terraform plans and is undesirable from a security perspective.
Reproduction
main.tf
Run terraform plan and watch the plan fail
Error
Desired State
Plan should succeed if there is no change detected for the token.