hashicorp / terraform-aws-vault

A Terraform Module for how to run Vault on AWS using Terraform and Packer
Apache License 2.0

sign-request script timeout #261

Open Carles-Figuerola opened 2 years ago

Carles-Figuerola commented 2 years ago

Hi all,

This might not be the best forum for a question like this, but it does affect this module. We use a similar version of this script in our own code and have run into a problem that is blocking us.

We automate the script by running it as a step in our Terraform module:

data "external" "vault" {
  program = ["/usr/bin/env", "python3", "${path.module}/sign-request.py", ""]
}

provider "vault" {
  address   = var.vault_address
  namespace = var.vault_namespace
  auth_login {
    path      = "auth/aws/${var.vault_namespace}/login"
    namespace = var.vault_namespace
    parameters = {
      role                    = var.vault_role
      iam_http_request_method = data.external.vault.result.iam_http_request_method
      iam_request_body        = data.external.vault.result.iam_request_body
      iam_request_headers     = data.external.vault.result.iam_request_headers
      iam_request_url         = data.external.vault.result.iam_request_url
    }
  }
}
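For context, here is a minimal Python sketch of the output the `external` data source expects from such a script: a single flat JSON object whose values are all strings. The helper name `build_login_params` and the placeholder values are hypothetical, and the signing step itself is left out. The base64 encoding follows what Vault's AWS auth login endpoint accepts; our actual script may differ in detail.

```python
import base64
import json


def build_login_params(method, url, body, headers):
    """Package an already-signed sts:GetCallerIdentity request as a flat dict of strings.

    Hypothetical helper: signing is assumed to have happened elsewhere.
    """
    def b64(text):
        return base64.b64encode(text.encode("utf-8")).decode("ascii")

    return {
        "iam_http_request_method": method,
        "iam_request_url": b64(url),
        "iam_request_body": b64(body),
        # Headers are serialized to JSON first, then base64-encoded.
        "iam_request_headers": b64(json.dumps(headers)),
    }


if __name__ == "__main__":
    # Placeholder values only; a real script would sign the request first.
    example = build_login_params(
        "POST",
        "https://sts.amazonaws.com/",
        "Action=GetCallerIdentity&Version=2011-06-15",
        {"X-Amz-Date": ["20240101T000000Z"]},
    )
    # Terraform reads this JSON object from stdout.
    print(json.dumps(example))
```

Because every value is a plain string, the result can be referenced directly as `data.external.vault.result.<key>` in the provider block.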

We deploy our Terraform infrastructure with a pipeline that runs:

terraform plan (stores output zipfile in s3)  ---> ask for approval ---> terraform apply (gets the plan from s3)

However, we have started running into a limitation with this Vault authentication method. When we use the script for Vault authentication, the signed STS request headers are computed during the plan stage. Because we run the apply stage from a saved plan file, the Vault auth process is not repeated, and we have found that the signed request expires after 15 minutes (as far as I can tell, this is to prevent replay attacks against the Vault server using sniffed headers).
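To make the failure mode concrete, here is a rough Python sketch that checks whether a stored `iam_request_headers` value has aged past the 15-minute window. It assumes the value is base64-encoded JSON containing an `X-Amz-Date` header in the usual `YYYYMMDDTHHMMSSZ` form, with the header value either a string or a one-element list. The function names `signed_at` and `is_stale` are hypothetical.

```python
import base64
import json
from datetime import datetime, timedelta, timezone

# Window we observed; treated here as an assumption, not a documented constant.
MAX_AGE = timedelta(minutes=15)


def signed_at(iam_request_headers_b64):
    """Return the signing time recorded in the X-Amz-Date header (UTC)."""
    headers = json.loads(base64.b64decode(iam_request_headers_b64))
    for name, value in headers.items():
        if name.lower() == "x-amz-date":
            # Some scripts emit header values as lists (Go-style headers).
            if isinstance(value, list):
                value = value[0]
            parsed = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
            return parsed.replace(tzinfo=timezone.utc)
    raise KeyError("X-Amz-Date header not found")


def is_stale(iam_request_headers_b64, now=None):
    """True if the signed request is older than MAX_AGE at time `now`."""
    now = now or datetime.now(timezone.utc)
    return now - signed_at(iam_request_headers_b64) > MAX_AGE
```

A check like this could run at the start of the apply stage to fail fast with a clear message, though it only detects the expired signature and does not refresh it.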

Other providers embed this (or a similar) authentication process inside the provider itself, so it re-runs and refreshes credentials on the apply step too. Because we produce the login parameters with a script behind a data source, that does not happen.

What would be a viable solution to this problem? Ideally we don't want to skip either the approval or the stored plan, since the stored plan lets us be 100% sure that what we apply is the plan we reviewed.

cc @Etiene as I saw they added the file to this repository and hopefully they have deeper knowledge on this process.