hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Incorrect behavior `terraform init` when TF_WORKSPACE set #26127

Open ALutchko opened 3 years ago

ALutchko commented 3 years ago


Terraform CLI and Terraform AWS Provider Version

Terraform v0.12.29, AWS provider v3.3

Affected Resource(s)

backend initialization

Terraform Configuration Files

provider "aws" {
  assume_role {
    role_arn     = var.assume_role_arn
    session_name = "eks_${local.environment}"
  }
  region  = var.region
  version = "~> 3.3"
}

terraform {
  backend "s3" {
    bucket  = "bucketname"
    key     = "my_key"
    encrypt = "true"
    region  = "eu-central-1"
    role_arn = "arn:aws:iam::11111111:role/my_role"
    dynamodb_table = "tf-remote-state-lock"
  }
}

Debug Output

on pastebin

Expected Behavior

new workspace created

Actual Behavior

It fails with a (quite misleading) error:

failed to lock s3 state: 2 errors occurred:
* ResourceNotFoundException: Requested resource not found
* ResourceNotFoundException: Requested resource not found

If I turn on TF_LOG=DEBUG, I see a 400 Bad Request; details are in the pastebin link above.
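
For reference, the full trace can be written to a file instead of stderr (TF_LOG_PATH is Terraform's standard variable for this; the file name is just an example):

# capture the complete debug log to a file
export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform-debug.log
terraform workspace new test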

Steps to Reproduce

run terraform workspace new test

Important Factoids

The backend is not in the same account as the target environment. I use the TF_WORKSPACE variable, and if I just run terraform init it fails because the workspace does not exist yet; the value cannot be entered interactively because the process runs in a pipeline:

terraform init
Initializing modules...

Initializing the backend...

The currently selected workspace (test) does not exist.
  This is expected behavior when the selected workspace did not have an
  existing non-empty state. Please enter a number to select a workspace:

  1. default

  Enter a value: 

Error: Failed to select workspace: input not a valid number

References

https://github.com/terraform-providers/terraform-provider-aws/issues/14896

I found the reason, but it still looks like misbehavior, or at least a proper error message is needed: terraform workspace whatever should run only AFTER terraform init. If you have TF_WORKSPACE set, you may get an error during terraform init saying that the workspace does not exist yet, so you may be tempted to run terraform workspace new before terraform init. Don't do it; set TF_WORKSPACE only after terraform init.
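
A minimal sketch of that ordering for a pipeline (the workspace name test is just an example):

# initialize first, with TF_WORKSPACE unset
terraform init -input=false

# create the workspace only after the backend is initialized
terraform workspace new test

# now it is safe to export the variable for the remaining commands
export TF_WORKSPACE=test
terraform plan -input=false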

mildwonkey commented 3 years ago

Hi @ALutchko , I'm sorry you've experienced this unexpected behavior! Perhaps there's an opportunity for a clearer error message, as you said.

I'm having trouble following the sequence of commands you are running when you get this message. I can see that terraform init gives the expected output if the workspace does not exist, but when you say "Affected Resource(s): backend initialization", what exact command(s) are you running that result in failed to lock s3 state:? The debug log you've attached only seems to be a small snippet rather than the entire output (which would show me what command you are running).

ALutchko commented 3 years ago

Hi @mildwonkey, My apologies for not being clear. "Affected Resource(s)" is part of the ticket template, and I thought it wouldn't be wise to cut it off; however, I admit it may be a bit misleading in this case.

The sequence: one should run terraform workspace new test before any other command, even before terraform init (assuming there are no leftovers from previous runs). Again, my apologies; I'm unsure, and can't check now, whether TF_WORKSPACE should already be set before terraform workspace new ...
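
For clarity, the failing order, as best I can reconstruct it:

export TF_WORKSPACE=test       # possibly already set at this point; I can't verify that now
terraform workspace new test   # run before terraform init -> failed to lock s3 state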

Thank you.

mildwonkey commented 3 years ago

I just tried the following steps with both terraform v0.12.29 and v0.13 (deleting the workspaces between each run)

export TF_WORKSPACE=fake  # has not been created
terraform init
terraform workspace list
  default
* fake

Terraform created and selected the workspace without me having to do it manually.

Is it possible that you had a different issue? Perhaps your credentials or backend configuration weren't working correctly, and that's why you saw the S3 error. I do believe you hit a problem, but I don't think it's with Terraform's workspace mechanism specifically.

ALutchko commented 3 years ago

The error was related to DynamoDB, so maybe with a backend that doesn't use that feature there won't be an error. On the other hand, the folks from the AWS provider repo sent me here; please see the link in the references.

mildwonkey commented 3 years ago

I suspect that there are two problems going on here that aren't actually related, just coincidentally in the same commands - let's see what we can figure out.

The issue you've linked refers to a different workspace-select error than the one in this issue, and that's why the AWS provider pointed you here. Now that we've confirmed that the odd init/workspace behavior is fixed in v0.13, we can see whether the DynamoDB error is related to the workspace or separate.

The first step is to confirm that the credentials you are using have the necessary permissions. Do those same credentials work in other workspaces, or do you have this problem with every configuration using these creds? Can you double check that your permissions match what's required by the s3 backend?
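
One quick way to check (a sketch using the AWS CLI; the table and region names are taken from the backend config above): ResourceNotFoundException from DynamoDB usually means the lock table can't be found in the account and region being queried.

# confirm which identity the backend calls will actually use
aws sts get-caller-identity

# confirm that identity can see the lock table in the backend's region
aws dynamodb describe-table \
  --table-name tf-remote-state-lock \
  --region eu-central-1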

ALutchko commented 3 years ago

I used an admin role to run that, and it was not related to the workspace name. Also, my_role, which is assumed to reach the backend, is admin too.

xarses commented 3 years ago

This reproduces on 0.13.6, 0.14.7, and 0.14.10. We use Atlantis, and this has cropped up before; it appears to have been resolved and come back again. https://github.com/runatlantis/atlantis/issues/661 was linked to https://github.com/hashicorp/terraform/issues/21393, which was closed for an old version.

The problem remains: if TF_WORKSPACE is set when using an S3 backend, init may fail and prompt for the workspace to be selected, regardless of TF_IN_AUTOMATION being set or the lack of a pty.

tpdownes commented 3 years ago

I also observe this behavior when using the gcs backend with 0.15.3.

Ameausoone commented 3 years ago

I get this issue with Terraform 0.15.4 and the gcs backend also.

rmccarthy-ellevation commented 3 years ago

I am getting the same issue using terraform 1.0.0 and using s3 as a backend.

kyle-kluever commented 2 years ago

I'm getting the same issue using terraform 1.0.5 and using s3 as a backend.

DevAndrewGeorge commented 2 years ago

Same issue with Terraform 1.0.4 and a gcs backend. What are the next steps to hopefully get this fixed?

ebdxflr commented 2 years ago

Any workarounds found? I am running TF in automation mode through a Jenkins pipeline, and it fails asking for a workspace at the init stage. If I try to either set TF_WORKSPACE or create/manually select a workspace, it fails asking for init. I'm chasing my tail without any sign of results...

demiangmz commented 2 years ago

I've arrived here from the same search that I assume everyone else did. Here's an acceptable way of resolving this if you're using automation, at least for me: https://support.hashicorp.com/hc/en-us/articles/360043550953-Selecting-a-workspace-when-running-Terraform-in-automation

These options work if the workspace already exists. This issue is dealing with scenarios where the workspace doesn't exist; in that scenario, none of those options works.

You are correct; I deleted the comment to avoid misunderstandings. If you wish, delete that reference as well. Thanks for pointing it out!

yamatt commented 2 years ago

The scenario that bothers me is that I'm using my pipelines to create a new workspace for each environment (prod, test, dev, etc.)

I end up in a catch-22: if I specify the workspace name before the init, the init won't run because it can't find the workspace.

β”‚ Error: Currently selected workspace "test-workspace" does not exist

but I also cannot create a new workspace before init has run:

β”‚ Error: Backend initialization required, please run "terraform init"
β”‚ 
β”‚ Reason: Initial configuration of the requested backend "s3"
β”‚ 
[...]
β”‚ 
β”‚ If the change reason above is incorrect, please verify your configuration
β”‚ hasn't changed and try again. At this point, no changes to your existing
β”‚ configuration or state have been made.

I therefore do not specify the workspace before the init, and allow init to run under the default workspace. Then, when it gets to the build stage, I specify the workspace with TF_WORKSPACE, and along with the -input=false flag it hasn't errored.

Using Terraform 1.1.4
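
A sketch of that ordering (the workspace name test-workspace is taken from the error above):

# init runs on the default workspace, with TF_WORKSPACE unset
terraform init -input=false

# at the build stage, select the workspace via the environment
export TF_WORKSPACE=test-workspace
terraform plan -input=false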

rbabyuk-vs commented 1 year ago

I faced the same issue. It does feel like a bug; it's not convenient to have to apply workarounds for something that worked fine in older versions.

kyle-kluever commented 1 year ago

> Any workarounds found? I am running TF in automation mode through a Jenkins pipeline, and it fails asking for a workspace at the init stage. If I try to either set TF_WORKSPACE or create/manually select a workspace, it fails asking for init. I'm chasing my tail without any sign of results...

This is how I've worked around the issue.

I use S3 as my backend and have it configured as an "empty" backend.

terraform {
  required_version = "~> 1.3.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.17.1"
    }
  }

  backend "s3" {
  }
}

In my CI/CD workflows I do these steps:

  1. Run terraform init without TF_WORKSPACE env var set.
  2. Set TF_WORKSPACE env var with the workspace name (my environment name for my case)
  3. Run terraform workspace select, falling back to terraform workspace new if the select fails.
  4. Run terraform init again, now with the TF_WORKSPACE var set, and you're good to go with plan/apply/etc. commands.

Here's the generic GitHub Actions code. I use a lot of variables to not repeat myself with the multiple init commands.

- name: Terraform Init
  run: |
    terraform init -input=false \
      -backend-config="bucket=${TERRAFORM_STATE_BUCKET}" -backend-config="key=terraform.tfstate" \
      -backend-config="region=${AWS_REGION}" -backend-config="workspace_key_prefix=${{ steps.aws-creds.outputs.aws-account-id }}/${REPO}" \
      -backend-config="dynamodb_table=TerraformState" -backend-config="kms_key_id=${TERRAFORM_STATE_KEY}" \
      -backend-config="acl=bucket-owner-full-control" -backend-config="encrypt=true"

# https://github.com/hashicorp/terraform/issues/26127
# https://github.com/hashicorp/terraform/issues/16191
- name: Terraform Workspace
  run: |
    # GITHUB_ENV takes effect in subsequent steps only, so TF_WORKSPACE is not
    # yet set for the select/new commands below (which is what we want here)
    echo "TF_WORKSPACE=${WORKSPACE}" >> $GITHUB_ENV
    terraform workspace select $WORKSPACE || terraform workspace new $WORKSPACE
  env:
    WORKSPACE: ${{ github.event.client_payload.source.environment }}

- name: Terraform Init
  run: |
    terraform init -input=false \
      -backend-config="bucket=${TERRAFORM_STATE_BUCKET}" -backend-config="key=terraform.tfstate" \
      -backend-config="region=${AWS_REGION}" -backend-config="workspace_key_prefix=${{ steps.aws-creds.outputs.aws-account-id }}/${REPO}" \
      -backend-config="dynamodb_table=TerraformState" -backend-config="kms_key_id=${TERRAFORM_STATE_KEY}" \
      -backend-config="acl=bucket-owner-full-control" -backend-config="encrypt=true"