Unnecessary local-to-local migration attempt when running "terraform init"

Terraform Version

1.0.10 and 1.0.11, so far as I tested

Scenario 1 to reproduce the issue

The Terraform configuration files, with an explicit "local" backend config

resource "null_resource" "test" {
  triggers = {
    a = 100
  }
}
terraform {
  backend "local" {
  }
}

terraform init
terraform workspace new myws
terraform apply
# tfstate is written to file terraform.tfstate.d/myws/terraform.tfstate
# one might check the tfstate file into VCS.

# remove dir .terraform/
# the initialized backend state (.terraform/terraform.tfstate) is gone after this step
# this is equivalent to someone making a fresh checkout of the source code, including the tfstate file
rm -rf .terraform/

# init, while dir terraform.tfstate.d/ is present, and dir .terraform/ is not
terraform init

Then you get the local-to-local migration prompt shown below:

Initializing the backend...
Do you want to migrate all workspaces to "local"?
  Both the existing "local" backend and the newly configured "local" backend
  support workspaces. When migrating between backends, Terraform will copy
  all workspaces (with the same names). THIS WILL OVERWRITE any conflicting
  states in the destination.

  Terraform initialization doesn't currently migrate only select workspaces.
  If you want to migrate a select number of workspaces, you must manually
  pull and push those states.

  If you answer "yes", Terraform will migrate all states. If you answer
  "no", Terraform will abort.

  Enter a value:

Scenario 2 to reproduce the issue

The terraform configuration files, with no explicit "local" backend config.

resource "null_resource" "test" {
  triggers = {
    a = 100
  }
}

terraform init
# because there is no explicit backend config, the file .terraform/terraform.tfstate is not created

terraform workspace new myws
terraform apply
# tfstate is written to file terraform.tfstate.d/myws/terraform.tfstate

# Now, add the explicit "local" backend config, as below
#   terraform {
#     backend "local" {
#     }
#   }

# Then init again
terraform init

Then you get the same local-to-local migration prompt as shown above.

Other context

If I remove the dir terraform.tfstate.d before the second terraform init, the issue does not occur.
If the explicit "local" backend config in scenario 1 sets workspace_dir = "whatever", the issue does not occur.
If I don't create a new Terraform workspace and just use the "default" one, the issue does not occur.
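
The second observation above can be sketched as a small shell step that writes the backend config with a non-default workspace directory. This is only an illustration: the file name backend.tf is an assumption, and "whatever" is the placeholder value from the observation, not a recommended directory name.

```shell
#!/bin/sh
# Write an explicit "local" backend block that overrides the default
# workspace directory (terraform.tfstate.d).
# Assumption: the backend config lives in a file named backend.tf.
cat > backend.tf <<'EOF'
terraform {
  backend "local" {
    # Placeholder value taken from the observation above.
    workspace_dir = "whatever"
  }
}
EOF
```

With this file in place from the start, the steps of scenario 1 (terraform init, terraform workspace new myws, terraform apply, rm -rf .terraform/, terraform init) reportedly do not produce the migration prompt.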

Why I consider it an issue

In my example, the explicit "local" backend config has no non-default settings, so the behavior is the same whether the block exists or not, and no migration is needed at all.
The two scenarios are not corner cases, especially the first one, which I think happens in certain workflows.
Moreover, the issue happens only when using a non-default workspace with the default workspace dir, terraform.tfstate.d.

Hi @ckyoog! Thanks for sharing this.

I think what you've discovered here is that terraform init internally distinguishes between implied local state vs. explicitly using the local backend, even though ultimately both of those indeed lead to Terraform holding an instance of the same local backend. The reasons for this are primarily historical: prior to Terraform v0.9 it was valid to just run operations without initializing any backend -- because backends didn't exist yet -- and the now-removed command terraform remote config was a separate step to enable remote state storage.

This historical quirk is reflected in the wording of this message: notice that it's talking about migrating to the local backend, as if you were not already using the local backend. What you've entered here is a migration codepath to get folks from the Terraform v0.8 world into the Terraform v0.9 world, with an explicit backend configuration.

In Terraform as currently designed, you should omit the backend "local" block unless you actually intend to override the settings. If you're just using the default settings anyway, leaving it unconfigured is the best approach because then no explicit initialization is required; you're effectively operating in Terraform v0.8-and-earlier mode.

The main way we recommend using modern Terraform is with a backend other than backend "local", so that the state will be stored somewhere other than your local disk. If you use remote state then the migration process you saw here is how you get from the local-only mode to the remote state mode.

I'm not sure that we can do anything specific to improve the behavior you saw here due to the Terraform v1.0 Compatibility Promises and how critical terraform init's behavior is to automation workflows.

However, one possible design change we could consider is to detect the presence of implied local state files (terraform.tfstate and/or terraform.tfstate.d files) earlier and behave as if there were a synthetic empty backend "local" block in that case, treating it differently than having no state files and a configured backend block. As part of investigating that further we'd need to research whether there is any workflow it would potentially break; my initial hunch is that treating it as an empty block would only affect the case where you explicitly added an empty block like you shared here, and would therefore be okay, but we also need to consider the interaction with -backend-config command line arguments and environment variables that can affect the implied behavior.
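
To make the detection idea concrete, here is a rough shell sketch of what "implied local state files are present" could mean for a working directory. It is not Terraform's implementation: the function name has_implied_local_state is hypothetical, and the choice to count only workspace directories that actually contain a terraform.tfstate file is an assumption.

```shell
#!/bin/sh
# Sketch only: succeed (exit status 0) if the given directory contains
# implied local state files, fail (exit status 1) otherwise.
has_implied_local_state() {
  dir="${1:-.}"

  # State file of the "default" workspace.
  if [ -f "$dir/terraform.tfstate" ]; then
    return 0
  fi

  # State files of non-default workspaces under the default workspace dir.
  # If the glob matches nothing, the pattern stays literal and -f is false.
  for f in "$dir"/terraform.tfstate.d/*/terraform.tfstate; do
    if [ -f "$f" ]; then
      return 0
    fi
  done

  return 1
}
```

For example, `has_implied_local_state . && echo "implied local state found"` would print the message in the directory from scenario 1 after terraform apply has written terraform.tfstate.d/myws/terraform.tfstate.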

Hi @apparentlymart, thank you for the early response and the detailed explanation. You are always so nice. I have filed quite a few Terraform issues and read many more, and every time I see your detailed and patient comments. It's really appreciated.

I have been using Terraform since 2017 (it was Terraform 0.8??), so I know much of the logic behind the scenes. I totally understand your comments about v1.0 compatibility and how much terraform init's behavior matters to automation workflows, as well as the suggestions to omit the backend "local" block and to use Terraform with a backend other than backend "local".

I believe people who use Terraform in production environments rarely use the "local" backend. Neither do I. Unfortunately, some recent changes in my project introduced the "local" backend to support a special/tricky use case. We already have a remote backend configuration set up, so a "local" backend block must be added explicitly to override the existing remote backend config.

Anyway, this issue is not blocking for me; I have a workaround. I just think that, although you said there is not much you can do, this behavior is confusing and inconsistent. It appears that the default local workspace dir terraform.tfstate.d is special to Terraform. As we know, Terraform compares the cached backend config in .terraform/terraform.tfstate with the backend config in code, say backend.tf, and decides whether to migrate/copy the tfstate accordingly. So I thought that if the cached backend config .terraform/terraform.tfstate was lost or deleted, Terraform would have no way to know the previous backend config during terraform init, and therefore would not know whether, or from where, to migrate/copy the tfstate. But apparently the default local workspace dir is another way for Terraform to know it: when the cached backend config is missing, as long as Terraform sees the dir terraform.tfstate.d, it still tries to migrate/copy the local tfstate to the new backend. BUT (and this is where I think the inconsistency lies), if the local tfstate is stored in a dir other than the default terraform.tfstate.d, nothing happens.
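
The condition described above can be written down as a small pre-flight check that an automation script might run before terraform init. It is a heuristic built only from the observations in this thread (Terraform 1.0.10 and 1.0.11 with an explicit, empty "local" backend block), not a description of Terraform's actual logic; the function name init_may_prompt_migration is hypothetical, and the check does not inspect the configuration itself.

```shell
#!/bin/sh
# Heuristic sketch: succeed if "terraform init" in the given directory
# may show the local-to-local migration prompt reported in this issue.
init_may_prompt_migration() {
  dir="${1:-.}"
  # Cached backend config written by an earlier "terraform init".
  cached="$dir/.terraform/terraform.tfstate"
  # Default directory for non-default workspaces of the local backend.
  ws_dir="$dir/terraform.tfstate.d"

  # Prompt was observed when the cache is missing but the default
  # workspace directory is present.
  [ ! -f "$cached" ] && [ -d "$ws_dir" ]
}
```

A CI job could call `init_may_prompt_migration . && echo "init may ask to migrate workspaces" >&2` to surface the situation before a non-interactive init stalls or aborts.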

The possible change you mentioned sounds good. But it also sounds like a lot of investigation and consideration would be involved. Hopefully you will eventually find a solution good for both sides.
