hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

check backend exists and where #27977

Open schollii opened 3 years ago

schollii commented 3 years ago


Description

We need to programmatically provision many stacks (for different clients), together with the S3 backend that stores each stack's Terraform state. The current mechanism, which requires running `terraform init`, is painful and not idempotent. It should be possible to define a `backend` block and have Terraform ignore it if the bucket does not exist, and, if it does exist, automatically move the state there without having to run `terraform init`. Terraform could check this at the start and end of `apply`, in case the bucket was created during the apply.

This would make storage management so much easier.
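
For concreteness, this is roughly the shape of the bootstrap we have to script per client today (all names below are made up for illustration):

```bash
# Sketch of the current per-client bootstrap; client/bucket/region names
# are placeholders.
CLIENT="acme"
BUCKET="tf-state-${CLIENT}"

# The bucket has to exist before `terraform init` can configure the S3
# backend, and re-running create-bucket can fail if the bucket already exists.
aws s3api create-bucket --bucket "$BUCKET" --region us-east-1

# Point the backend at the bucket and (re-)initialize; this is the step
# we would like Terraform to handle automatically.
terraform init \
  -backend-config="bucket=${BUCKET}" \
  -backend-config="key=stacks/${CLIENT}/terraform.tfstate" \
  -backend-config="region=us-east-1"
```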

I don't know whether this is an AWS provider issue or a Terraform issue; I suspect both, since the provider would probably have to do extra work that Terraform can then use. Let me know what you think.

Alternatively, if one could determine whether the backend bucket exists in S3, and whether the state has already been moved there, that check could control whether Terraform should generate a backend.tf file. I think the check could be done via a local-exec that uses the AWS CLI, but that seems a bit clunky.
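
Something like the following is what I have in mind, sketched with placeholder bucket/key names and assuming a Terraform version that supports `terraform init -migrate-state`:

```bash
BUCKET="tf-state-acme"                 # placeholder
KEY="stacks/acme/terraform.tfstate"    # placeholder

if aws s3api head-bucket --bucket "$BUCKET" 2>/dev/null; then
  # Bucket exists: generate backend.tf and migrate any local state into it.
  cat > backend.tf <<EOF
terraform {
  backend "s3" {
    bucket = "${BUCKET}"
    key    = "${KEY}"
    region = "us-east-1"
  }
}
EOF
  # May prompt for confirmation unless -force-copy is also given.
  terraform init -migrate-state -input=false
else
  # Bucket not created yet: keep using local state for this run.
  rm -f backend.tf
  terraform init -input=false
fi
```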

Are there other approaches to solving this chicken-and-egg problem?


bflad commented 3 years ago

Hi @schollii 👋 Thank you for raising this.

Backends in Terraform CLI are wholly implemented in that codebase at the moment, including the logic for operating with the various providers (AWS, GCP, Azure, Kubernetes, etc.). For example, the S3-backend-specific code can be found here: https://github.com/hashicorp/terraform/tree/main/backend/remote-state/s3

Given there is nothing necessarily to be done in this codebase, I'm going to transfer this issue upstream. For any backend-related questions, you may want to post in the HashiCorp Community Forums, since the Terraform team uses its issue tracker only for bug reports and feature requests.

Aside: We do have longer term plans to split backends from the Terraform CLI codebase similar to how providers were split out a few years ago, but there are no official timelines for this effort yet.

apparentlymart commented 3 years ago

Hi @schollii! Thanks for this feature request.

I remember responding to a very similar feature request before, but unfortunately I wasn't able to find it with some quick searching. So instead I'm going to summarize what I recall saying there; hopefully we can find the other issue later and merge this one into it, and if not, at least the information will be in both places...

The key design problem here is the very "chicken and egg" problem you described in your comment: Terraform needs to interact with the backend throughout plan and apply operations, so achieving this feature request will require defining exactly how the dynamic transition from local to remote state ought to work, including what should happen in the unhappy path where `terraform apply` fails after some objects have already been created but the S3 bucket for state storage doesn't exist yet.

I expect a successful design for this would need to address both the technical concern of how Terraform might effectively switch backends dynamically in the middle of an operation and the user-experience concern of making sure users understand the situation at all times, in particular whether their current working directory holds the only record of real objects existing in the remote system, so they can make sure not to lose or delete it.

A particularly tricky case is when Terraform is running in a remote automation system where the user doesn't have direct access to the filesystem. If Terraform exits while still using local state but with real objects already created, that automation would need to know to preserve the local state file for the next run. That is a task typically delegated to a remote backend, so once the automation is implementing it anyway, it's arguably simpler not to use the built-in remote state at all and have the surrounding automation deal with state in all cases. Since it's already possible in principle for surrounding automation to do that (by restoring and saving the local terraform.tfstate file before and after each operation), I expect we'd want to investigate the viability of that existing approach first if the new approach would end up requiring it to be implemented anyway.
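
To make that concrete, the existing approach is roughly the following, with placeholder bucket/key names and error handling omitted:

```bash
BUCKET="tf-state-acme"                 # placeholder
KEY="stacks/acme/terraform.tfstate"    # placeholder

# Restore the previous state file, if any, before the run.
aws s3 cp "s3://${BUCKET}/${KEY}" terraform.tfstate 2>/dev/null || true

terraform init -input=false
terraform apply -auto-approve

# Persist whatever state the run produced, so the next run (or a human)
# can pick it up, even if apply stopped partway through.
aws s3 cp terraform.tfstate "s3://${BUCKET}/${KEY}"
```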

This isn't something that the Terraform Core team will be able to work on detailed design for in the near future, since our focus is elsewhere, but if anyone in the community is interested in proposing something please let us know! We'd prefer to talk with you about your intentions before you spend too much time on designing or implementing anything, because it's not nice for anyone when we need to request large changes to something that someone has already invested significant time in. :confounded:

(Also, if you happen to know of or find a similar issue where I already said something along these lines, please let us know so we can merge the two! I'll watch out for it myself, but there are a lot of issues about various subjects in this repository.)