hashicorp / terraform-cdk

Define infrastructure resources using programming constructs and provision them using HashiCorp Terraform
https://www.terraform.io/cdktf
Mozilla Public License 2.0

Multi-Stack deployments don't update state if one stack fails #1938

Closed javierlga closed 2 years ago

javierlga commented 2 years ago

Community Note

cdktf & Language Versions

CDKTF version: 0.11.2
Terraform version: 1.1.4

Affected Resource(s)

Debug Output

N/A

Expected Behavior

When deploying multiple stacks with cross-stack references, every stack should update its state file correctly, even if one of the stacks fails.

Actual Behavior

When deploying multiple stacks with cross-stack references, if one of the stacks fails, the state files are not updated correctly.

Steps to Reproduce

  1. Create multiple stacks (network, GKE, GCS, etc.).
  2. Enable cross-stack references by consuming the network ID from the network stack in the GKE stack.
  3. If the GCS stack fails for any reason, the network resource is created in the Google Cloud console, but the state of the network stack is empty.
  4. When running another cdktf deploy '*', the operation fails because a network with the same name already exists.

Note: We're using Terraform Cloud workspaces as backends.
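To make the expected behavior concrete, here is a minimal toy model in TypeScript. It is only a sketch of the semantics described above, not how cdktf or Terraform Cloud are implemented. Every name in it (Stack, deployAll, the resource IDs, the error message) is hypothetical, and it assumes the stacks are listed in dependency order.

```typescript
// Toy model of the expected behavior. NOT cdktf internals.
// All names (Stack, deployAll, resource IDs) are hypothetical.

interface Stack {
  name: string;
  dependsOn: string[]; // names of stacks whose outputs this stack consumes
  apply: () => string[]; // returns created resource IDs, or throws on failure
}

interface DeployResult {
  state: Record<string, string[]>; // per-stack state, written as soon as a stack succeeds
  failed: string[];
  skipped: string[];
}

// Assumption: stacks are passed in dependency order.
function deployAll(stacks: Stack[]): DeployResult {
  const result: DeployResult = { state: {}, failed: [], skipped: [] };

  for (const stack of stacks) {
    // Skip a stack if any stack it depends on has no recorded state.
    const blocked = stack.dependsOn.some((dep) => !(dep in result.state));
    if (blocked) {
      result.skipped.push(stack.name);
      continue;
    }
    try {
      // Record state per stack, immediately, so a later failure cannot lose it.
      result.state[stack.name] = stack.apply();
    } catch {
      result.failed.push(stack.name);
    }
  }
  return result;
}

// Mirrors the scenario above: network and GKE succeed, GCS fails.
const result = deployAll([
  { name: "network", dependsOn: [], apply: () => ["vpc-main"] },
  { name: "gke", dependsOn: ["network"], apply: () => ["cluster-main"] },
  {
    name: "gcs",
    dependsOn: [],
    apply: () => {
      throw new Error("service account lacks bucket permissions");
    },
  },
]);

console.log(Object.keys(result.state), result.failed, result.skipped);
```

In this model the network and GKE stacks keep their recorded state even though the GCS stack fails, so a second run would not try to create the network again. The behavior reported here is the opposite: the network exists in Google Cloud, but its stack's state is empty.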

xiehan commented 2 years ago

Closing this as a duplicate of #1836 -- please watch there for updates

javierlga commented 2 years ago

Hey @xiehan, thanks for taking a look at this. Before opening this issue I did take a look at #1836; that's a different issue, though somewhat related to multi-stack deployments. I'd say that deploying multiple stacks using cdktf deploy '*' has two problems:

  1. #1836 (I had the same problem where all my states got locked when one stack failed)

  2. This issue, where the state file is not updated correctly.

ansgarm commented 2 years ago

Hi @javierlga,

could you by chance help me get this reproduced? Do you maybe even have a reproduction example that can help here?

When running another cdktf deploy '*', the operation fails because a network with the same name already exists.

To me this sounds like the Google Terraform Provider crashed in a way that created the resource but did not tell Terraform about it (hence it is not tracked by Terraform).

javierlga commented 2 years ago

Hey @ansgarm,

I don't think this is a provider issue: the provider works without problems when deploying a single stack. Every resource gets created, the Terraform workspace state is updated correctly, etc.

I don't have a public example available right now, but you can create two stacks and deploy them using cdktf deploy '*', where one of the stacks must fail. In my case, the service account used to create GCS buckets was missing some permissions.

ansgarm commented 2 years ago

create two stacks, and deploy them using cdktf deploy '*' but one of the stacks must fail

This sounds like the test case that we added for #1836. Is that right, @javierlga? Or does it have to be different?

If you can reproduce the issue locally, it would be great if you could give the preview build a spin as soon as we merge #1987.

github-actions[bot] commented 1 year ago

I'm going to lock this issue because it has been closed for 30 days. This helps our maintainers find and focus on the active issues. If you've found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.