spacelift-io / terraform-provider-spacelift

Terraform provider to interact with Spacelift
MIT License

`spacelift_space` deletion fails in same run as deleting dependent resources #516

Open jeohist opened 8 months ago

jeohist commented 8 months ago

I am writing tests for a module that creates several `spacelift_space` resources, each containing `spacelift_stack` resources. During the destroy phase, these tests almost always fail with the following errors:

Error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (e-5, e-2)
Error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (c-5, c-4, c-1, c-2, c-3)
Error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (d-3, d-1, d-5)
Error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (a-1, a-4, a-3)
Error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (b-5, b-2, b-3)

It seems like the provider is attempting to delete the Space too quickly for the API to catch up. The error messages show that the destroy has partially succeeded, and locally a second `terraform destroy` succeeds. Because this happens during module testing, I have to clean up the leftover resources myself. I can't work around it with `time_sleep` either, because that would create a cyclic dependency.

Here's some reproduction code:

locals {
  stacks = toset(["1", "2", "3", "4", "5"])
  spaces = toset(["a", "b", "c", "d", "e"])
  combined = toset(flatten([
    for stack in local.stacks : [
      for space in local.spaces : {
        stack = stack
        space = space
      }
    ]
  ]))
}

data "spacelift_gitlab_integration" "this" {}

data "gitlab_project" "this" {
  path_with_namespace = "experiments/repro"
}

resource "spacelift_space" "this" {
  for_each = local.spaces

  name = each.key

  parent_space_id = "root"
}

resource "spacelift_stack" "this" {
  for_each = {
    for combination in local.combined : "${combination.stack}-${combination.space}" => combination
  }

  gitlab {
    id        = data.spacelift_gitlab_integration.this.id
    namespace = "experiments"
  }

  name     = "${each.value["space"]}/${each.value["stack"]}"
  space_id = spacelift_space.this[each.value["space"]].id

  repository = data.gitlab_project.this.name
  branch     = data.gitlab_project.this.default_branch
}
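
On the `time_sleep` workaround mentioned above: the following is a minimal sketch, not a confirmed fix, of one arrangement that should not form a cycle when the spaces and stacks are declared in the same configuration, as they are in this reproduction. It assumes the `hashicorp/time` provider is available; the resource name `after_stacks` and the 30-second duration are arbitrary placeholders. It would only help if timing is actually the cause.

```hcl
# Sketch only: assumes the locals and data sources from the reproduction above
# and the hashicorp/time provider. Names and durations are placeholders.

resource "spacelift_space" "this" {
  for_each = local.spaces

  name            = each.key
  parent_space_id = "root"
}

# Depends on the spaces, so it is destroyed before them.
# destroy_duration delays this resource's own destroy step.
resource "time_sleep" "after_stacks" {
  depends_on = [spacelift_space.this]

  destroy_duration = "30s"
}

resource "spacelift_stack" "this" {
  for_each = {
    for combination in local.combined : "${combination.stack}-${combination.space}" => combination
  }

  # Stacks depend on the sleep, so they are destroyed before it runs.
  depends_on = [time_sleep.after_stacks]

  gitlab {
    id        = data.spacelift_gitlab_integration.this.id
    namespace = "experiments"
  }

  name     = "${each.value["space"]}/${each.value["stack"]}"
  space_id = spacelift_space.this[each.value["space"]].id

  repository = data.gitlab_project.this.name
  branch     = data.gitlab_project.this.default_branch
}
```

With this layout the stacks depend on the sleep and the sleep depends on the spaces, so a destroy should remove the stacks first, wait for the configured duration, and only then delete the spaces. If the module under test splits these resources differently, the dependency edges may not line up this way.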

Here's the relevant part of the debug logs:

2024-02-22T12:59:34.791+0100 [INFO]  Starting apply for spacelift_space.this["e"]
2024-02-22T12:59:34.791+0100 [DEBUG] spacelift_space.this["e"]: applying the planned Delete change
2024-02-22T12:59:34.791+0100 [INFO]  Starting apply for spacelift_space.this["a"]
2024-02-22T12:59:34.791+0100 [DEBUG] spacelift_space.this["a"]: applying the planned Delete change
2024-02-22T12:59:34.791+0100 [INFO]  Starting apply for spacelift_space.this["d"]
2024-02-22T12:59:34.791+0100 [DEBUG] spacelift_space.this["d"]: applying the planned Delete change
2024-02-22T12:59:34.791+0100 [INFO]  Starting apply for spacelift_space.this["b"]
2024-02-22T12:59:34.791+0100 [DEBUG] spacelift_space.this["b"]: applying the planned Delete change
2024-02-22T12:59:34.791+0100 [INFO]  Starting apply for spacelift_space.this["c"]
2024-02-22T12:59:34.791+0100 [DEBUG] spacelift_space.this["c"]: applying the planned Delete change
2024-02-22T12:59:34.796+0100 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2024-02-22T12:59:34.886+0100 [ERROR] provider.terraform-provider-spacelift_v1.9.3: Response contains error diagnostic: @caller=github.com/hashicorp/terraform-plugin-go@v0.19.0/tfprotov5/internal/diag/diagnostics.go:58 diagnostic_detail= tf_rpc=ApplyResourceChange tf_req_id=bdac0715-6615-f5f4-5f7e-cde77919f4e9 @module=sdk.proto diagnostic_severity=ERROR diagnostic_summary="could not delete space: cannot delete space. this entity has remaining references to it: Stack (d-4, d-3, d-1, d-2)" tf_proto_version=5.4 tf_provider_addr=spacelift.io/spacelift-io/spacelift tf_resource_type=spacelift_space timestamp=2024-02-22T12:59:34.886+0100
2024-02-22T12:59:34.893+0100 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2024-02-22T12:59:34.893+0100 [ERROR] vertex "spacelift_space.this[\"d\"] (destroy)" error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (d-4, d-3, d-1, d-2)
2024-02-22T12:59:34.902+0100 [ERROR] provider.terraform-provider-spacelift_v1.9.3: Response contains error diagnostic: diagnostic_detail= diagnostic_severity=ERROR tf_proto_version=5.4 tf_provider_addr=spacelift.io/spacelift-io/spacelift tf_req_id=8f9889c1-37c5-fcb1-3763-cb656afa61e3 @caller=github.com/hashicorp/terraform-plugin-go@v0.19.0/tfprotov5/internal/diag/diagnostics.go:58 @module=sdk.proto tf_resource_type=spacelift_space diagnostic_summary="could not delete space: cannot delete space. this entity has remaining references to it: Stack (e-5, e-1)" tf_rpc=ApplyResourceChange timestamp=2024-02-22T12:59:34.902+0100
2024-02-22T12:59:34.902+0100 [ERROR] provider.terraform-provider-spacelift_v1.9.3: Response contains error diagnostic: @caller=github.com/hashicorp/terraform-plugin-go@v0.19.0/tfprotov5/internal/diag/diagnostics.go:58 diagnostic_detail= diagnostic_summary="could not delete space: cannot delete space. this entity has remaining references to it: Stack (b-2, b-1, b-3)" tf_rpc=ApplyResourceChange tf_provider_addr=spacelift.io/spacelift-io/spacelift tf_req_id=411c3b9e-f828-ffbb-446b-f805c36ed8ea tf_resource_type=spacelift_space @module=sdk.proto diagnostic_severity=ERROR tf_proto_version=5.4 timestamp=2024-02-22T12:59:34.902+0100
2024-02-22T12:59:34.909+0100 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2024-02-22T12:59:34.909+0100 [ERROR] vertex "spacelift_space.this[\"b\"] (destroy)" error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (b-2, b-1, b-3)
2024-02-22T12:59:34.916+0100 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2024-02-22T12:59:34.916+0100 [ERROR] vertex "spacelift_space.this[\"e\"] (destroy)" error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (e-5, e-1)
2024-02-22T12:59:34.917+0100 [ERROR] provider.terraform-provider-spacelift_v1.9.3: Response contains error diagnostic: diagnostic_detail= diagnostic_severity=ERROR diagnostic_summary="could not delete space: cannot delete space. this entity has remaining references to it: Stack (c-5, c-4, c-1, c-2)" tf_proto_version=5.4 tf_provider_addr=spacelift.io/spacelift-io/spacelift tf_req_id=acb21a03-5042-3b2b-d3ba-c2cf79a77925 @module=sdk.proto tf_rpc=ApplyResourceChange tf_resource_type=spacelift_space @caller=github.com/hashicorp/terraform-plugin-go@v0.19.0/tfprotov5/internal/diag/diagnostics.go:58 timestamp=2024-02-22T12:59:34.917+0100
2024-02-22T12:59:34.923+0100 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2024-02-22T12:59:34.923+0100 [ERROR] vertex "spacelift_space.this[\"c\"] (destroy)" error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (c-5, c-4, c-1, c-2)
2024-02-22T12:59:34.981+0100 [ERROR] provider.terraform-provider-spacelift_v1.9.3: Response contains error diagnostic: tf_resource_type=spacelift_space @caller=github.com/hashicorp/terraform-plugin-go@v0.19.0/tfprotov5/internal/diag/diagnostics.go:58 diagnostic_summary="could not delete space: cannot delete space. this entity has remaining references to it: Stack (a-5, a-1, a-2)" tf_provider_addr=spacelift.io/spacelift-io/spacelift diagnostic_severity=ERROR tf_proto_version=5.4 tf_req_id=1a108202-a119-82c5-95b1-9eb5adc30e3a tf_rpc=ApplyResourceChange @module=sdk.proto diagnostic_detail= timestamp=2024-02-22T12:59:34.981+0100
2024-02-22T12:59:34.991+0100 [DEBUG] State storage *statemgr.Filesystem declined to persist a state snapshot
2024-02-22T12:59:34.991+0100 [ERROR] vertex "spacelift_space.this[\"a\"] (destroy)" error: could not delete space: cannot delete space. this entity has remaining references to it: Stack (a-5, a-1, a-2)
2024-02-22T12:59:35.016+0100 [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
2024-02-22T12:59:35.019+0100 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/spacelift-io/spacelift/1.9.3/darwin_arm64/terraform-provider-spacelift_v1.9.3 pid=27213
2024-02-22T12:59:35.019+0100 [DEBUG] provider: plugin exited

marcinwyszynski commented 7 months ago

> It seems like the provider is attempting to delete the Space too quickly for the API to catch up.

If you suspect eventual consistency, it's definitely not that: these operations are all transactional unless a stack destructor is involved (which defers the stack deletion), and that does not seem to be the case here.

It's unclear to me why Terraform/OpenTofu would try to delete a space before it has deleted all the stacks that belong to that space. My understanding is that this has something to do with how it sees and executes these dependencies. Perhaps the funky way in which they're defined confuses the dependency resolver?

The part of the logs showing the ordering of calls (start and finish times) would be most interesting here, to understand whether Terraform/OpenTofu has been messing up the ordering.
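
To help rule out the way the combinations are defined as a factor, the reproduction could be reduced to a flat map built with `setproduct`. This is only a sketch, assuming the `spacelift_space` resource and the data sources from the reproduction stay unchanged. As far as I understand, Terraform records dependencies between whole resource blocks rather than between individual `for_each` instances, so this is not expected to change the destroy ordering, but it removes one variable.

```hcl
locals {
  stacks = toset(["1", "2", "3", "4", "5"])
  spaces = toset(["a", "b", "c", "d", "e"])

  # One entry per (space, stack) pair, keyed like "a-1".
  # Note: the original reproduction keys these as "<stack>-<space>" instead.
  combined = {
    for pair in setproduct(local.spaces, local.stacks) :
    "${pair[0]}-${pair[1]}" => { space = pair[0], stack = pair[1] }
  }
}

resource "spacelift_stack" "this" {
  for_each = local.combined

  gitlab {
    id        = data.spacelift_gitlab_integration.this.id
    namespace = "experiments"
  }

  name     = "${each.value.space}/${each.value.stack}"
  space_id = spacelift_space.this[each.value.space].id

  repository = data.gitlab_project.this.name
  branch     = data.gitlab_project.this.default_branch
}
```

If the error still appears with this simpler definition, the nested `for` expressions in the original are unlikely to be the cause.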