hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Terraform Crash #35477

Open bhechinger opened 1 month ago

bhechinger commented 1 month ago

Terraform Version

❯ terraform version
Terraform v1.9.2
on linux_amd64
+ provider registry.terraform.io/hashicorp/google v5.37.0
+ provider registry.terraform.io/hashicorp/google-beta v5.37.0

Terraform Configuration Files

I'm unable to share it. It's a module I've written that is mostly Cloud Foundation Fabric modules with a few "raw" Google resources as well. In this particular instance I'm just trying to apply this one resource:

resource "google_compute_address" "validator" {
  count = var.details.instances_per_region
  name  = format("validator-%02d", count.index + 1)

  project = var.project_id
  region  = var.details.region
}

Debug Output

terraform.log.gz

EDIT: gist is getting cut off.

Expected Behavior

Terraform should have created my resources instead of crashing.

Actual Behavior

Terraform crashed

Steps to Reproduce

Trying to create only one set of resources in an empty environment.

  1. terraform init
  2. terraform apply -target='module.infra.google_compute_address.validator'

Additional Context

No response

References

No response

bhechinger commented 1 month ago

Interesting thing I just noticed: the object it complains about in the panic is different every time.

panic: checkable object status report for unexpected checkable object module.infra[0].module.nat.var.endpoint_types

panic: checkable object status report for unexpected checkable object module.infra[0].module.ember-template.var.options

liamcervante commented 1 month ago

Hi @bhechinger, the debug output you have shared doesn't show Terraform crashing - it has an error about the state lock being unavailable which suggests you have more than one Terraform operation executing against the same state file at a time.

Possibly the wrong debug output was shared?

bhechinger commented 1 month ago

You are absolutely correct. I forgot to reset the lock after the last crash. I have, however, apparently run into size limitations with gist and it's cutting off the log file. Here is the output.

terraform.log.gz

liamcervante commented 1 month ago

Thanks @bhechinger, that's definitely a crash stack trace!

The error message in there indicates that the listed variables likely have some form of validation attached to them. For some reason Terraform isn't picking up and registering that validation during the analysis phase, but it then attempts to validate everything properly later and reports that it found a validation it didn't expect.
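For context, "validation attached to a variable" here means a validation block inside the variable declaration, which, as far as I understand, is what makes Terraform treat the variable as a checkable object. A minimal sketch, assuming a hypothetical variable name (`example_count`) that is not taken from the reporter's modules:

```hcl
# Hypothetical variable, for illustration only; the name and rule are assumptions
# and do not come from the configuration discussed in this issue.
variable "example_count" {
  type        = number
  description = "Number of instances to create."
  default     = 1

  # The validation block is the part that attaches a check to the variable.
  validation {
    condition     = var.example_count >= 1
    error_message = "The example_count value must be at least 1."
  }
}
```

With a declaration like this, passing a value below 1 should fail at plan time with the error message above. The crash in this issue concerns how such checks are tracked internally, not whether the condition itself passes.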

Are you able to try some things out / answer the following questions for more info?

  1. Does terraform plan -target='module.infra.google_compute_address.validator' also panic in the same way (i.e. without it going on to do an apply)?
  2. Does it panic in the same way without the -target option?
  3. Do you know if this affects any other versions of Terraform? For example, did it start crashing right after you upgraded to v1.9.2?
  4. Have you made any changes to the configuration recently, such as adding the new resource you are trying to target or linking in a new module?

I appreciate you might not be able to share all the configuration you have, but is there any connection between the resource you are targeting and the variables that are causing the crash, for example via chains of references through the intermediary module call?

liamcervante commented 1 month ago

I haven't been able to replicate this yet, so I've put together https://github.com/hashicorp/terraform/pull/35535 which adds more logging around this area so we'll get more information if this happens after the upcoming 1.9.4 release.

liamcervante commented 1 month ago

Hi @BeneHa and/or @bhechinger, we've now released v1.9.4, which logs more information around the problematic area in the code. Are either of you able to retry your configurations with v1.9.4 and (a) check whether this still happens and (b) share the output with trace logging enabled? Thanks!