hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

Terraform plan crashed. #35038

Closed melphleg closed 6 months ago

melphleg commented 6 months ago

Terraform Version

Task         : Terraform
Description  : Execute terraform commands to manage resources on AzureRM, Amazon Web Services(AWS) and Google Cloud Platform(GCP)
Version      : 4.227.24
Author       : Microsoft Corporation

Terraform Configuration Files

...terraform config...

Debug Output

goroutine 1141 [running]:
runtime/debug.Stack()
    /opt/hostedtoolcache/go/1.22.1/x64/src/runtime/debug/stack.go:24 +0x5e
github.com/hashicorp/terraform/internal/logging.PanicHandler()
    /home/runner/work/terraform/terraform/internal/logging/panic.go:84 +0x1ac
panic({0x36b62a0?, 0xc002b21630?})
    /opt/hostedtoolcache/go/1.22.1/x64/src/runtime/panic.go:770 +0x132
github.com/hashicorp/terraform/internal/instances.(Expander).ResourceInstanceKeys(0xc0007c0690?, {{}, {0xc001fee480, 0x1, 0x1}, {{}, 0x44, {0xc00013b9c8, 0x14}, {0xc0007778a0, ...}}})
    /home/runner/work/terraform/terraform/internal/instances/expander.go:444 +0x20d
github.com/hashicorp/terraform/internal/terraform.(evaluationStateData).GetResource(0xc001cb2900, {{}, 0x44, {0xc00013b9c8, 0x14}, {0xc0007778a0, 0x8}}, {{0xc00089d0b0, 0x19}, {0x23, ...}, ...})
    /home/runner/work/terraform/terraform/internal/terraform/evaluate.go:592 +0x286
github.com/hashicorp/terraform/internal/lang.(Scope).evalContext(0xc001cb2990, {0xc000c64058, 0x1, 0x1}, {0x0, 0x0})
    /home/runner/work/terraform/terraform/internal/lang/eval.go:371 +0x19e3
github.com/hashicorp/terraform/internal/lang.(Scope).EvalContext(...)
    /home/runner/work/terraform/terraform/internal/lang/eval.go:245
github.com/hashicorp/terraform/internal/lang.(Scope).EvalExpr(0xc001cb2990, {0x46361d0, 0xc0006c8380}, {{0x4635928?, 0xc000123680?}})
    /home/runner/work/terraform/terraform/internal/lang/eval.go:170 +0x9e
github.com/hashicorp/terraform/internal/terraform.(BuiltinEvalContext).EvaluateExpr(0x0?, {0x46361d0, 0xc0006c8380}, {{0x4635928?, 0xc000123680?}}, {0x0?, 0x0?})
    /home/runner/work/terraform/terraform/internal/terraform/eval_context_builtin.go:322 +0xab
github.com/hashicorp/terraform/internal/terraform.evaluateCountExpressionValue({0x46361d0, 0xc0006c8380}, {0x4658de8?, 0xc001132000?})
    /home/runner/work/terraform/terraform/internal/terraform/eval_count.go:71 +0x8a
github.com/hashicorp/terraform/internal/terraform.evaluateCountExpression({0x46361d0, 0xc0006c8380}, {0x4658de8?, 0xc001132000?}, 0x0)
    /home/runner/work/terraform/terraform/internal/terraform/eval_count.go:31 +0x4a
github.com/hashicorp/terraform/internal/terraform.(NodeAbstractResource).writeResourceState(0xc001ce4900, {0x4658de8, 0xc001132000}, {{}, {0xc001fee480, 0x1, 0x1}, {{}, 0x44, {0xc00013b9b0, ...}, ...}})
    /home/runner/work/terraform/terraform/internal/terraform/node_resource_abstract.go:424 +0x1a5
github.com/hashicorp/terraform/internal/terraform.(nodeExpandPlannableResource).expandResourceInstances(0xc000be1c20, {0x4658de8, 0xc001132b40}, {{}, {0xc001fee480, 0x1, 0x1}, {{}, 0x44, {0xc00013b9b0, ...}, ...}}, ...)
    /home/runner/work/terraform/terraform/internal/terraform/node_resource_plan.go:474 +0x105
github.com/hashicorp/terraform/internal/terraform.(nodeExpandPlannableResource).DynamicExpand(0xc000be1c20, {0x4658de8, 0xc001132b40})
    /home/runner/work/terraform/terraform/internal/terraform/node_resource_plan.go:198 +0x516
github.com/hashicorp/terraform/internal/terraform.(Graph).walk.func1({0x3cbd2a0, 0xc000be1c20})
    /home/runner/work/terraform/terraform/internal/terraform/graph.go:122 +0x822
github.com/hashicorp/terraform/internal/dag.(Walker).walkVertex(0xc0009655c0, {0x3cbd2a0, 0xc000be1c20}, 0xc000fe5a40)
    /home/runner/work/terraform/terraform/internal/dag/walk.go:384 +0x2d7
created by github.com/hashicorp/terraform/internal/dag.(Walker).Update in goroutine 856
    /home/runner/work/terraform/terraform/internal/dag/walk.go:307 +0xff3

[warning]Can't find loc string for key: TerraformPlanFailed

[error]Error: TerraformPlanFailed 11

Finishing: Terraform: PLAN

Expected Behavior

executed init, validate and plan

Actual Behavior

it crashed

Steps to Reproduce

# Define a map to store the names of the key vault secrets
locals {
  key_vault_secrets = {
    apppw      = "gss-PostgreSQLPassword"
    appuser    = "gss-postgreSQLUserName"
    JWT_SECRET = "JWTSECRET"
  }
}

# Fetch data for each key vault secret using for_each
data "azurerm_key_vault" "gssapp" {
  for_each = local.key_vault_secrets
  name = var.GSS_Keyvault_Name
  resource_group_name = var.GSS_Keyvault_Resource_Group

  depends_on = [module.KeyVault]
}

data "azurerm_key_vault_secret" "secrets" {
  for_each = local.key_vault_secrets
  name = each.value
  key_vault_id = data.azurerm_key_vault.gssapp[each.key].id

  depends_on = [module.KeyVault]
}

data "azurerm_service_plan" "existing" {
  count = can(data.azurerm_service_plan.existing[*].id) != 0 ? 1 : 0
  name = var.tf_plan_name
  resource_group_name = var.TF_VAR_deployment_resource_group_name
}

# Check if the service plan exists
locals {
  service_plan_exists = length(data.azurerm_service_plan.existing) > 0
}

# Module to create ServiceBus
module "ServiceBus" {
  providers = {
    azurerm = azurerm
  }
  source = "../ServiceBus"
  serviceBusNamespaceName = var.serviceBusNamespaceName
  servicebus_resource_group = var.servicebus_resource_group
  GSS_API_CHANGED_TOPIC = var.GSS_API_CHANGED_TOPIC
  resource_location = var.resource_location
  subscription_id = var.subscription_id
  tf_plan_name = var.tf_plan_name
  TF_VAR_deployment_resource_group_name = var.TF_VAR_deployment_resource_group_name
  depends_on = [module.KeyVault]
}

module "KeyVault" {
  providers = {
    azurerm = azurerm
  }
  source = "../KeyVault"
  GSS_Keyvault_Name = var.GSS_Keyvault_Name
  GSS_Keyvault_Resource_Location = var.GSS_Keyvault_Resource_Location
  GSS_Keyvault_Resource_Group = var.GSS_Keyvault_Resource_Group
  GSS-postgreSQLUserName = var.GSS-postgreSQLUserName
  GSS-PostgreSQLPassword = var.GSS-PostgreSQLPassword
  GSS-JWTSECRET = var.GSS-JWTSECRET
  application_name_service_principle_object_id = var.application_name_service_principle_object_id
  admin_group_id = var.admin_group_id
  tenant_id = var.tenant_id
  application_name_service_principle = var.application_name_service_principle
  admin_group = var.admin_group
  common_tags = var.common_tags
  app_registration_service_connection_name = var.app_registration_service_connection_name
  app_registration_service_connection_object_id = var.application_name_service_principle_object_id
}

module "PostgreSQL" {
  providers = {
    azurerm = azurerm
  }
  source = "../PostgreSQL"
  TF_VAR_deployment_resource_group_name = var.TF_VAR_deployment_resource_group_name
  resource_location = var.resource_location
  tf_plan_name = var.tf_plan_name
  common_tags = var.common_tags
  baseName = var.baseName
  databaseName = var.databaseName
  stack = var.stack
  subscription_id = var.subscription_id
  GSS_Keyvault_Name = var.GSS_Keyvault_Name
  GSS_Keyvault_Resource_Location = var.GSS_Keyvault_Resource_Location
  GSS_Keyvault_Resource_Group = var.GSS_Keyvault_Resource_Group
  GSS-postgreSQLUserName = var.GSS-postgreSQLUserName
  GSS-PostgreSQLPassword = var.GSS-PostgreSQLPassword
  GSS-JWTSECRET = var.GSS-JWTSECRET
  application_name_service_principle_object_id = var.application_name_service_principle_object_id
  admin_group_id = var.admin_group_id
  depends_on = [module.KeyVault]
}

# Conditionally create the service plan based on its existence
resource "azurerm_service_plan" "gssapiservice" {
  count = local.service_plan_exists ? 0 : 1
  name = var.tf_plan_name
  location = var.resource_location
  resource_group_name = var.TF_VAR_deployment_resource_group_name
  sku_name = "S1"
  os_type = "Linux"
  depends_on = [module.PostgreSQL, module.ServiceBus, module.KeyVault]
}

# Store the service plan ID in a variable if it exists
locals {
  var_service_plan_id = length(data.azurerm_service_plan.existing) > 0 ? data.azurerm_service_plan.existing[0].id : azurerm_service_plan.gssapiservice[0].id
}

output "var_service_plan_id" {
  value = local.var_service_plan_id
}

resource "azurerm_service_plan" "gssappservice" {
  name = var.tf_plan_name_web
  location = var.resource_location
  resource_group_name = var.TF_VAR_deployment_resource_group_name
  sku_name = "S1"
  os_type = "Linux"
  depends_on = [module.PostgreSQL, module.ServiceBus, module.KeyVault]
}

resource "azurerm_application_insights" "gssappservice" {
  name = var.stack
  location = var.resource_location
  resource_group_name = var.TF_VAR_deployment_resource_group_name
  application_type = "web"
  depends_on = [azurerm_service_plan.gssappservice, module.PostgreSQL, module.ServiceBus, module.KeyVault]
}

resource "azurerm_application_insights" "gssapiservice" {
  name = var.API_NAME
  location = var.resource_location
  resource_group_name = var.TF_VAR_deployment_resource_group_name
  application_type = "web"
  depends_on = [azurerm_service_plan.gssappservice, module.PostgreSQL, module.ServiceBus, module.KeyVault]
}

resource "azurerm_linux_web_app" "gssapiservice" {
  name = var.API_NAME
  location = var.resource_location
  resource_group_name = var.TF_VAR_deployment_resource_group_name
  //service_plan_id = "/subscriptions/${var.subscription_id}/resourceGroups/${var.TF_VAR_deployment_resource_group_name}/providers/Microsoft.Web/serverFarms/${var.tf_plan_name}"
  service_plan_id = local.var_service_plan_id
  //service_plan_id = azurerm_service_plan.gssapiservice.id

  site_config {
    application_stack {
      node_version = "12-lts"
    }
  }
  depends_on = [azurerm_application_insights.gssapiservice, module.PostgreSQL, module.ServiceBus, module.KeyVault]
}

resource "azurerm_linux_web_app" "gssappservice" {
  name = var.WEB_NAME
  location = var.resource_location
  resource_group_name = var.TF_VAR_deployment_resource_group_name
  service_plan_id = azurerm_service_plan.gssappservice.id

  site_config {
    application_stack {
      node_version = "12-lts"
    }
  }

  app_settings = {
    APPINSIGHTS_INSTRUMENTATIONKEY = azurerm_application_insights.gssappservice.instrumentation_key
    API_NAME = var.API_NAME
    DB_HOST = var.DB_HOST
    DB_NAME = var.DB_NAME
    DB_USERNAME = module.PostgreSQL.DB_USERNAME
    DB_PASSWORD = data.azurerm_key_vault_secret.secrets["apppw"].value
    DB_SSL = var.DB_SSL
    STACK = var.stack
    JWT_SECRET = data.azurerm_key_vault_secret.secrets["JWT_SECRET"].value
    API_URL = var.API_URL
    SERVICE_BUS_CONNECTION_STRING = module.ServiceBus.SERVICE_BUS_CONNECTION_STRING
    PLANT_RFS_PREDICTION_SB_CONNECTION = module.ServiceBus.PLANT_RFS_PREDICTION_SB_CONNECTION
    GSS_API_CHANGED_TOPIC = var.GSS_API_CHANGED_TOPIC
    SUBSCRIPTION_KEY = var.SUBSCRIPTION_KEY
  }

  depends_on = [azurerm_application_insights.gssappservice, azurerm_service_plan.gssappservice, module.PostgreSQL, module.ServiceBus, module.KeyVault]
}

Additional Context

running Azure pipeline Infrastructure.zip

References

No response

liamcervante commented 6 months ago

Hi @melphleg, thanks for filing this! Firstly, I will say Terraform definitely shouldn't be crashing so that is a bug for us to fix.

However, I'm curious as to what the intention behind the count attribute within the azurerm_service_plan.existing data source is?

data "azurerm_service_plan" "existing" {
  count = can(data.azurerm_service_plan.existing[*].id) != 0 ? 1 : 0
  name = var.tf_plan_name
  resource_group_name = var.TF_VAR_deployment_resource_group_name
}

The can() function returns a boolean, and Terraform doesn't equate values to zero and non-zero in the same way C does. This means that can(...) != 0 always equals true, regardless of what you put into the can function itself.

The reason I say this is that I think the self-reference within that can is the source of the crash, so you might be able to work around this by removing the count attribute entirely, since it currently always evaluates to 1 anyway.
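The point about can() can be illustrated with a small sketch. The local names below are made up for illustration and are not part of the reported configuration; the data source is the one from the issue with the count line removed, as suggested above.

```hcl
locals {
  # can() always yields a bool: true when the expression evaluates
  # without error, false when it produces an error.
  ok  = can(tonumber("42"))    # true
  bad = can(tonumber("hello")) # false

  # A bool is never equal to the number 0, so a comparison of this shape
  # is true in both cases, and a count built on it is always 1.
  ok_not_zero  = local.ok != 0
  bad_not_zero = local.bad != 0
}

# Suggested workaround: drop the self-referencing count entirely.
data "azurerm_service_plan" "existing" {
  name                = var.tf_plan_name
  resource_group_name = var.TF_VAR_deployment_resource_group_name
}
```

Without count, the data source is a single object, so references elsewhere would need to change from data.azurerm_service_plan.existing[0].id to data.azurerm_service_plan.existing.id, and the length(...) checks in the original configuration would no longer apply as written.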

liamcervante commented 6 months ago

Here's a simpler reproduction:

resource "tfcoremock_simple_resource" "resource" {
    count = tfcoremock_simple_resource.resource[*].id != 0 ? 1 : 0
}

Running terraform plan with the above resource crashes with the same stack trace as in the original comment.

liamcervante commented 6 months ago

If you attempt to refer to a resource from within the same resource you normally get the following error:

~/terraform/35038 > terraform plan   

Planning failed. Terraform encountered an error while generating this plan.

╷
│ Error: Self-referential block
│ 
│   on main.tf line 3, in resource "tfcoremock_simple_resource" "resource":
│    3:     id = tfcoremock_simple_resource.resource.id
│ 
│ Configuration for tfcoremock_simple_resource.resource may not refer to itself.
╵
╷
│ Error: Self-referential block
│ 
│   on main.tf line 3, in resource "tfcoremock_simple_resource" "resource":
│    3:     id = tfcoremock_simple_resource.resource.id
│ 
│ Configuration for tfcoremock_simple_resource.resource may not refer to itself.

This doesn't seem to happen for the count and for_each attributes, so I think the correct fix here is to produce the same error for the meta-attributes as we do for the regular attributes. The same restrictions should apply.
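For contrast, here is a minimal sketch of a count expression that avoids the self-reference by depending only on a value defined outside the block. The variable create_resource is hypothetical and only serves the illustration.

```hcl
# Hypothetical toggle, not part of the original reproduction.
variable "create_resource" {
  type    = bool
  default = true
}

resource "tfcoremock_simple_resource" "resource" {
  # The condition refers only to an input variable, never to
  # tfcoremock_simple_resource.resource itself.
  count = var.create_resource ? 1 : 0
}
```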

tegomass commented 6 months ago

> Here's a simpler reproduction:
>
> resource "tfcoremock_simple_resource" "resource" {
>     count = tfcoremock_simple_resource.resource[*].id != 0 ? 1 : 0
> }
>
> Running terraform plan with the above resource crashes with the same stack trace as in the original comment.

Hi, I've just upgraded from v1.7.5 to v1.8.2. I need to perform this kind of check (for resource reusability), but I no longer can. Is there an alternative?

The idea is to use a specific AWS resource if it exists (checked by running an AWS CLI command through an external data source) and to create it if it does not. In my case, a condition like the one you wrote keeps Terraform from creating the resource on the first run, destroying it on the second run, and so on.

resource "aws_appconfig_application" "app" {
  count       = (length(aws_appconfig_application.app[*].id) != 0) || (length(data.external.appconfig_app_id.result.id)) == 0 ? 1 : 0
  ...
}

With 1.8.2, I now face the Self-referential block error. (Which is still better than crashing the plan)

liamcervante commented 6 months ago

Hi @tegomass, apologies for the break in your workflow. I'm afraid to say this is an example of a bug that may have provided some useful functionality but was still not behaving as we'd normally expect Terraform to behave. Basically, in normal operation Terraform should always be reading the "post-plan" or "post-apply" value for a given resource, but with this kind of self-reference (due to the bug allowing it) it was actually reading the value in the reference directly from the existing state. While in your case this may have been helpful, it was inconsistent behaviour and not in line with documented behaviour of Terraform.

One thing I can think of that might help would be to tag your aws_appconfig_application with a "managed-by-terraform" value using the tags argument when it is created by Terraform. Then update your external data source to check whether the application exists and is not marked as "managed-by-terraform", and use that as your sole condition within the count attribute.
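A rough sketch of how the tagging idea might look, under stated assumptions: the script find_unmanaged_app.sh is hypothetical and is assumed to print {"id": "<app id>"} when an application exists that is not tagged "managed-by-terraform", and {"id": ""} otherwise. The application name is a placeholder.

```hcl
# Hypothetical helper script; its name and output contract are assumptions.
data "external" "appconfig_app_id" {
  program = ["bash", "${path.module}/find_unmanaged_app.sh"]
}

resource "aws_appconfig_application" "app" {
  # Sole condition: create the application only when no pre-existing,
  # unmanaged application was found. No self-reference is needed.
  count = data.external.appconfig_app_id.result.id == "" ? 1 : 0

  name = "example-app" # placeholder

  # Mark the application so later runs of the script ignore it.
  tags = {
    "managed-by-terraform" = "true"
  }
}
```

Because the script skips tagged applications, the condition stays the same on later runs, which avoids the create-then-destroy cycle described earlier in the thread.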

Another alternative would be to use the new import block with a for_each argument. This would only work if you were okay with Terraform taking control of the pre-existing object.

import {
  for_each = data.external.appconfig_app_id.result

  id = each.value
  to = aws_appconfig_application.app
}

resource "aws_appconfig_application" "app" {
  id = data.external.appconfig_app_id.result.id
  ...
}

This assumes your external data source returns either an empty map if there is no resource, or a single entry with the id you checked for. If the resource already exists, the import block will execute on the first terraform apply operation. If the resource doesn't exist already, the import block doesn't execute on the first run because the for_each argument is empty. On follow-up operations the import block will never execute because the target resource already exists, and the import block knows not to overwrite things.
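If the external program cannot return a truly empty map and reports a missing resource as an empty string instead, one possible variation is to filter the result first. This is a sketch only, assuming the same data source as above and that the result is known at plan time:

```hcl
import {
  # Keep only non-empty values, so both {} and {"id" = ""} import nothing.
  for_each = {
    for key, value in data.external.appconfig_app_id.result : key => value
    if value != ""
  }

  id = each.value
  to = aws_appconfig_application.app
}
```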

Hopefully either of these work for you, and apologies again for the break in your workflow.

melphleg commented 6 months ago

I abandoned the loop and found an alternative to get my pipeline to work. Thanks for your effort.

Bob

(Sent by email in reply to Liam Cervante's comment of Friday, April 26, 2024 at 06:44:59 AM MDT, shown above.)

tegomass commented 6 months ago

Hi @liamcervante, I appreciate your suggestion!

I was going to add another external data source (I hate doing that because it brings some trickiness) that gets the state of the resource, something like: terraform state show 'aws_appconfig_application.app[0]'

but I think your approach using tags could better address my use case.

Thank you
