hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: Error: The terraform-provider-aws_v5.42.0_x5 plugin crashed! #37289

Open marcellpatonay opened 6 months ago

marcellpatonay commented 6 months ago

Terraform Core Version

1.5.7

AWS Provider Version

5.42.0

Affected Resource(s)

aws_subnets, aws_iam_policy, aws_iam_role, aws_security_group, aws_kms_key, aws_vpc, aws_cloudsearch_domain

Expected Behavior

Expected terraform plan to complete successfully.

Actual Behavior

Terraform failed with the following error:

Error: The terraform-provider-aws_v5.42.0_x5 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

A partial debug log is attached.

Relevant Error/Panic Output Snippet

Stack trace from the terraform-provider-aws_v5.42.0_x5 plugin:

panic: set item just set doesn't exist

goroutine 219 [running]:
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*MapFieldWriter).setSet(0x14002a44bd0, {0x14002fcced0, 0x1, 0x1}, {0x112866ec0, 0x140028f4ab0}, 0x14001317cc0)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/field_writer_map.go:330 +0x720
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*MapFieldWriter).set(0x14002a44bd0, {0x14002fcced0, 0x1, 0x1}, {0x112866ec0, 0x140028f4ab0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/field_writer_map.go:110 +0x120
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*MapFieldWriter).WriteField(0x14002a44bd0, {0x14002fcced0, 0x1, 0x1}, {0x112866ec0, 0x140028f4ab0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/field_writer_map.go:92 +0x388
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*ResourceData).Set(0x14002f58100, {0x110792897, 0xb}, {0x112866ec0, 0x140028f4ab0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/resource_data.go:230 +0x1a0
github.com/hashicorp/terraform-provider-aws/internal/service/cloudsearch.resourceDomainRead({0x11560d2a8, 0x14002f5a810}, 0x14002f58100, {0x1153e1160?, 0x140025bd420?})
        github.com/hashicorp/terraform-provider-aws/internal/service/cloudsearch/domain.go:333 +0x1470
github.com/hashicorp/terraform-provider-aws/internal/provider.New.(*wrappedResource).Read.interceptedHandler[...].func9(0x0?, {0x1153e1160?, 0x140025bd420?})
        github.com/hashicorp/terraform-provider-aws/internal/provider/intercept.go:113 +0x1d4
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0x11560d2a8?, {0x11560d2a8?, 0x14002f40ed0?}, 0xd?, {0x1153e1160?, 0x140025bd420?})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/resource.go:790 +0x64
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).RefreshWithoutUpgrade(0x14001312a80, {0x11560d2a8, 0x14002f40ed0}, 0x140029a9ee0, {0x1153e1160, 0x140025bd420})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/resource.go:1089 +0x430
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadResource(0x140033290c8, {0x11560d2a8?, 0x14002f40de0?}, 0x14002f1c640)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.33.0/helper/schema/grpc_provider.go:667 +0x3e4
github.com/hashicorp/terraform-plugin-mux/tf5muxserver.(*muxServer).ReadResource(0x11560d2e0?, {0x11560d2a8?, 0x14002f40ae0?}, 0x14002f1c640)
        github.com/hashicorp/terraform-plugin-mux@v0.15.0/tf5muxserver/mux_server_ReadResource.go:35 +0x184
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource(0x14000b25ea0, {0x11560d2a8?, 0x14002f40330?}, 0x14002c6b380)
        github.com/hashicorp/terraform-plugin-go@v0.22.0/tfprotov5/tf5server/server.go:775 +0x3c4
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler({0x11518ab20?, 0x14000b25ea0}, {0x11560d2a8, 0x14002f40330}, 0x140029a3e00, 0x0)
        github.com/hashicorp/terraform-plugin-go@v0.22.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:482 +0x164
google.golang.org/grpc.(*Server).processUnaryRPC(0x14001584400, {0x11560d2a8, 0x14002f402a0}, {0x115646138, 0x1400270a1a0}, 0x14002f3cc60, 0x14002689020, 0x11d621cc8, 0x0)
        google.golang.org/grpc@v1.62.0/server.go:1383 +0xb8c
google.golang.org/grpc.(*Server).handleStream(0x14001584400, {0x115646138, 0x1400270a1a0}, 0x14002f3cc60)
        google.golang.org/grpc@v1.62.0/server.go:1794 +0xc70
google.golang.org/grpc.(*Server).serveStreams.func2.1()
        google.golang.org/grpc@v1.62.0/server.go:1027 +0x8c
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 25
        google.golang.org/grpc@v1.62.0/server.go:1038 +0x150

Terraform Configuration Files

Example:

################################################################################
# EKS Module
################################################################################
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.8.4"

  cluster_name                   = var.cluster_name
  cluster_version                = "1.29"
  cluster_endpoint_public_access = true

  enable_cluster_creator_admin_permissions = true

  # Enable EFA support by adding necessary security group rules
  # to the shared node security group
  enable_efa_support = true

  cluster_addons = {
    coredns = {
      most_recent = true
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent = true
    }
  }

  vpc_id                   = data.aws_vpc.aws-vpc.id
  subnet_ids               = data.aws_subnets.k8s_subnets_ids.ids
  control_plane_subnet_ids = data.aws_subnets.k8s_subnets_ids.ids

  # External encryption key
  create_kms_key = false
  cluster_encryption_config = {
    resources        = ["secrets"]
    provider_key_arn = module.kms.key_arn
  }

  self_managed_node_group_defaults = {
    # enable discovery of autoscaling groups by cluster-autoscaler
    autoscaling_group_tags = {
      "k8s.io/cluster-autoscaler/enabled" : true,
      "k8s.io/cluster-autoscaler/${var.cluster_name}" : "owned",
    }
  }

  self_managed_node_groups = {
    # Default node group - as provisioned by the module defaults
    #default_node_group = {}

    # Complete
    default_node_group = {
      name                      = "${var.cluster_name}-node-group"
      use_name_prefix           = true
      wait_for_capacity_timeout = "0"

      subnet_ids = data.aws_subnets.k8s_subnets_ids.ids

      min_size     = 2
      max_size     = 3
      desired_size = 3

      ami_id = "${data.aws_ami.eks_node.id}"

      pre_bootstrap_user_data = <<-EOT
        export FOO=bar
      EOT

      post_bootstrap_user_data = <<-EOT
        echo "you are free little kubelet!"
      EOT

      instance_type = "m6i.large"
      key_name      = var.cluster_name

      launch_template_name            = "${var.cluster_name}-node-lt"
      launch_template_use_name_prefix = true
      launch_template_description     = "node group launch template"

      ebs_optimized     = true
      enable_monitoring = true

      block_device_mappings = {
        xvda = {
          device_name = "/dev/xvda"
          ebs = {
            volume_size           = 20
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 150
            delete_on_termination = true
          }
        }
      }

      metadata_options = {
        http_endpoint               = "enabled"
        http_tokens                 = "required"
        http_put_response_hop_limit = 2
        instance_metadata_tags      = "disabled"
      }

      create_iam_role          = true
      iam_role_name            = "${var.cluster_name}-node-role"
      iam_role_use_name_prefix = false
      iam_role_description     = "node group iam role"
      iam_role_tags = {
        terraform = true
        env       = var.env
        org       = var.org
      }
      iam_role_additional_policies = {
        AmazonEC2ContainerRegistryReadOnly                  = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
        AmazonEKS_CNI_Policy                                = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
        AmazonEC2ContainerRegistryFullAccess                = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess"
        AmazonEKSWorkerNodePolicy                           = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
        AmazonSSMManagedInstanceCore                        = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
        EC2InstanceProfileForImageBuilderECRContainerBuilds = "arn:aws:iam::aws:policy/EC2InstanceProfileForImageBuilderECRContainerBuilds"
        AmazonEBSCSIDriverPolicy                            = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
        EKSAutoScalingPolicy                                = "${module.iam_policy.arn}"
        NodeWorkerPolicy                                    = "${module.iam_eks_policy.arn}"
      }

      tags = {
        terraform = true
        env       = var.env
        org       = var.org
      }
    }

  }

  tags = {
    terraform = true
    env       = var.env
    org       = var.org
  }
}

Please note that the configuration applies successfully when run against an empty state.

Steps to Reproduce

This would be hard to reproduce: when the configuration is run against an empty state, the issue described above does not appear.

Debug Output

debug.log

Panic Output

No response

Important Factoids

State is managed by GitLab; apart from that, it is plain Terraform. The issue happens both locally and in GitLab pipelines. Local environment: ARM Macs. GitLab pipelines: saas-linux-small-amd64.

References

Similar issues:

  • https://github.com/hashicorp/terraform-provider-aws/issues/36588
  • https://github.com/hashicorp/terraform-provider-aws/issues/32212

Would you like to implement a fix?

None

aristosvo commented 6 months ago

Hi @marcellpatonay!

Can you share a cleaned-up version of your aws_cloudsearch_domain config? The crash seems to be related to that resource only.

marcellpatonay commented 6 months ago

Hi @aristosvo, here's the config:

resource "aws_cloudsearch_domain" "domain" {
  name = var.domain_name

  scaling_parameters {
    desired_instance_type = var.instance_type
  }

  endpoint_options {
    enforce_https = var.enforce_https
  }

  dynamic "index_field" {
    for_each = var.indexes
    content {
      name            = index_field.value["name"]
      type            = index_field.value["type"]
      search          = index_field.value["search"]
      return          = index_field.value["return"]
      sort            = index_field.value["sort"]
      highlight       = index_field.value["highlight"]
      analysis_scheme = index_field.value["analysis_scheme"]
    }
  }
}

variable "enforce_https" {
  description = "Whether to enforce HTTPS on the domain"
  type        = bool
  default     = false
}

variable "domain_name" {
  description = "The CloudSearch domain name"
  type        = string
}

variable "instance_type" {
  description = "Size of the instance to use for the CloudSearch domain"
  type        = string
  default     = "search.large"
}

module "cloudsearch" {

  source  = "gitlab.com/cardmarket/terraform-modules/aws//cloudsearch"
  version = "1.0.1-1-beta"

  domain_name   = "product-dev"
  instance_type = "search.small"
  enforce_https = "false"
  multi_az      = "false"

  indexes = [
    {
      name            = "title"
      type            = "text"
      search          = true
      return          = true
      sort            = false
      highlight       = false
      analysis_scheme = "_en_default_"
    },
  ]

}
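
The module call above passes indexes (and multi_az), but the variable declarations shared in this comment do not include them. For anyone trying to reproduce the crash, a minimal sketch of what the indexes declaration could look like follows. The attribute names are taken from the index_field block above; the types, description, and default are assumptions and not the module's actual source.

```hcl
# Hypothetical declaration: the real module's variable is not shown in this thread.
# Attribute names mirror the keys read in the dynamic "index_field" block.
variable "indexes" {
  description = "Index fields to create on the CloudSearch domain"
  type = list(object({
    name            = string
    type            = string
    search          = bool
    return          = bool
    sort            = bool
    highlight       = bool
    analysis_scheme = string
  }))
  default = []
}
```

In the stack trace, ResourceData.Set is called with an 11-character key (0xb) and goes through setSet. That is consistent with index_field being the attribute that fails to write, although the trace alone does not prove it.
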

aristosvo commented 6 months ago

@marcellpatonay I cannot really locate the issue without extra information, I'm afraid. I cannot replicate the issue in a test.

Has there been any external interaction with the indexes on the CloudSearch domain resource, or with any service that might have interacted with it? Was the resource imported, by any chance?

marcellpatonay commented 6 months ago

@aristosvo We did some further testing.

  1. Copied the state to my local machine and ran operations against that copy.

    • Removed everything related to CloudSearch with terraform state rm.
    • plan succeeded immediately afterwards.
  2. The issue also resolved itself somehow. Today, about an hour ago, we noticed that our GitLab pipelines were passing.

    • Note that we haven't made a single change to the GitLab-managed state.
    • We are still looking into what exactly could have caused this.
    • Could this be something on the AWS API side?

And to answer your questions:

Really appreciate the help! Thank you!