Provider produced inconsistent final plan `aws_s3_bucket_object`

pradeep-repaka-mf commented 3 years ago

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform 0.13.5, Terragrunt 0.25.5, AWS provider: latest (also tested with v3.24.0).

Affected Resource(s)

aws_s3_bucket_object

Terraform Configuration Files

Example Terraform code

terraform {
  required_version = ">= 0.13"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">=3.15.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

resource "aws_security_group" "cluster_security_group" {
  name        = "cluster-sg-connectivity"
  vpc_id      = var.vpc_id
  tags = {
    Name   = "cluster-sg-connectivity",
    source = "terraform"
  }
}

module "primary_subcluster_node" {
  source = "git::https://github.com/gruntwork-io/terraform-aws-server.git//modules/single-server?ref=v0.9.4"

  count                         = var.primary_subcluster_node_count
  name                          = "primary_subcluster_node"
  iam_role_name                 = "primary-role-${count.index}"
  instance_type                 = "c5d.4xlarge"
  ami                           = <replace_ami>
  keypair_name                  = var.instance_key_pair_name
  user_data_base64              = data.template_cloudinit_config.userdata.rendered
  root_volume_type              = "standard"
  root_volume_size              = 30
  vpc_id                        = var.vpc_id
  subnet_id                     = var.subnet_id
  additional_security_group_ids = [aws_security_group.cluster_security_group.id]
  allow_ssh_from_cidr_list      = []
  allow_all_outbound_traffic    = false
  attach_eip                    = false
  tags = {
    Name   = "${local.primary_subcluster_node_name}-${count.index}"
    source = "terraform"
  }
}

module "secondary_subcluster_node" {
  source = "git::https://github.com/gruntwork-io/terraform-aws-server.git//modules/single-server?ref=v0.9.4"

  count                         = var.secondary_subcluster_node_count
  name                          = "secondary_subcluster_node"
  iam_role_name                 = "secondary-role-${count.index}"
  instance_type                 = "c5d.4xlarge"
  ami                           = <replace_ami>
  keypair_name                  = var.instance_key_pair_name
  root_volume_type              = "standard"
  root_volume_size              = 30
  vpc_id                        = var.vpc_id
  subnet_id                     = var.subnet_id
  additional_security_group_ids = [aws_security_group.cluster_security_group.id]
  allow_ssh_from_cidr_list      = []
  allow_all_outbound_traffic    = false
  attach_eip                    = false
  user_data_base64              = data.template_cloudinit_config.userdata.rendered
  tags = {
    Name   = "${local.secondary_subcluster_node_name}-${count.index}",
    source = "terraform"
  }
}

resource "aws_s3_bucket_object" "s3_default_subcluster_instance_ip_list" {
  bucket       = var.s3_bucket_name
  acl          = "private"
  key          = "cluster/config/default_subcluster_instance_ip_list"
  content_type = "text/plain"
  content      = join("\n", [for v in module.primary_subcluster_node : v.private_ip], [""])
}

resource "aws_s3_bucket_object" "s3_analytics_subcluster_instance_ip_list" {
  bucket       = var.s3_bucket_name
  acl          = "private"
  key          = "cluster/config/default_subcluster_instance_ip_list"
  content_type = "text/plain"
  content      = join("\n", [for v in module.secondary_subcluster_node : v.private_ip], [""])
}

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

terraform {
  source = "<our-repo>"
}

include {
  path = find_in_parent_folders("")
}

inputs = {
  aws_region                      = "us-east-2"
  primary_subcluster_node_count   = 3
  secondary_subcluster_node_count = 0
  instance_key_pair_name          = <required_keypair>
  vpc_id                          = <required_vpc_id>
  subnet_id                       = <required_subnet_id>
  s3_bucket_name                  = <require_s3_bucket_name>
}

Panic Output

aws_s3_bucket_object.s3_secondary_subcluster_instance_ip_list to include new values learned so far during apply, provider "registry.terraform.io/hashicorp/aws" produced an invalid new value for .version_id: was known, but now unknown.

This is a bug in the provider, which should be reported in the provider's own issue tracker.

Expected Behavior

When 'secondary_subcluster_node_count' is changed from 0 to a non-zero value and the configuration is re-applied, the private IP addresses of the newly created secondary subcluster nodes should be written to the "cluster/config/secondary_subcluster_instance_ip_list" file.
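For reference, a sketch of the second object resource as this description implies it. The resource name is taken from the error output and the key from the sentence above. Both are assumptions about the intended configuration, since the configuration as posted names the resource s3_analytics_subcluster_instance_ip_list and reuses the default subcluster key.

```hcl
# Sketch only: name from the error output, key from the expected-behavior text.
# The configuration posted in this issue differs on both points.
resource "aws_s3_bucket_object" "s3_secondary_subcluster_instance_ip_list" {
  bucket       = var.s3_bucket_name
  acl          = "private"
  key          = "cluster/config/secondary_subcluster_instance_ip_list"
  content_type = "text/plain"

  # One private IP per line, with a trailing newline, same as the primary list.
  content = join("\n", [for v in module.secondary_subcluster_node : v.private_ip], [""])
}
```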

Actual Behavior

Instead of writing the new nodes' private IP addresses to the "cluster/config/secondary_subcluster_instance_ip_list" file in the S3 bucket, the provider throws an error when we re-apply after changing the 'secondary_subcluster_node_count' value.

Steps to Reproduce

Run terragrunt apply with the provided values.
On the first apply, the primary subcluster nodes are deployed according to 'primary_subcluster_node_count'; no secondary subcluster nodes are created because 'secondary_subcluster_node_count' is 0.
After the primary subcluster nodes are created, change 'secondary_subcluster_node_count' from 0 to a non-zero value.
Run terragrunt apply again; it fails with the error shown under Panic Output.
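The change between the two applies can be sketched as the following Terragrunt inputs. Only 'secondary_subcluster_node_count' differs from the first apply. The value 2 is an arbitrary non-zero example and does not come from the original report, and the quoted placeholders stand in for the values that were withheld.

```hcl
# terragrunt.hcl inputs for the second apply (sketch).
inputs = {
  aws_region                      = "us-east-2"
  primary_subcluster_node_count   = 3
  secondary_subcluster_node_count = 2 # was 0 on the first apply; 2 is an assumed example value
  instance_key_pair_name          = "<required_keypair>"
  vpc_id                          = "<required_vpc_id>"
  subnet_id                       = "<required_subnet_id>"
  s3_bucket_name                  = "<require_s3_bucket_name>"
}
```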

bflad commented 3 years ago

Hi @pradeep-repaka-mf 👋 Thank you for raising this and sorry you ran into trouble here.

The "Provider produced inconsistent final plan" type of error seems to indicate a potential bug in either Terraform CLI or the aws_s3_bucket_object resource, but we will need help reproducing the issue. Ideally, this would be a self-contained configuration we can run without needing special access or inventing details. If that is not possible, can you please provide at least the module configuration or the outputs of that module?

If possible, you may also want to see if upgrading to Terraform CLI version 0.14.7 (latest as of this writing) has the same issue since there were some operation graph changes that occurred between Terraform CLI 0.13 and 0.14.

Jorge-Rodriguez commented 3 years ago

I'm experiencing this too. It looks like a regression of #14900, and it didn't show up with provider version 3.16.0.

bflad commented 3 years ago

Hi @Jorge-Rodriguez can you please provide a self-contained configuration that reproduces the issue?

Jorge-Rodriguez commented 3 years ago

@bflad I'll try, but it might be hard to do so. We haven't been able to reproduce the issue consistently; we've only seen it when running Terraform via GitHub Actions.

pradeep-repaka-mf commented 3 years ago

@bflad I already provided an example in the bug description. I have not given the ami, vpc_id and subnet_id values. Do you want me to provide those values too?

bflad commented 3 years ago

@pradeep-repaka-mf the referenced Terraform Module appears to be private, at least to my account.

$ terraform init
Initializing modules...
Downloading git::https://github.com/gruntwork-io/terraform-aws-server.git?ref=v0.9.4 for primary_subcluster_node...
Downloading git::https://github.com/gruntwork-io/terraform-aws-server.git?ref=v0.9.4 for secondary_subcluster_node...

Error: Failed to download module

Could not download module "primary_subcluster_node" (main.tf:24) source code
from
"git::https://github.com/gruntwork-io/terraform-aws-server.git?ref=v0.9.4":
error downloading
'https://github.com/gruntwork-io/terraform-aws-server.git?ref=v0.9.4':
/usr/local/bin/git exited with 128: Cloning into
'.terraform/modules/primary_subcluster_node'...
ERROR: Repository not found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Error: Failed to download module

Could not download module "secondary_subcluster_node" (main.tf:48) source code
from
"git::https://github.com/gruntwork-io/terraform-aws-server.git?ref=v0.9.4":
error downloading
'https://github.com/gruntwork-io/terraform-aws-server.git?ref=v0.9.4':
/usr/local/bin/git exited with 128: Cloning into
'.terraform/modules/secondary_subcluster_node'...
ERROR: Repository not found.
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

willhughes-au commented 3 years ago

I'm also encountering this error, but with a different resource - so I'm not certain if it's the same issue.

I can't reproduce it reliably, but it has started popping up a few times over the last week in our automated deployments across a few workspaces/environments.

Context, if it helps:

Our workflow uses TeamCity to run a tf plan, save that plan to disk, and then immediately run a tf apply of the saved plan (same script and build process; we don't exit in between).

The Lambda function referenced gets its source_code_hash from the body of an S3 object.

I have other aws_lambda_function resources in this configuration which are fine; only the two Lambda functions that have an associated aws_lambda_function_event_invoke_config generate these errors.

data "aws_s3_bucket_object" "xxx" {
  bucket = local.lambda_s3_bucket
  key    = "path/to/lambda/xxx.base64sha256"
}

resource "aws_lambda_function" "xxx" {
  count = local.lambda_functions_deployed ? 1 : 0 

  source_code_hash = data.aws_s3_bucket_object.xxx.body

  // rest of the properties omitted
}

resource "aws_lambda_function_event_invoke_config" "xxx" {
  count = local.lambda_functions_deployed ? 1 : 0 

  function_name          = aws_lambda_function.xxx[0].function_name
  qualifier              = aws_lambda_function.xxx[0].version
  maximum_retry_attempts = 1
}

The error this time was:

Error: Provider produced inconsistent final plan

When expanding the plan for
aws_lambda_function_event_invoke_config.xxx[0] to include new values
learned so far during apply, provider "registry.terraform.io/hashicorp/aws"
produced an invalid new value for .qualifier: was cty.StringVal("66"), but now
cty.StringVal("67").

This is a bug in the provider, which should be reported in the provider's own
issue tracker.

Error: Provider produced inconsistent final plan

When expanding the plan for
aws_lambda_function_event_invoke_config.yyy[0] to include new
values learned so far during apply, provider
"registry.terraform.io/hashicorp/aws" produced an invalid new value for
.qualifier: was cty.StringVal("66"), but now cty.StringVal("67").

This is a bug in the provider, which should be reported in the provider's own
issue tracker.

Terraform version 0.13.5, AWS provider version v3.37.0.

I'm not sure if it's relevant, but the qualifier in the error has been different each time:

2021-06-09 16:09 UTC+10: .qualifier: was cty.StringVal("56"), but now cty.StringVal("57").
2021-06-10 18:20 UTC+10: .qualifier: was cty.StringVal("59"), but now cty.StringVal("60").
2021-06-11 17:56 UTC+10: .qualifier: was cty.StringVal("62"), but now cty.StringVal("63").
2021-06-15 17:05 UTC+10: .qualifier: was cty.StringVal("66"), but now cty.StringVal("67").

miff2000 commented 1 year ago

I'm seeing this too in v5.7.0 of the provider and Terraform v1.2.8. Going to disable the source_hash until there's a workaround.
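A minimal sketch of what disabling source_hash might look like, assuming an aws_s3_object resource that uploads a local file. The resource name, bucket, key, and file path are hypothetical and not taken from this thread.

```hcl
# Hypothetical example: every name and path below is a placeholder.
resource "aws_s3_object" "example" {
  bucket = "example-bucket"
  key    = "path/to/object.txt"
  source = "${path.module}/files/object.txt"

  # Disabled until there is a workaround for the inconsistent final plan error.
  # source_hash = filemd5("${path.module}/files/object.txt")
}
```

With source_hash commented out and no etag set, Terraform has no content hash to compare, so changes to the local file alone would not trigger an update of the object.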
