hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Bug]: some VPC resources leaking during creation #39741

Open woranhun opened 11 hours ago

woranhun commented 11 hours ago

Terraform Core Version

v1.9.7

AWS Provider Version

5.72.0

Affected Resource(s)

Expected Behavior

If a resource is created on AWS side, then it should not be leaked.

Actual Behavior

This can happen because the create function returns early when conn.CreateSubnet reports an error, so the subnet ID is never saved. https://github.com/hashicorp/terraform-provider-aws/blob/12339906a1803312219d9bfd257c025aaad05513/internal/service/ec2/vpc_subnet.go#L197

Relevant Error/Panic Output Snippet

    output, err := conn.CreateSubnet(ctx, input)

    if err != nil {
      return sdkdiag.AppendErrorf(diags, "creating EC2 Subnet: %s", err)
    }

    d.SetId(aws.ToString(output.Subnet.SubnetId))

Terraform Configuration Files

    resource "aws_subnet" "main" {
      vpc_id     = "vpc-01ced180f39a4a9d2"
      cidr_block = "10.0.1.0/24"

      tags = {
        Name = "Main"
      }
    }

Steps to Reproduce

Debug Output

No response

Panic Output

No response

Important Factoids

We have only seen this issue with the 4 resources I mentioned before, probably because those are the ones we create in large quantities. However, I think other resources might be affected as well.

References

Relates to: https://github.com/crossplane-contrib/provider-upjet-aws/issues/1482
Relates to: https://github.com/hashicorp/terraform-provider-aws/issues/38251

Would you like to implement a fix?

Yes

alexbacchin commented 9 hours ago

@woranhun This seems to me to be normal behavior for the EC2 API. https://docs.aws.amazon.com/ec2/latest/devguide/eventual-consistency.html

woranhun commented 8 hours ago

@alexbacchin Can you elaborate on why you think this is normal behavior? I think it is the same issue that happened before with RDS (https://github.com/hashicorp/terraform-provider-aws/issues/38251). This behavior also causes issues when managing AWS resources from Crossplane (https://github.com/crossplane-contrib/provider-upjet-aws/issues/1482).

alexbacchin commented 8 hours ago

@woranhun When you press CTRL+C in Terraform after the AWS API has already received the request, the AWS control plane still executes the action, but the Terraform state has no record of the resource having been created. Terraform therefore tries to create it again on the next apply, and because a subnet with the same CIDR already exists, you get an error.

woranhun commented 7 hours ago

@alexbacchin Yes, but if a resource creation was triggered from TF, then TF should be aware that the resource exists, because TF created it itself.

  output, err := conn.CreateSubnet(ctx, input)

  if err != nil {
    return sdkdiag.AppendErrorf(diags, "creating EC2 Subnet: %s", err)
  }

  d.SetId(aws.ToString(output.Subnet.SubnetId))

For example, in this case: if the resource was created in AWS and err is not nil (for whatever reason), then the SubnetId is lost forever because of the early return.

My plan is to move the d.SetId(aws.ToString(output.Subnet.SubnetId)) line above the error check.