hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0
9.82k stars 9.17k forks source link

Adding description to a rule in security group recreates the rule #6274

Open shivamdixit opened 6 years ago

shivamdixit commented 6 years ago

Community Note

Terraform Version

Terraform: v0.11.7 AWS provider: v1.41

Affected Resource(s)

Terraform Configuration Files

resource "aws_security_group" "test_resource" {
  name        = "foo"
  description = "bar"
  vpc_id      = "${var.vpc_id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Plan Output

~ module.test.aws_security_group.test_resource
      ingress.2541437006.cidr_blocks.#:      "1" => "0"
      ingress.2541437006.cidr_blocks.0:      "0.0.0.0/0" => ""
      ingress.2541437006.description:        "" => ""
      ingress.2541437006.from_port:          "22" => "0"
      ingress.2541437006.ipv6_cidr_blocks.#: "0" => "0"
      ingress.2541437006.prefix_list_ids.#:  "0" => "0"
      ingress.2541437006.protocol:           "tcp" => ""
      ingress.2541437006.security_groups.#:  "0" => "0"
      ingress.2541437006.self:               "false" => "false"
      ingress.2541437006.to_port:            "22" => "0"
      ingress.47949487.cidr_blocks.#:        "0" => "1"
      ingress.47949487.cidr_blocks.0:        "" => "0.0.0.0/0"
      ingress.47949487.description:          "" => "Testing description"
      ingress.47949487.from_port:            "" => "22"
      ingress.47949487.ipv6_cidr_blocks.#:   "0" => "0"
      ingress.47949487.prefix_list_ids.#:    "0" => "0"
      ingress.4794948     tocol:             "" => "tcp"
      ingress.47949487.security_groups.#:    "0" => "0"
      ingress.47949487.self:                 "" => "false"
      ingress.47949487.to_port:              "" => "22"

Expected Behavior

If a description is added to a rule, terraform should update ONLY description in the rule using the update description API.

For resource aws_security_group_rule, adding description to a rule which didn't have one before works as expected.

Actual Behavior

Terraform is removing the rule (setting all the values to null), and then re-creating it. This could result in a potential outage as removing the rule and adding it again is non-atomic and might result in packet loss in that duration.

Steps to Reproduce

  1. Create a security group resource with at-least one ingress/egree rule without any description
resource "aws_security_group" "test_resource" {
  name        = "foo"
  description = "bar"
  vpc_id      = "${var.vpc_id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    ### no description is present
  }
}
  1. Run terraform apply
  2. Add description in ingress/egress rule. The final resource will look like:

    resource "aws_security_group" "test_resource" {
    name        = "foo"
    description = "bar"
    vpc_id      = "${var.vpc_id}"
    
    ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    description = "Testing description"
    }
    }
  3. Run terraform apply again, it will try to delete and add rule.

References

This feature was introduced in the following PR: #1587 Feature request: https://github.com/terraform-providers/terraform-provider-aws/issues/1554

shivamdixit commented 6 years ago

I did some debugging. Here are my findings:

  1. From my understanding, the diff in the plan output is being generated by Terraform core and since the ingress/egress block is of type set, changing the description is changing the hashcode of the set and it is treating both sets as completely different.

  2. The way update is being handled in aws_security_group, it is assuming that whenever any of the attributes in the ingress/egress blocks are modified, it means destroying and re-creating the rule itself. Reference: https://github.com/terraform-providers/terraform-provider-aws/blob/cf3da6e100ec38565d653b6ee90322bba2b7da5a/aws/resource_aws_security_group.go#L686-L785 I guess this assumption was true before description was introduced for ingress/egress, but I believe it no longer holds true.

  3. In resource aws_security_group_rule, the only thing that can be updated is description, therefore it is being appropriately handled. Reference: https://github.com/terraform-providers/terraform-provider-aws/blob/daf0e91aac4d782ce3896c2a69a9484465afd8f7/aws/resource_aws_security_group_rule.go#L821-L867

Suggested Alternatives

I believe introducing an if/else block and comparing the old and new sets if only description is changed (in func resourceAwsSecurityGroupUpdate) can partially fix the behaviour. However, I don't think it will change the output of plan, which is ugly.

I am not very familiar with Terraform, however, having a quick glimpse pointed me to CustomizeDiff option. Can we introduce this to modify the diff and leave it only with description, if only description has been added?

bflad commented 6 years ago

CustomizeDiff can have some unexpected behaviors (e.g. it ignores computed attribute values) so we generally discourage its usage. We should probably add that as a note to the top of that documentation section. After Terraform 0.12 settles down, some of its functionality might be replaced/improved with some better mechanisms to accomplish its original goals, but I'm not sure if those have even been thought about too far yet.

Generally speaking though, this is a larger problem with schema.TypeSet attributes being used for "configuration blocks" in the HCL where something more along the lines of a schema.TypeMap would actually be a better implementation, both from the difference perspective and situations like this (e.g. updating individual child attributes via separate API calls). There might be some eventual movement away from schema.TypeSet for this purpose or some migration process at some point in the future, but it will be a large effort for sure.

So apologies for not having a great suggestion for how to this move forward off the top of my head, but it seemed necessary to provide some context.

bflad commented 6 years ago

If I had to venture a guess, a potential solution might be to exclude description from the schema.TypeSet hash. That will at least make the difference show up as only on the description instead of the whole rule.

tmccombs commented 5 years ago

This problem is more general than just the description. For example adding a new cidr block to cidr_blocks does the same thing. I'm not sure what a good solution for this is. Removing "description" from the hash doesn't solve the cidr_block problem. But if you only use fields that can't be changed (to_port, from_port, protocol). Then you can't have multiple blocks for the same port(s) and protocol with different descriptions.

And as far as using TypeMap, I'm not sure how that would solve this problem. What would the key be?

tmccombs commented 5 years ago

Given that the ingress/egress blocks don't have any computed values, maybe this is an appropriate place to use CustomizeDiff?

tmccombs commented 5 years ago

Thinking about this some more, I do think that the description should be removed from the hash function, since it can be updated independently of the rest of the rule.

I also thing that the real underlying problem with the diff is that the terraform config doesn't map very well onto the AWS state. In AWS a rule is really a single tuple of from_port, to_port, protocol, and external identifier (cidr_block, security_group_id, etc.). The API allows grouping these rules together, but afaict there is no inherent grouping in AWS's state and operations can be done on these individual resources. I do agree that the current config format is more convenient than having to specify a seperate ingress/egress block for each cidr block. But I think ideally the config should be expanded into individual rules before the diff occurs. But I don't know if that is possible with the current terraform.

github-actions[bot] commented 3 years ago

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

tmccombs commented 3 years ago

This is still an issue and because of this I exclusively use inline security group rules rather than this resource type.