hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

MSK cluster is destroyed when updating the resource #25196

Open milalima opened 2 years ago

milalima commented 2 years ago

Terraform CLI and Terraform AWS Provider Version

Affected Resource(s)

Terraform Configuration Files


resource "aws_msk_cluster" "brooklyn_msk_dev" {
  cluster_name           = "brooklyn-msk-dev"
  kafka_version          = "2.6.2"
  number_of_broker_nodes = 2

  broker_node_group_info {
    instance_type   = "kafka.m5.large"
    ebs_volume_size = 1000
    client_subnets = [
      "subnet-02bad39cf9d2bc5fc",
      "subnet-0ae1670fde2c23056"
    ]
    security_groups = [
      "sg-04400140f7e14a3a1"
    ]
  }

  encryption_info {
    encryption_in_transit {
      client_broker = "TLS_PLAINTEXT"
    }
  }

}

Debug Output

Expected Behavior

Only the security groups are updated, in place, without replacing the cluster:

~ security_groups = [ # forces replacement
    + "sg-01e1d963d29f704d4",
    + "sg-04400140f7e14a3a1",
    - "sg-0de7d1ac25d14f1f3",
  ]

Actual Behavior


-/+ resource "aws_msk_cluster" "brooklyn_msk_dev" {
      ~ arn                          = "arn:aws:kafka:us-east-1:559426459479:cluster/brooklyn-msk-dev/15d82649-9a3b-4594-861b-ac96b1b8148b-22" -> (known after apply)
      ~ bootstrap_brokers            = "b-1.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:9092,b-2.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:9092" -> (known after apply)
      + bootstrap_brokers_sasl_iam   = (known after apply)
      + bootstrap_brokers_sasl_scram = (known after apply)
      ~ bootstrap_brokers_tls        = "b-1.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:9094,b-2.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:9094" -> (known after apply)
      ~ current_version              = "K2EUQ1WTGCTBG2" -> (known after apply)
      ~ id                           = "arn:aws:kafka:us-east-1:559426459479:cluster/brooklyn-msk-dev/15d82649-9a3b-4594-861b-ac96b1b8148b-22" -> (known after apply)
      - tags                         = {} -> null
      ~ tags_all                     = {} -> (known after apply)
      ~ zookeeper_connect_string     = "z-1.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:2181,z-2.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:2181,z-3.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:2181" -> (known after apply)
      ~ zookeeper_connect_string_tls = "z-1.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:2182,z-2.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:2182,z-3.brooklyn-msk-dev.6cml9z.c22.kafka.us-east-1.amazonaws.com:2182" -> (known after apply)
        # (4 unchanged attributes hidden)
      ~ broker_node_group_info {
          ~ security_groups = [ # forces replacement
              + "sg-01e1d963d29f704d4",
              + "sg-04400140f7e14a3a1",
              - "sg-0de7d1ac25d14f1f3",
            ]
            # (4 unchanged attributes hidden)
        }
      - configuration_info {
          - revision = 0 -> null
        }
      ~ encryption_info {
          ~ encryption_at_rest_kms_key_arn = "arn:aws:kms:us-east-1:559426459479:key/2d02708f-c929-4bb8-9da1-be9693161ff1" -> (known after apply)
            # (1 unchanged block hidden)
        }
      - open_monitoring {
          - prometheus {
              - jmx_exporter {
                  - enabled_in_broker = false -> null
                }
              - node_exporter {
                  - enabled_in_broker = false -> null
                }
            }
        }
    }

Terraform deletes the whole cluster and creates a new one. This affects the development team because the broker hostnames change as well.

Steps to Reproduce

Make any change to the cluster configuration (here, a change to security_groups).

CloudTrail shows the DeleteCluster API being called. The user agent for that operation was "APN/1.0 HashiCorp/1.0 Terraform/1.1.4 (+https://www.terraform.io) terraform-provider-aws/dev (+https://registry.terraform.io/providers/hashicorp/aws) aws-sdk-go/1.43.40 (go1.17.6; linux; amd64)". So it seems that Terraform deleted and recreated the MSK cluster.

What can I do to update the cluster in place instead of deleting it?

justinretzolk commented 2 years ago

Hey @milalima 👋 Thank you for taking the time to raise this! In this case, it looks like it's the change in broker_node_group_info.security_groups that is causing the replacement of the resource. There are certain arguments that, when changed, require recreating resources. Usually this is due to an API limitation where that part of the configuration can't be modified without recreating the resource entirely. For example, in this case, there are functions in the AWS Go SDK to update the broker count, storage, and type, but no function to update the broker security group configuration.

With that in mind, in this case, the only way to prevent recreating the resource would be to not modify the broker_node_group_info.security_groups configuration. I realize this may not be the answer you were looking for, but I hope that some of this information helps.
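One way to enforce "do not modify broker_node_group_info.security_groups" in the configuration itself is Terraform's lifecycle meta-argument. The following is a minimal sketch, not a provider fix: it reuses the configuration from the report and adds a lifecycle block. It assumes the security groups will then be managed outside of Terraform, because ignore_changes makes Terraform disregard any later edit to that argument.

```hcl
resource "aws_msk_cluster" "brooklyn_msk_dev" {
  cluster_name           = "brooklyn-msk-dev"
  kafka_version          = "2.6.2"
  number_of_broker_nodes = 2

  broker_node_group_info {
    instance_type   = "kafka.m5.large"
    ebs_volume_size = 1000
    client_subnets = [
      "subnet-02bad39cf9d2bc5fc",
      "subnet-0ae1670fde2c23056"
    ]
    security_groups = [
      "sg-04400140f7e14a3a1"
    ]
  }

  encryption_info {
    encryption_in_transit {
      client_broker = "TLS_PLAINTEXT"
    }
  }

  lifecycle {
    # Make any plan that would destroy (or replace) the cluster fail with an error.
    prevent_destroy = true

    # Ignore differences in the broker security groups, so editing them
    # in this file no longer produces a "forces replacement" diff.
    ignore_changes = [
      broker_node_group_info[0].security_groups,
    ]
  }
}
```

With prevent_destroy alone, a security group change still produces a replacement plan, but Terraform refuses to apply it. With ignore_changes, the change is dropped from the plan entirely, so the new security groups are not applied by Terraform at all.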

quinnmEG commented 2 years ago

The manual procedure for modifying the security groups that an MSK cluster belongs to is documented, but it is very roundabout. It's possible to do this manually by locating the attached ENIs used by the cluster brokers, and then assigning the desired security groups using EC2 management commands.

Knowing this, I wonder if it's possible to add support for changing MSK cluster security groups to the Terraform provider so these changes can be automated without requiring recreation of the cluster.

github-actions[bot] commented 1 month ago

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

arthur-leclerc commented 1 month ago

As explained in the docs, this is not a permanent change: new brokers will still use the original security group.

If you change the security group that is associated with the brokers of a cluster, and then add new brokers to that cluster, Amazon MSK associates the new brokers with the original security group that was associated with the cluster when the cluster was created. However, for a cluster to work correctly, all of its brokers must be associated with the same security group. Therefore, if you add new brokers after changing the security group, you must follow the previous procedure again and update the ENIs of the new brokers.