hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0
9.8k stars 9.15k forks

[Enhancement]: Extend 'aws_msk_cluster' by adding flag for turning on AWS PrivateLink feature (msk multi-vpc) #34419

Open malamin opened 11 months ago

malamin commented 11 months ago

Description

After creating a provisioned MSK cluster with the 'aws_msk_cluster' resource, it is not possible to apply a cluster policy with Terraform, because there is no option for enabling AWS PrivateLink (MSK multi-VPC private connectivity) and it is turned off by default. It has to be turned on manually before the cluster policy can be applied, and doing so causes no drift in the 'aws_msk_cluster' resource afterwards.

resource "aws_security_group" "msk_sg" {
  name        = "${local.cluster_name}-sg"
  description = "MSK cluster security group"
  vpc_id      = var.vpc_id
}

resource "aws_security_group_rule" "sg_ingress" {
  for_each          = {for account in var.client_accounts : account.account_id => account}
  type              = "ingress"
  from_port         = 9094
  to_port           = 9094
  protocol          = "tcp"
  cidr_blocks       = [each.value.cidr_block]
  security_group_id = aws_security_group.msk_sg.id
}

resource "aws_security_group_rule" "sg_egress" {
  for_each          = {for account in var.client_accounts : account.account_id => account}
  type              = "egress"
  from_port         = 9094
  to_port           = 9094
  protocol          = "tcp"
  cidr_blocks       = [each.value.cidr_block]
  security_group_id = aws_security_group.msk_sg.id
}

resource "aws_cloudwatch_log_group" "log_group" {
  name              = "/aws/msk/${local.cluster_name}"
  retention_in_days = var.cloud_watch_retention_days
  tags              = {
    "Name" = local.cluster_name
  }
}

resource "aws_msk_cluster" "msk_cluster" {
  cluster_name           = local.cluster_name
  kafka_version          = var.kafka_version
  number_of_broker_nodes = local.number_of_broker_nodes

  broker_node_group_info {
    instance_type   = var.instance_type
    client_subnets  = var.private_subnet_ids
    security_groups = [aws_security_group.msk_sg.id]
    storage_info {
      ebs_storage_info {
        volume_size = var.volume_size
      }
    }
  }

  encryption_info {
    encryption_at_rest_kms_key_arn = var.kms_msk_arn
  }

  logging_info {
    broker_logs {
      cloudwatch_logs {
        enabled   = true
        log_group = aws_cloudwatch_log_group.log_group.name
      }
    }
  }

  client_authentication {
    unauthenticated = false
    sasl {
      iam   = true
      scram = false
    }
  }
}

resource "aws_msk_cluster_policy" "msk_cluster_policy" {
  count       = length(var.client_accounts) > 0 ? 1 : 0
  cluster_arn = aws_msk_cluster.msk_cluster.arn
  policy      = data.aws_iam_policy_document.cluster_iam_policy_document.json
}

data "aws_iam_policy_document" "cluster_iam_policy_document" {
  statement {
    effect = "Allow"
    principals {
      type        = "AWS"
      identifiers = [for client_account in var.client_accounts : "arn:aws:iam::${client_account.account_id}:root"]
    }
    actions = [
      "kafka:CreateVpcConnection",
      "kafka:GetBootstrapBrokers",
      "kafka:DescribeCluster",
      "kafka:DescribeClusterV2",
      "kafka-cluster:Connect",
      "kafka-cluster:DescribeTopic",
    ]
    resources = [
      aws_msk_cluster.msk_cluster.arn,
      "arn:aws:kafka:*:${var.current_account_id}:topic/${local.cluster_name}/*/*"
    ]
  }
}

Affected Resource(s) and/or Data Source(s)

aws_msk_cluster, aws_msk_cluster_policy

Potential Terraform Configuration

No response

References

No response

Would you like to implement a fix?

None

justinretzolk commented 11 months ago

Hey @malamin 👋 Thank you for taking the time to raise this! I found #31062, which seems to reference the relatively new aws_msk_vpc_connection resource as an answer for this. Admittedly, I'm not familiar enough to know; can you review the linked issue (and its comments) and the linked resource documentation, and let me know if that covers what you're looking for?

malamin commented 11 months ago

Hi :) That resource lets you create a connection to an existing MSK cluster from another VPC in another account. To use it, though, multi-VPC connectivity first needs to be enabled on the MSK cluster itself (and it would be great to be able to do that with Terraform).
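For context, a minimal sketch of what the client-account side might look like with aws_msk_vpc_connection. All variable names are hypothetical and do not come from this thread. The sketch assumes multi-VPC connectivity is already turned on for the target cluster and that its cluster policy grants this account access; without that, the apply is expected to fail.

```hcl
# Hypothetical inputs for the client account; none of these names come from the thread.
variable "target_cluster_arn" {
  type        = string
  description = "ARN of the MSK cluster in the owning account"
}

variable "client_vpc_id" {
  type = string
}

variable "client_subnet_ids" {
  type = list(string)
}

variable "client_security_group_ids" {
  type = list(string)
}

# Managed VPC connection from the client VPC to the remote cluster.
# The authentication type has to match a scheme that the cluster owner
# enabled for multi-VPC connectivity (IAM is assumed here).
resource "aws_msk_vpc_connection" "client" {
  authentication     = "SASL_IAM"
  target_cluster_arn = var.target_cluster_arn
  vpc_id             = var.client_vpc_id
  client_subnets     = var.client_subnet_ids
  security_groups    = var.client_security_group_ids
}
```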

naviat commented 9 months ago

@justinretzolk Should we consider automating manual steps such as enabling multi-VPC with Terraform? @malamin Have you found any way to do this with Terraform?

cobbr2 commented 9 months ago

Definitely want to be able to do it from Terraform myself. Any interface in AWS that involves a long UPDATING wait (which this does) really needs to be automated. Also, the aws_msk_cluster_policy resource doesn't work on provisioned clusters until this has been done (serverless clusters seem to get this work done for free); until then it fails with:

 Error: setting MSK Cluster Policy (arn:aws:kafka:STUFF:cluster/common-blue/ANDNONNSENSE): operation error Kafka: PutClusterPolicy, https response error StatusCode: 400, RequestID: 7755396f-74b4-4eae-860a-9ef80efea1df, BadRequestException: The cluster must have multi-VPC private connectivity enabled for its cluster policy.

It appears the correct API is documented at https://docs.aws.amazon.com/msk/1.0/apireference/clusters-clusterarn-security.html (we want to update the VpcConnectivityInfo).

cobbr2 commented 8 months ago

That problem with setting the MSK Cluster Policy appears to me to be a red herring. I saw it also with MSK Serverless, which does not have the multi-VPC requirement (it does it by default). I think the problem in our case is that we were starting deployment of two MSK replicators while the policy was still being set up. If we make the replicators depend on the policies (instead of just the clusters), we reliably set the policies. (We don't reliably set up working replicators, but that'll be another issue when AWS support and I can finally figure out why.)
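The ordering fix described above can be sketched roughly as follows. This is only an illustration under stated assumptions: every variable name, resource label, and the replicator name are hypothetical, and the aws_msk_replicator argument names are written from memory of the provider documentation, so check them against the provider version in use.

```hcl
# Hypothetical inputs; nothing here is taken from the thread.
variable "source_cluster_arn" { type = string }
variable "target_cluster_arn" { type = string }
variable "source_policy_json" { type = string }
variable "target_policy_json" { type = string }
variable "replicator_role_arn" { type = string }
variable "source_subnet_ids" { type = list(string) }
variable "target_subnet_ids" { type = list(string) }
variable "source_security_group_ids" { type = list(string) }
variable "target_security_group_ids" { type = list(string) }

resource "aws_msk_cluster_policy" "source" {
  cluster_arn = var.source_cluster_arn
  policy      = var.source_policy_json
}

resource "aws_msk_cluster_policy" "target" {
  cluster_arn = var.target_cluster_arn
  policy      = var.target_policy_json
}

resource "aws_msk_replicator" "example" {
  replicator_name            = "example-replicator"
  service_execution_role_arn = var.replicator_role_arn

  # The replicator only references cluster ARNs, so Terraform would not
  # otherwise wait for the policies. The explicit dependency forces that order.
  depends_on = [
    aws_msk_cluster_policy.source,
    aws_msk_cluster_policy.target,
  ]

  kafka_cluster {
    amazon_msk_cluster {
      msk_cluster_arn = var.source_cluster_arn
    }
    vpc_config {
      subnet_ids          = var.source_subnet_ids
      security_groups_ids = var.source_security_group_ids
    }
  }

  kafka_cluster {
    amazon_msk_cluster {
      msk_cluster_arn = var.target_cluster_arn
    }
    vpc_config {
      subnet_ids          = var.target_subnet_ids
      security_groups_ids = var.target_security_group_ids
    }
  }

  replication_info_list {
    source_kafka_cluster_arn = var.source_cluster_arn
    target_kafka_cluster_arn = var.target_cluster_arn
    target_compression_type  = "NONE"

    topic_replication {
      topics_to_replicate = [".*"]
    }

    consumer_group_replication {
      consumer_groups_to_replicate = [".*"]
    }
  }
}
```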

dabmajor commented 7 months ago

The aws_msk_vpc_connection resource appears to be for creating a Managed VPC Connection. As I understand it, a Managed VPC Connection is different than enabling multi-VPC connectivity on a single cluster's network configuration.

fedeostrit commented 6 months ago

It is terrible that this cannot be enabled with Terraform. Exactly as the issue says, there is no option for it. It belongs in the "aws_msk_cluster" resource, as something like "multi-vpc = true/false" or "multi-vpc = enabled/disabled".

nairb commented 6 months ago

I found this thread while attempting to use MSK as a source for OpenSearch Ingestion, which requires multi-VPC to be enabled. You can turn it on with Terraform like so:

resource "aws_msk_cluster" "msk_cluster" {
  ...
  broker_node_group_info {
    connectivity_info {
      # Enables multi-VPC private connectivity (PrivateLink) with IAM auth
      vpc_connectivity {
        client_authentication {
          sasl {
            iam = true
          }
        }
      }
    }
  }
  ...
}

dabmajor commented 6 months ago

Based on the testing I have seen, this is not a complete solution and is still dependent on manual configuration in the AWS console.

nairb commented 6 months ago

All of my resources were created via Terraform with no manual intervention, and OSIS was able to create the VPC connection to the MSK cluster.

sagar89jadhav commented 4 months ago

I did try the above configuration. Yes, it worked, but Terraform took almost 1 hr 15 min to update the PrivateLink / multi-VPC settings.
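Putting the pieces of this thread together, a fuller sketch of the cluster might look like the following. It is an illustration under stated assumptions, not a verified configuration: the variable names, the Kafka version, the instance type, and the timeout values are hypothetical. The timeouts block is included only because the update reportedly runs for over an hour; the provider's default update timeout may already cover that, so treat the values as headroom.

```hcl
# Hypothetical inputs; adjust to the real environment.
variable "cluster_name" { type = string }
variable "private_subnet_ids" { type = list(string) }
variable "security_group_ids" { type = list(string) }

resource "aws_msk_cluster" "msk_cluster" {
  cluster_name           = var.cluster_name
  kafka_version          = "3.5.1" # assumed version, not from the thread
  number_of_broker_nodes = length(var.private_subnet_ids)

  broker_node_group_info {
    instance_type   = "kafka.m5.large" # assumed instance type
    client_subnets  = var.private_subnet_ids
    security_groups = var.security_group_ids

    connectivity_info {
      # Multi-VPC private connectivity (PrivateLink), IAM auth only.
      vpc_connectivity {
        client_authentication {
          sasl {
            iam = true
          }
        }
      }
    }
  }

  # Cluster-level authentication kept consistent with the multi-VPC setting.
  client_authentication {
    unauthenticated = false
    sasl {
      iam = true
    }
  }

  # Generous limits for the long UPDATING phase reported above.
  timeouts {
    create = "2h"
    update = "3h"
  }
}
```

With a cluster defined this way, an aws_msk_cluster_policy that references aws_msk_cluster.msk_cluster.arn is ordered after the connectivity update through the implicit dependency on the cluster resource.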

dabmajor commented 4 months ago

I will try this out again with a new cluster to see whether it actually aligns with our use case.
