hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws

[Enhancement]: Add support for Redshift Serverless as a destination for aws_kinesis_firehose_delivery_stream #39307

Open si-robinson opened 2 months ago

si-robinson commented 2 months ago

Description

AWS has started supporting Redshift Serverless as a delivery destination for Kinesis Data Firehose. The destination needs to reference the serverless workgroup rather than a standalone cluster (see the attached screenshot, Firehose_RS_Serverless_Config).

I have set this up through the console and confirmed that it works.

Affected Resource(s) and/or Data Source(s)

aws_kinesis_firehose_delivery_stream

Potential Terraform Configuration

The usual redshift_configuration could be replaced with a redshift_serverless_configuration that would look almost identical, except that:
1. instead of `cluster_jdbcurl`, a reference to the serverless workgroup ARN would be supplied
2. an additional property for the database name within the namespace would be required
3. there is now an option to use Secrets Manager rather than `username` and `password`, so perhaps an abstraction around that where you can supply either `credentials` or `managed_credentials`

So, maybe something like the below.
I've included examples of what I think both credential options might look like. If using plain `credentials`, there would be no need for the `aws_secretsmanager_secret.redshift-access` block.
Pretty much everything else would mirror the properties of the current `redshift_configuration`.

# Create a Kinesis Stream
resource "aws_kinesis_stream" "event_stream" {
  # ...
}

# Create a Redshift Serverless namespace
resource "aws_redshiftserverless_namespace" "redshift-serverless-storage" {
  # ...
}

# Create a Redshift Serverless workgroup
resource "aws_redshiftserverless_workgroup" "redshift-serverless-compute" {
  # ...
}

# Create processing Lambda
resource "aws_lambda_function" "processing-lambda" {
  # ...
}

# Create Firehose Role
resource "aws_iam_role" "firehose-role" {
  # ...
}

# Create Lambda Role
resource "aws_iam_role" "lambda-role" {
  # ...
}

# Secrets Manager credentials for database
resource "aws_secretsmanager_secret" "redshift-access" {
  # ...
}

resource "aws_kinesis_firehose_delivery_stream" "stream" {
  name        = "firehose-name"
  destination = "redshift"

  kinesis_source_configuration {
    kinesis_stream_arn = aws_kinesis_stream.event_stream.arn
    role_arn           = aws_iam_role.firehose-role.arn
  }

  redshift_serverless_configuration {
    role_arn           = aws_iam_role.firehose-role.arn
    workgroup_arn      = aws_redshiftserverless_workgroup.redshift-serverless-compute.arn
    database_name      = "myDb"

    # Either plain credentials...
    credentials {
      username = var.username
      password = var.password
    }

    # ...or, alternatively, managed credentials via Secrets Manager:
    # managed_credentials {
    #   secret_arn = aws_secretsmanager_secret.redshift-access.arn
    # }

    retry_duration     = var.retry-duration
    data_table_name    = "mySchema.myTable"
    copy_options       = "json 's3://delivery/jsonpaths.json' region 'eu-west-1' TRUNCATECOLUMNS TIMEFORMAT 'auto'"
    data_table_columns = "field1,field2,field3"

    s3_configuration {
      role_arn           = aws_iam_role.firehose-role.arn
      bucket_arn         = var.s3-bucket-arn
      prefix             = var.s3-prefix
      buffering_size     = 10
      buffering_interval = 60
      compression_format = "UNCOMPRESSED"

      cloudwatch_logging_options {
        enabled         = true
        log_group_name  = "/aws/kinesisfirehose/loggroup"
        log_stream_name = "S3Delivery"
      }
    }

    processing_configuration {
      enabled = "true"
      processors {
        type = "Lambda"
        parameters {
          parameter_name  = "LambdaArn"
          parameter_value = "${aws_lambda_function.processing-lambda.arn}:$LATEST"
        }
      }
    }

    cloudwatch_logging_options {
      enabled         = true
      log_group_name  = var.delivery-loggroup-name
      log_stream_name = var.log-stream-name
    }
  }

  tags = {
    Environment = "${var.environment}-${var.region}"
    Name        = "firehose-name"
  }

}
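
For completeness, here is a rough sketch of how the `aws_secretsmanager_secret.redshift-access` stub above might be populated for the `managed_credentials` option. The secret shape (a JSON object with `username` and `password` keys) is an assumption on my part and would need to be confirmed against whatever the Firehose API expects for Redshift Serverless secrets.

resource "aws_secretsmanager_secret" "redshift-access" {
  name = "firehose-redshift-serverless-access"
}

resource "aws_secretsmanager_secret_version" "redshift-access" {
  secret_id = aws_secretsmanager_secret.redshift-access.id

  # Assumed secret shape: a JSON object with username/password keys
  secret_string = jsonencode({
    username = var.username
    password = var.password
  })
}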

References

This was first requested in March 2023 but at that time AWS did not support Redshift Serverless as a Kinesis destination: https://github.com/hashicorp/terraform/issues/32794

Looks like AWS added support in June 2023: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-kinesis-data-firehose-data-stream-delivery-redshift-serverless/

I'm having trouble finding definitive AWS documentation on this, but this page makes reference to it being supported: https://docs.aws.amazon.com/firehose/latest/dev/basic-deliver.html

Would you like to implement a fix?

None

github-actions[bot] commented 2 months ago

Community Note

Voting for Prioritization

Volunteering to Work on This Issue

si-robinson commented 2 months ago

It also looks like the Secrets Manager option for credentials is available for standalone clusters.
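
If that were surfaced in the provider too, the existing `redshift_configuration` could presumably accept the same (hypothetical) `managed_credentials` block in place of `username`/`password`. A minimal sketch, assuming the attribute names from the proposal above and a provisioned cluster declared elsewhere as `aws_redshift_cluster.example`:

resource "aws_kinesis_firehose_delivery_stream" "cluster_stream" {
  name        = "firehose-cluster"
  destination = "redshift"

  redshift_configuration {
    role_arn        = aws_iam_role.firehose-role.arn
    cluster_jdbcurl = "jdbc:redshift://${aws_redshift_cluster.example.endpoint}/${aws_redshift_cluster.example.database_name}"
    data_table_name = "mySchema.myTable"

    # Hypothetical block, mirroring the serverless proposal above,
    # used instead of the current username/password arguments
    managed_credentials {
      secret_arn = aws_secretsmanager_secret.redshift-access.arn
    }

    s3_configuration {
      role_arn   = aws_iam_role.firehose-role.arn
      bucket_arn = var.s3-bucket-arn
    }
  }
}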