hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

DMS Endpoint Re-design #23506

Open YakDriver opened 2 years ago

YakDriver commented 2 years ago

Description

Driving Force: AWS has hinted (in the Console) that it will deprecate extra_connection_attributes. In addition, having two ways of providing the exact same information is problematic (see below): 1) a single long string carrying many attributes (e.g., extra_connection_attributes = "bucketFolder=value;bucketName=value;...") that maps to 2) many Terraform arguments (e.g., s3_settings.0.bucket_folder = "value" and s3_settings.0.bucket_name = "value").
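To make the duplication concrete, here is a minimal sketch of the same two S3 settings expressed both ways on the existing aws_dms_endpoint resource. The resource labels, bucket values, and the dms_role_arn variable are placeholders chosen for illustration, not taken from a real configuration.

```hcl
# Hypothetical input: ARN of an IAM role that DMS can assume to reach the bucket.
variable "dms_role_arn" {
  type = string
}

# Way 1: bucket settings packed into one semicolon-separated string.
resource "aws_dms_endpoint" "via_string" {
  endpoint_id   = "example-via-string"
  endpoint_type = "target"
  engine_name   = "s3"

  extra_connection_attributes = "bucketFolder=folder;bucketName=example-bucket"

  s3_settings {
    service_access_role_arn = var.dms_role_arn
  }
}

# Way 2: the same settings as individual Terraform arguments.
resource "aws_dms_endpoint" "via_arguments" {
  endpoint_id   = "example-via-arguments"
  endpoint_type = "target"
  engine_name   = "s3"

  s3_settings {
    bucket_folder           = "folder"
    bucket_name             = "example-bucket"
    service_access_role_arn = var.dms_role_arn
  }
}
```

Both forms describe the same endpoint settings, so the provider has to reconcile the string with the structured arguments whenever it reads the endpoint back.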

Problems:

Previous Design: Previously, we could fit many endpoint types into the same resource because we did not have individual Terraform arguments corresponding to each of the extra_connection_attributes. Without extra_connection_attributes, a single resource would need arguments for about 259 endpoint-specific attributes (159 unique attributes).

New Design Plan:

  1. We add new endpoint-specific resources (see below), each of which will have a reasonable number of arguments.
  2. The new resources will not have extra_connection_attributes. This means each attribute will need a Terraform argument. This allows us to avoid the mapping back and forth and makes it easier to handle the AWS API inconsistencies (e.g., caps to lowercase).
  3. The current endpoint resource (aws_dms_endpoint) will continue to work as before, but we will deprecate these arguments: elasticsearch_settings, extra_connection_attributes, kafka_settings, kinesis_settings, mongodb_settings, and s3_settings.
  4. The current endpoint resource will still be used for some endpoint types because they either have no corresponding extra_connection_attributes or use only the set of common arguments: aurora, azuredb, db2, dynamodb, mariadb, and sybase.
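As an illustration of item 4, the following sketch shows an Aurora source endpoint that stays on aws_dms_endpoint and uses only the common arguments. The endpoint ID, host name, database name, user name, and the db_password variable are hypothetical placeholders.

```hcl
# Hypothetical input: password for the database user that DMS connects as.
variable "db_password" {
  type      = string
  sensitive = true
}

# An endpoint type that needs no endpoint-specific settings block
# and no extra_connection_attributes string.
resource "aws_dms_endpoint" "aurora_source" {
  endpoint_id   = "example-aurora"
  endpoint_type = "source"
  engine_name   = "aurora"

  server_name   = "aurora.example.internal"
  port          = 3306
  database_name = "exampledb"
  username      = "dms_user"
  password      = var.db_password
  ssl_mode      = "none"
}
```

Because every value here is a common argument, nothing in this configuration would be affected by the deprecations listed in item 3.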

New or Affected Resource(s)

Potential Terraform Configuration

resource "aws_dms_s3_endpoint" "test" {
  endpoint_id   = "example"
  endpoint_type = "target"
  ssl_mode      = "none"

  tags = {
    Name   = "example"
    Update = "to-update"
    Remove = "to-remove"
  }

  add_column_name                             = true
  bucket_folder                               = "folder"
  bucket_name                                 = "updated_name"
  canned_acl_for_objects                      = "private"
  cdc_inserts_and_updates                     = true
  cdc_max_batch_interval                      = 100
  cdc_min_file_size                           = 16
  cdc_path                                    = "cdc/path"
  compression_type                            = "GZIP"
  csv_delimiter                               = ";"
  csv_no_sup_value                            = "x"
  csv_null_value                              = "?"
  csv_row_delimiter                           = "\\r\\n"
  data_format                                 = "parquet"
  data_page_size                              = 1100000
  date_partition_delimiter                    = "SLASH"
  date_partition_enabled                      = true
  date_partition_sequence                     = "yyyymmddhh"
  date_partition_timezone                     = "America/New_York"
  dict_page_size_limit                        = 1000000
  enable_statistics                           = false
  encoding_type                               = "plain"
  encryption_mode                             = "SSE_S3"
  external_table_definition                   = "etd"
  ignore_header_rows                          = 1
  include_op_for_full_load                    = true
  max_file_size                               = 1000000
  parquet_timestamp_in_millisecond            = true
  parquet_version                             = "parquet-2-0"
  preserve_transactions                       = false
  rfc_4180                                    = true
  row_group_length                            = 11000
  service_access_role_arn                     = aws_iam_role.iam_role.arn
  timestamp_column_name                       = "tx_commit_time"
  use_csv_no_sup_value                        = true
  use_task_start_time_for_full_load_timestamp = true

  depends_on = [aws_iam_role_policy.dms_s3_access]
}

samoclay commented 1 year ago

It would be great to hear any news on when this could be available.

slords commented 2 months ago

Any update on this? Oracle endpoints are currently broken because extra_connection_attributes (at least for sources) don't get applied. #20397 was raised because this doesn't work, but it was pointed here as the solution. As this hasn't been implemented yet, there is no solution for Oracle endpoints. Any update that would provide guidance on how to solve this today would be appreciated.

evbo commented 1 month ago

There is currently no way to set MySQL settings. It would be great to have this or @mdjward's workaround released:

https://docs.aws.amazon.com/dms/latest/APIReference/API_MySQLSettings.html