hashicorp / terraform-provider-aws

The AWS Provider enables Terraform to manage AWS resources.
https://registry.terraform.io/providers/hashicorp/aws
Mozilla Public License 2.0

[Enhancement]: Amazon AppFlow support for the "multiple files" aggregation setting #37737

Open nsb413 opened 1 month ago

nsb413 commented 1 month ago

Description

Amazon AppFlow supports the "Aggregate records into multiple files in each partition" option under Aggregation settings. However, Terraform currently supports only None and SingleFile for aggregation_type.

Affected Resource(s) and/or Data Source(s)

aws_appflow_flow

Potential Terraform Configuration

resource "aws_appflow_flow" "this" {
  name        = var.name
  description = var.desc

  source_flow_config {
    connector_type         = "SAPOData"
    connector_profile_name = var.source_connector
    source_connector_properties {
      sapo_data {
        object_path = var.object_path
      }
    }
  }

  destination_flow_config {
    connector_type = "S3"
    destination_connector_properties {
      s3 {
        bucket_name   = var.s3_bucket_name
        bucket_prefix = var.s3_bucket_prefix

        s3_output_format_config {
          file_type = "PARQUET"

          aggregation_config {
            aggregation_type = "MultipleFiles"
            target_file_size = "128"
          }
        }
      }
    }
  }

  task {
    source_fields = [""]
    task_type     = "Map_all"
  }

  trigger_config {
    trigger_type = "OnDemand"
  }
  tags = var.tags
}

References

https://docs.aws.amazon.com/appflow/latest/userguide/flows-partition.html

Would you like to implement a fix?

None

jpowell8 commented 1 week ago

It would be nice to have "MultipleFiles" explicitly included as a configuration option. When I test the module with no aggregation_type set, the resulting flow uses "Aggregate records into multiple files in each partition". So I think all three behaviors are already available, and MultipleFiles appears to be the default when aggregation_type is omitted.

My full destination_flow_config block:

  destination_flow_config {
    connector_type = "S3"
    destination_connector_properties {
      s3 {
        bucket_name   = var.source_bucket_name
        bucket_prefix = "data"

        s3_output_format_config {
          file_type = "JSON"

          prefix_config {
            prefix_type = "PATH"
          }

          aggregation_config {
            target_file_size = "128"
          }
        }
      }
    }
  }
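
Until "MultipleFiles" is accepted explicitly, a module could make the workaround above configurable. This is only a rough sketch: the variable name aggregation_type is hypothetical, and it assumes the behavior observed in this thread, that leaving the attribute unset produces multiple files per partition. It relies on Terraform treating a null argument the same as an omitted one.

```hcl
# Hypothetical module input (name chosen for illustration).
# null means "omit aggregation_type", which, per the observation in this
# thread, results in multiple files per partition.
variable "aggregation_type" {
  type    = string
  default = null

  validation {
    # Only the two values Terraform reportedly accepts today, or null.
    condition     = var.aggregation_type == null ? true : contains(["None", "SingleFile"], var.aggregation_type)
    error_message = "Set aggregation_type to null, \"None\", or \"SingleFile\"."
  }
}
```

Inside aggregation_config, the attribute would then be written as aggregation_type = var.aggregation_type, so callers who leave the variable at its default get the same result as the block above.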