GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

Terraform Sample: Live migration all infrastructure setup #1669

Closed manitgupta closed 2 weeks ago

manitgupta commented 3 weeks ago

This PR adds samples for common scenarios users might have while trying to run a live migration to Spanner.

Each sample contains the following files (and potentially more):

  1. main.tf - This contains the Terraform resources which will be created.
  2. outputs.tf - This declares the values that are output after running the Terraform example.
  3. variables.tf - This declares the input variables that are required to configure the resources.
  4. terraform.tf - This contains the required providers and APIs/project configurations for the sample.
  5. terraform.tfvars - This contains placeholder inputs that must be populated to run the example.
  6. terraform_simple.tfvars - This contains the minimal set of placeholder inputs that must be populated to run the example.
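
As a rough illustration of how these files relate, the excerpts below show one input flowing from `variables.tf` through `terraform.tfvars` to `outputs.tf`. The variable name `project_id` and the placeholder value are hypothetical and are not taken from the samples in this PR; each commented section belongs in its own file.

```hcl
# --- variables.tf ---
# Declares an input variable (the name "project_id" is an assumption).
variable "project_id" {
  type        = string
  description = "ID of the GCP project in which the resources are created."
}

# --- terraform.tfvars ---
# Supplies a value for the declared input. Replace the placeholder
# before running the example.
project_id = "<YOUR_PROJECT_ID>"

# --- outputs.tf ---
# Exposes a value once `terraform apply` completes.
output "project_id" {
  value       = var.project_id
  description = "The project the sample was deployed into."
}
```

With this layout, `terraform apply` picks up `terraform.tfvars` automatically, while an alternative file such as `terraform_simple.tfvars` has to be passed explicitly with `-var-file`.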

SCENARIO: This Terraform example illustrates launching a live migration job for a MySQL source, setting up all the required cloud infrastructure. Only the source details are needed as input.

Creates the following resources -

  1. Datastream private connection - If configured, a Datastream private connection will be deployed for your configured VPC. If not configured, IP whitelisting will be assumed as the mode of Datastream access.
  2. Source datastream connection profile - This allows Datastream to connect to the MySQL instance (using IP whitelisting).
  3. GCS bucket - A GCS bucket for Datastream to write the source data to.
  4. Target datastream connection profile - The connection profile to configure the created bucket in Datastream.
  5. Pub/Sub topic and subscription - These carry the GCS object notifications, published as files are written to GCS, for consumption by the Dataflow job.
  6. Datastream stream - A Datastream stream which reads from the source specified in the source connection profile and writes the data to the bucket specified in the target connection profile. Note that the stream writes its data under a mandatory prefix path inside the bucket. The default prefix path is data (can be overridden).
  7. Bucket notification - Creates the GCS bucket notification which publishes to the Pub/Sub topic created above. Note that the bucket notification is created on the mandatory prefix path specified for the stream above.
  8. Dataflow job - The Dataflow job which reads from GCS and writes to Spanner.
  9. Permissions - Adds the required roles to the specified (or the default) service accounts for the live migration to work.
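
To make the bucket, Pub/Sub, and notification wiring from items 3, 5, and 7 more concrete, here is a minimal sketch using resources from the Terraform Google provider. It is not the `main.tf` from this PR: all resource names and the bucket name are hypothetical, the `google` provider is assumed to be configured elsewhere (for example in `terraform.tf`), and the trailing slash on the `data/` prefix is an assumption.

```hcl
# Bucket that Datastream writes the source data to.
# The name is a placeholder and must be globally unique.
resource "google_storage_bucket" "datastream_bucket" {
  name                        = "my-live-migration-bucket"
  location                    = "US"
  uniform_bucket_level_access = true
}

# Topic that receives the GCS object notifications.
resource "google_pubsub_topic" "gcs_notifications" {
  name = "live-migration-gcs-notifications"
}

# Subscription from which the Dataflow job would consume notifications.
resource "google_pubsub_subscription" "dataflow_subscription" {
  name  = "live-migration-gcs-notifications-sub"
  topic = google_pubsub_topic.gcs_notifications.id
}

# The project's GCS service agent needs permission to publish to the topic.
data "google_storage_project_service_account" "gcs_account" {}

resource "google_pubsub_topic_iam_member" "gcs_publisher" {
  topic  = google_pubsub_topic.gcs_notifications.id
  role   = "roles/pubsub.publisher"
  member = "serviceAccount:${data.google_storage_project_service_account.gcs_account.email_address}"
}

# Notification scoped to the prefix path the stream writes under.
# "data/" mirrors the default prefix described above; the trailing
# slash is an assumption of this sketch.
resource "google_storage_notification" "bucket_notification" {
  bucket             = google_storage_bucket.datastream_bucket.name
  topic              = google_pubsub_topic.gcs_notifications.id
  payload_format     = "JSON_API_V1"
  event_types        = ["OBJECT_FINALIZE"]
  object_name_prefix = "data/"

  # The notification can only be created once GCS may publish to the topic.
  depends_on = [google_pubsub_topic_iam_member.gcs_publisher]
}
```

Scoping the notification with `object_name_prefix` means that only objects written under the stream's prefix path trigger messages, which is why the prefix has to match the one configured on the stream.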

codecov[bot] commented 3 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 45.45%. Comparing base (f14c62c) to head (dc699fe). Report is 17 commits behind head on main.

:exclamation: Current head dc699fe differs from pull request most recent head c514f26

Please upload reports for the commit c514f26 to get more accurate results.

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##               main    #1669      +/-   ##
============================================
+ Coverage     41.29%   45.45%    +4.16%
+ Complexity     2929      717     -2212
============================================
  Files           769      301      -468
  Lines         44602    16181    -28421
  Branches       4770     1607     -3163
============================================
- Hits          18418     7355    -11063
+ Misses        24636     8286    -16350
+ Partials       1548      540     -1008
```

| [Components](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/1669/components?src=pr&el=components&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | Coverage Δ | |
|---|---|---|
| [spanner-templates](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/1669/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `59.10% <ø> (-2.31%)` | :arrow_down: |
| [spanner-import-export](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/1669/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `∅ <ø> (∅)` | |
| [spanner-live-forward-migration](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/1669/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `73.87% <ø> (ø)` | |
| [spanner-live-reverse-replication](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/1669/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `49.70% <ø> (ø)` | |
| [spanner-bulk-migration](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/1669/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `82.06% <ø> (ø)` | |

[see 491 files with indirect coverage changes](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/1669/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform)