GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

Support populating Spanner source database ID, instance ID and a custom metadata field in Spanner CDC to Pub/Sub template #1769

Open ShuranZhang opened 1 month ago

ShuranZhang commented 1 month ago

Introduce two new options for the Spanner Change Streams to Pub/Sub template.

  1. includeSpannerSource: boolean option, defaults to false. If set to true, two new fields, spannerDatabaseId and spannerInstanceId, are added to the message data field and populated with the associated values.

  2. outputMessageData: string option, defaults to empty. If set to a non-empty string, that value is written to the outputMessageMetadata field in the body of the output Pub/Sub message.
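
For reviewers, the intended behavior of the two options can be sketched roughly as below. This is a minimal illustration and not the template's actual code: the class `SpannerSourceFieldsSketch`, the method `enrich`, and the sample values (`Singers`, `my-db`, `my-instance`, `team-a`) are hypothetical. The message is modeled as a flat map for simplicity, which is an assumption about the layout. Only the field names spannerDatabaseId, spannerInstanceId and outputMessageMetadata come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SpannerSourceFieldsSketch {

  /**
   * Hypothetical helper: copies the record fields and conditionally adds the new fields.
   * The flat-map message layout is an assumption made for illustration only.
   */
  public static Map<String, Object> enrich(
      Map<String, Object> record,
      boolean includeSpannerSource,
      String spannerDatabaseId,
      String spannerInstanceId,
      String outputMessageMetadata) {
    // Copy so the caller's record is left untouched.
    Map<String, Object> out = new LinkedHashMap<>(record);

    // Option 1: source identifiers are added only when the flag is true.
    if (includeSpannerSource) {
      out.put("spannerDatabaseId", spannerDatabaseId);
      out.put("spannerInstanceId", spannerInstanceId);
    }

    // Option 2: the custom metadata field is added only for a non-empty value.
    if (outputMessageMetadata != null && !outputMessageMetadata.isEmpty()) {
      out.put("outputMessageMetadata", outputMessageMetadata);
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("tableName", "Singers");

    // Defaults (false, empty string): the message is unchanged.
    System.out.println(enrich(record, false, "my-db", "my-instance", ""));

    // Both options set: three extra fields are present.
    System.out.println(enrich(record, true, "my-db", "my-instance", "team-a"));
  }
}
```

With the defaults, the output matches the input record, so existing consumers should see no change unless the options are explicitly set.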

I verified correctness for both AVRO and JSON data formats through integration tests in SpannerChangeStreamsToPubSubIT.java.