apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0
7.81k stars 4.23k forks

[Feature Request]: Mechanism to enable BigQuery IO to handle table schema changes while a streaming job is running #25356

Open ragyabraham opened 1 year ago

ragyabraham commented 1 year ago

What would you like to happen?

Currently, the BigQuery (BQ) IO cannot handle table schema changes: if a table's schema changes while a streaming job is running, the IO returns an error and the rows are dropped. We propose a new mechanism that enables the BQ IO to handle a schema change:

  1. On startup, check whether the table schema is up to date; if it is not, update the current schema.
  2. Upon receiving a schema error, check whether the schema supplied to the BQ IO step contains any columns that are not in the existing table schema; if it does, add the missing column(s) to the table.
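
To make step 2 concrete, here is a minimal sketch of the column comparison, written as plain Python with no Beam or BigQuery dependency. The helper names `find_missing_columns` and `merged_schema` are hypothetical and not part of the BQ IO; fields are assumed to be dicts shaped like BigQuery's JSON schema entries (`name`, `type`, `mode`). It also assumes that column names compare case-insensitively and that newly added columns cannot be `REQUIRED`, so they are relaxed to `NULLABLE`.

```python
from typing import Dict, List

# Hypothetical helpers for illustration only; not part of Beam's BigQuery IO.
# A field is a dict shaped like a BigQuery JSON schema entry:
# {"name": ..., "type": ..., "mode": ...}
Field = Dict[str, str]


def find_missing_columns(table_schema: List[Field],
                         supplied_schema: List[Field]) -> List[Field]:
    """Return the supplied fields whose names are absent from the table."""
    # Assumption: column names are compared case-insensitively.
    existing = {field["name"].lower() for field in table_schema}
    return [f for f in supplied_schema if f["name"].lower() not in existing]


def merged_schema(table_schema: List[Field],
                  supplied_schema: List[Field]) -> List[Field]:
    """Return the table schema with the missing columns appended."""
    additions = []
    for field in find_missing_columns(table_schema, supplied_schema):
        # Assumption: a column added to an existing table cannot be REQUIRED,
        # so anything that is not REPEATED is relaxed to NULLABLE.
        mode = "REPEATED" if field.get("mode") == "REPEATED" else "NULLABLE"
        additions.append({**field, "mode": mode})
    # Existing columns keep their order and definitions; new ones go last.
    return [dict(f) for f in table_schema] + additions


table = [{"name": "id", "type": "INTEGER", "mode": "REQUIRED"}]
supplied = [
    {"name": "ID", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "email", "type": "STRING", "mode": "REQUIRED"},
]
new_schema = merged_schema(table, supplied)
```

The merged schema would then be applied to the table before the failed rows are retried; how that update and retry are wired into the IO is left open here.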

@tomlynchRNA @ahmedabu98

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

ragyabraham commented 1 year ago

.take-issue