googlegenomics / gcp-variant-transforms

GCP Variant Transforms
Apache License 2.0

Append must be an atomic operation #518

Open samanvp opened 5 years ago

samanvp commented 5 years ago

When we append to an existing table (whether partitioned or not), the append must be atomic: either the run fails without appending any data to the destination table, or it succeeds and appends all of the new data.

If Dataflow does not guarantee atomic appends, we need to do this "manually": write the output to a temporary BQ table and, after a successful run of VT, append that table to the existing destination table.
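
A minimal sketch of the "manual" approach, assuming the `google-cloud-bigquery` Python client and relying on BigQuery applying a copy job's write disposition atomically. The names `atomic_append`, `append_job_config`, `run_pipeline`, and `make_config` are hypothetical and not part of Variant Transforms. It also assumes the temporary table's schema is compatible with the destination's.

```python
from typing import Any, Callable


def append_job_config() -> Any:
    """Build a copy-job config that appends to the destination table."""
    # Imported lazily so the orchestration logic below can be exercised
    # with a stub client when google-cloud-bigquery is not installed.
    from google.cloud import bigquery

    return bigquery.CopyJobConfig(
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND
    )


def atomic_append(
    client: Any,
    run_pipeline: Callable[[str], None],
    temp_table: str,
    dest_table: str,
    make_config: Callable[[], Any] = append_job_config,
) -> None:
    """Write to temp_table first, then append it to dest_table in one copy job.

    run_pipeline is a hypothetical callable that runs the pipeline and
    writes all of its output to the table ID it is given.
    """
    try:
        # If the pipeline raises, the destination table is never touched.
        run_pipeline(temp_table)
        # A single copy job: either all rows are appended or none are.
        job = client.copy_table(temp_table, dest_table, job_config=make_config())
        job.result()  # Block until the job finishes; raises on failure.
    finally:
        # Clean up the temporary table whether or not the append succeeded.
        client.delete_table(temp_table, not_found_ok=True)
```

With a real client this would be called as `atomic_append(bigquery.Client(), run_vt, "project.dataset.tmp", "project.dataset.dest")`, where `run_vt` is a placeholder for however the pipeline is launched.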

samanvp commented 4 years ago

Possible solution: https://cloud.google.com/bigquery/docs/managing-tables#undeletetable
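
One reading of this suggestion, sketched under assumptions: record a timestamp before the append starts and, if the run fails partway, recover the table's earlier state through a snapshot decorator (`table@<epoch_ms>`), as the linked page does for deleted tables. The helper names below are hypothetical, the client is assumed to be a `google.cloud.bigquery.Client`, and recovery only works while the timestamp is still inside BigQuery's time travel window.

```python
import time
from typing import Any


def snapshot_id(table_id: str, epoch_ms: int) -> str:
    """Return the snapshot-decorator form of a table ID at a point in time."""
    return f"{table_id}@{epoch_ms}"


def restore_snapshot(
    client: Any, table_id: str, epoch_ms: int, recovered_table_id: str
) -> None:
    """Copy the state of table_id as of epoch_ms into recovered_table_id."""
    job = client.copy_table(snapshot_id(table_id, epoch_ms), recovered_table_id)
    job.result()  # Block until the copy finishes; raises on failure.


def now_ms() -> int:
    """Current time in milliseconds since the epoch, taken before appending."""
    return int(time.time() * 1000)
```

The caller would take `before = now_ms()` ahead of the append and call `restore_snapshot(client, dest, before, recovered)` only if the run fails. Unlike the temporary-table approach, this repairs a partial append after the fact rather than preventing it.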