ExpediaGroup / circus-train

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
Apache License 2.0
86 stars, 15 forks

Full overwrite replication #185

Closed: JayGreeeen closed this 4 years ago

JayGreeeen commented 4 years ago

:pencil: Description

Added a replication mode, FULL_OVERWRITE, which overwrites an existing replica table with the source table. This mode is intended for use early in the development cycle, when the schema may still be subject to incompatible changes.

:link: Related Issues

#183

JayGreeeen commented 4 years ago

We've decided that we do want this replication mode to delete the data as well as the replica table, but that is a slightly bigger change, so we will split it out into a separate PR.

So for now, this mode will only drop the existing replica table and not touch any data.
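
To make the interim behaviour concrete, here is a rough toy model of it in Python. It is not Circus Train code: `ToyReplicaCatalog`, `full_overwrite`, the table name, and the `s3://replica/events` location are all hypothetical, and recreating the table from the source definition is an assumption about what "overwrite" means here. The sketch only shows that the replica table definition is dropped and replaced while the files at the replica location are left alone.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReplicaCatalog:
    """Hypothetical stand-in for a replica metastore plus its file storage."""

    # Table name -> table definition (metadata only).
    tables: dict = field(default_factory=dict)
    # Data location -> list of file names stored there.
    files: dict = field(default_factory=dict)

    def full_overwrite(self, name, source_schema, location):
        # Drop the existing replica table definition, if there is one.
        # Only metadata is removed; self.files is deliberately not touched.
        self.tables.pop(name, None)
        # Assumption: the table is then recreated from the source definition.
        self.tables[name] = {"schema": list(source_schema), "location": location}


# A replica table with an old schema and one existing data file.
catalog = ToyReplicaCatalog(
    tables={"db.events": {"schema": ["id:int"], "location": "s3://replica/events"}},
    files={"s3://replica/events": ["part-0000"]},
)

# Overwrite with an incompatible source schema (id changes from int to string).
catalog.full_overwrite("db.events", ["id:string", "ts:timestamp"], "s3://replica/events")
```

After the call, `catalog.tables["db.events"]` holds the new schema, while `catalog.files` still lists the old `part-0000` file. Cleaning up that leftover data is what the follow-up PR is meant to cover.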