Closed Dima1224 closed 4 years ago
Thanks for the extension suggestion!
I'm working on adding a FULL_OVERWRITE replication mode which, when specified in the config file, will drop the existing target table and replace it with a copy of the source table, keeping the same name as the dropped target table. Effectively, this does a full update of data and metadata each time.
Is this what you're after?
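For illustration, here is a rough sketch of what selecting this mode in the config file might look like. The key names (table-replications, source-table, replica-table, replication-mode) are assumed to follow the existing Circus Train YAML layout and should be checked against the README for the release that ships the mode. The database name, table name, and location are hypothetical placeholders.

```yaml
# Sketch only: key names assumed from the existing Circus Train YAML layout.
table-replications:
  - source-table:
      database-name: dev_db            # hypothetical source database
      table-name: events_in_progress   # hypothetical table still being iterated on
    replica-table:
      table-location: s3://example-bucket/dev_db/events_in_progress   # hypothetical location
    # Drop the existing target table and replace it with a fresh copy
    # of the source table (data and metadata) on every run.
    replication-mode: FULL_OVERWRITE
```

Once the table's schema has settled, changing the last line back to `replication-mode: FULL` would return that table to the regular behaviour.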
Yep, that sounds like it would do the trick. One thing I'd keep in mind is that this will likely be the mode for tables that are being actively iterated on. At some point that table will be hardened and the consumer will likely want to switch the replication mode to "regular."
Great! And yep, that's fine: when you no longer want to overwrite the target each time, you can just use the FULL replication mode as before.
The PRs relating to this issue have both been merged, so I will now close this ticket.
Is your feature request related to a problem? Please describe.
Early on in the development lifecycle, it is common to change a table's schema in non-backwards-compatible ways and repopulate the table from scratch. This could also happen with a mature table, though it would be much rarer.
Describe the solution you'd like
I'd like Circus Train to support this use case and replicate the update. As a user of Circus Train, I am not expecting it to protect me from inadvertent breaking changes; I just expect it to copy data and metadata from one place to the next.
If there are CT users who depend on CT to protect them from breaking schema changes, I propose we add some metadata to indicate which tables should be protected and which shouldn't. This can be thought of as a dev/prod distinction, or something along the lines of safe/unsafe. CT could still attempt to detect partitions that weren't updated to match the schema change and fail to copy if such partitions exist, though I'd argue that this isn't the job of the copy tool but rather of the Data Lake tooling surrounding the upstream schema change.
This has been discussed with @massdosage