ExpediaGroup / circus-train

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
Apache License 2.0

Make more configuration values available to Copiers #195

Status: Closed (massdosage closed this issue 4 years ago)

massdosage commented 4 years ago

Is your feature request related to a problem? Please describe.

When developing a downstream Copier for Circus Train, we discovered that the copier needs the Hive database and table name in order to retrieve the table's tags. Currently, copiers are only passed information about the paths, which isn't enough for our copier.

Describe the solution you'd like

Ideally, more of the table replication context would be available to the copier, but the immediate need is these two fields: the Hive database name and the table name.
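
To make the request concrete, here is a minimal sketch of what a richer copier input could look like. All names in it (`ReplicationContext`, `ContextAwareCopierFactory`, `qualifiedTableName`) are hypothetical and are not taken from the Circus Train codebase; plain strings stand in for paths so that the example is self-contained.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical value object: bundles what a copier already receives (paths and
// copier options) with the Hive database and table name requested in this issue.
final class ReplicationContext {
  private final String databaseName;
  private final String tableName;
  private final String sourceLocation;
  private final String replicaLocation;
  private final Map<String, Object> copierOptions;

  ReplicationContext(
      String databaseName,
      String tableName,
      String sourceLocation,
      String replicaLocation,
      Map<String, Object> copierOptions) {
    this.databaseName = Objects.requireNonNull(databaseName, "databaseName");
    this.tableName = Objects.requireNonNull(tableName, "tableName");
    this.sourceLocation = Objects.requireNonNull(sourceLocation, "sourceLocation");
    this.replicaLocation = Objects.requireNonNull(replicaLocation, "replicaLocation");
    // Defensive copy so later changes to the caller's map do not leak in.
    this.copierOptions = Collections.unmodifiableMap(new HashMap<>(copierOptions));
  }

  String getDatabaseName() { return databaseName; }
  String getTableName() { return tableName; }
  String getSourceLocation() { return sourceLocation; }
  String getReplicaLocation() { return replicaLocation; }
  Map<String, Object> getCopierOptions() { return copierOptions; }

  // Convenience for copiers that look up table metadata such as tags.
  String qualifiedTableName() {
    return databaseName + "." + tableName;
  }
}

// Hypothetical factory contract: the copier is built from the whole context
// instead of from paths alone. Runnable stands in for a real copier type.
interface ContextAwareCopierFactory {
  Runnable newCopier(ReplicationContext context);
}
```

With a single context object, further fields can be added later without changing the factory signature again, which fits the stated wish for "more of the table replication context".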

Describe alternatives you've considered

The only way we can currently think of to get these values is to pass them in as custom copier options, but this means they are duplicated in the config, which isn't ideal.
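
For comparison, the workaround might look roughly like the sketch below. The option keys and the helper class are hypothetical, invented for illustration only; the only assumption about Circus Train is that a copier can read custom options from a map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating the workaround: the copier recovers the
// database and table name from custom copier options.
final class CopierOptionsWorkaround {
  // Hypothetical option keys; they are not defined by Circus Train.
  static final String DATABASE_KEY = "example.source-database-name";
  static final String TABLE_KEY = "example.source-table-name";

  private CopierOptionsWorkaround() {}

  // Reads a mandatory option and fails fast if the config omitted it.
  static String requiredOption(Map<String, Object> copierOptions, String key) {
    Object value = copierOptions.get(key);
    if (value == null) {
      throw new IllegalArgumentException("Missing copier option: " + key);
    }
    return value.toString();
  }

  static String qualifiedTableName(Map<String, Object> copierOptions) {
    return requiredOption(copierOptions, DATABASE_KEY)
        + "."
        + requiredOption(copierOptions, TABLE_KEY);
  }

  public static void main(String[] args) {
    // These values repeat what the table replication config already states
    // for the source table, which is the duplication described above.
    Map<String, Object> copierOptions = new HashMap<>();
    copierOptions.put(DATABASE_KEY, "sales");
    copierOptions.put(TABLE_KEY, "orders");
    System.out.println(qualifiedTableName(copierOptions));
  }
}
```

Because the same names must be kept in sync in two places, a typo in either one would make the copier look up tags for the wrong table, or fail at runtime with a missing-option error.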