ExpediaGroup / circus-train

Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
Apache License 2.0

Added a fix for generating the target partition path if base and part… #213

Closed patduin closed 3 years ago

patduin commented 3 years ago

fixes #212

patduin commented 3 years ago

Going to close this. It is not going to work correctly for some use cases (e.g. partition x is in bucket x and partition y is in bucket /y/z), and it might lead to us not getting a notification and to data being in a different location than advertised in the Metastore. I'll update the issue with my findings and rewrite it to better capture the requirements. Unfortunately that's a lot more work; there is no quick fix, IMO.