cloudera-labs / hms-mirror

"hms-mirror" is a utility used to bridge the gap between two clusters and migrate hive metadata.
Apache License 2.0

Provide some flexibility on partition writes during migrations #53

Closed dstreev closed 1 year ago

dstreev commented 1 year ago

We already have options to turn dynamic optimizations on or off, or to specify prescriptive optimizations with DISTRIBUTE BY.

When partitions are large, each partition is written by only one writer, which can take a long time depending on the partition's size.

When partitions are large, we need to be able to add an extra level to the 'DISTRIBUTE BY' clause so that more writers are used per partition.
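To illustrate the idea, here is a minimal sketch of what an extra 'DISTRIBUTE BY' level could look like. It is not hms-mirror's actual SQL generation: the helper name `build_distribute_by`, its parameters, and the example column names are all hypothetical. It assumes the extra level is a hash bucket built from Hive's `hash` and `pmod` functions over some reasonably high-cardinality column, so that rows of one partition can be spread over several reducers instead of one.

```python
def build_distribute_by(partition_cols, spread_col=None, writers_per_partition=1):
    """Build a Hive DISTRIBUTE BY clause (hypothetical helper, not hms-mirror code).

    partition_cols:        partition columns of the target table
    spread_col:            optional column hashed to form the extra level
    writers_per_partition: number of hash buckets per partition
    """
    if not partition_cols:
        raise ValueError("at least one partition column is required")

    # Base case: distribute by the partition columns only,
    # which sends all rows of a partition to a single writer.
    keys = [f"`{col}`" for col in partition_cols]

    # Extra level: a bucket number in [0, writers_per_partition),
    # letting up to that many writers share one partition.
    if spread_col is not None and writers_per_partition > 1:
        keys.append(f"pmod(hash(`{spread_col}`), {writers_per_partition})")

    return "DISTRIBUTE BY " + ", ".join(keys)


# Assumed example columns: a table partitioned by `dt`, spread on `order_id`.
print(build_distribute_by(["dt"]))
print(build_distribute_by(["dt"], spread_col="order_id", writers_per_partition=4))
```

With the extra level, each partition ends up with up to `writers_per_partition` output files rather than one, so the value is a trade-off between write time and file count. How many writers actually run in parallel still depends on the reducer count the engine chooses.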