Take this example Oracle partition scheme:
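As a minimal sketch (the SALES table, its columns, and the partition names are hypothetical, chosen to match the 20000000-30000000 DT_KEY bounds used below), the scheme could look like:

CREATE TABLE sales
( dt_key  NUMBER(8) NOT NULL  -- numeric date key, e.g. 20240115
, amount  NUMBER
)
PARTITION BY RANGE (dt_key)
( PARTITION p_2001 VALUES LESS THAN (20020101)
, PARTITION p_2002 VALUES LESS THAN (20030101)
, PARTITION p_2003 VALUES LESS THAN (20040101)
);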
Currently, in addition to lower/upper bounds, we need to pass Offload --partition-granularity to define the partition step, for example:
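A sketch of such an invocation follows; the SH.SALES table is hypothetical, and while --partition-granularity is the option named above, the -t flag and the exact spelling of the bound options are assumptions:

$ offload -t SH.SALES \
    --partition-lower-value=20000000 \
    --partition-upper-value=30000000 \
    --partition-granularity=10000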
This adds a synthetic partition key:

PARTITION BY RANGE_BUCKET(`GOE_PART_10000_DT_KEY`, GENERATE_ARRAY(20000000, 30000000, 10000))

and truncates the synthetic values down to a multiple of 10,000 (e.g. 20240115 becomes 20240000). This is a hangover from Hadoop partitioning.
Instead we should be able to use the real column for partitioning and add a new step option, independent of the granularity (--partition-granularity). Then we could partition by:

PARTITION BY RANGE_BUCKET(`DT_KEY`, GENERATE_ARRAY(20000000, 30000000, 10000))
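For context, a minimal BigQuery DDL using that clause might look like this (the project, dataset, and non-key column are hypothetical):

CREATE TABLE `my_project.my_dataset.SALES`
( DT_KEY INT64 NOT NULL
, AMOUNT NUMERIC
)
PARTITION BY RANGE_BUCKET(DT_KEY, GENERATE_ARRAY(20000000, 30000000, 10000));

Partition pruning would then work directly against DT_KEY predicates, with no need to rewrite queries to target the synthetic GOE_PART_10000_DT_KEY column.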