kite-sdk / kite

Kite SDK
http://kitesdk.org/docs/current/
Apache License 2.0

CDK-988. Add a long range partitioner with fixed size bounds. #355

Closed tomwhite closed 9 years ago

rdblue commented 9 years ago

@tomwhite, I'd like to do a release soon. Is this something you want to get in?

tomwhite commented 9 years ago

Thanks for the feedback @laserson and @rdblue! I have updated the PR to use the start value for each bucket in the range.
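For readers following along, here is a minimal sketch of what "use the start value for each bucket" could look like. It is not the code from this PR: the class name `FixedSizeRangeSketch` and the method `bucketStart` are made up for illustration, and it assumes buckets are aligned at multiples of the size.

```java
// Hypothetical illustration only; not the partitioner added in this PR.
public final class FixedSizeRangeSketch {
    private FixedSizeRangeSketch() {}

    /**
     * Returns the inclusive lower bound of the fixed-size bucket that
     * contains {@code value}, assuming buckets start at multiples of
     * {@code size}.
     */
    public static long bucketStart(long value, long size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // floorDiv rounds toward negative infinity, so negative values
        // fall into the bucket below zero rather than the one at zero.
        // Values within `size` of Long.MIN_VALUE could overflow; this
        // sketch does not handle that edge case.
        return Math.floorDiv(value, size) * size;
    }
}
```

With a size of 10, the value 25 would map to 20 and the value -1 would map to -10, so every value in a bucket shares the same partition value.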

I've also addressed the previous feedback with a couple of exceptions:

rdblue commented 9 years ago

One more note: any idea how to extend this for int, float, and double types? For the 4-byte types we could simply promote them and run the long/double partitioner?

I think we would need a separate partitioner for a double fixed range, which I think will be valuable for geo purposes since lat/long values are usually decimals. To do this we could just add a "type" option to the existing "range" JSON. Just want to hear your thoughts here so we have a plan for adding it later that doesn't conflict with this work.
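A rough sketch of the two ideas above, under stated assumptions: the class `PromotedRangeSketch` and its method names are hypothetical, none of this exists in Kite, and how the "type" option would be spelled in the JSON is left open. It shows 4-byte values being widened and reused with the 8-byte logic, plus a separate double variant.

```java
// Hypothetical illustration only; these names are not part of Kite.
public final class PromotedRangeSketch {
    private PromotedRangeSketch() {}

    /** int is widened to long (lossless), then bucketed like a long. */
    public static long bucketStartForInt(int value, long size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        return Math.floorDiv((long) value, size) * size;
    }

    /** Double variant: a fractional size such as 0.5 degrees is allowed. */
    public static double bucketStartForDouble(double value, double size) {
        if (!(size > 0.0)) { // also rejects NaN
            throw new IllegalArgumentException("size must be positive");
        }
        // Math.floor rounds toward negative infinity, as floorDiv does.
        return Math.floor(value / size) * size;
    }

    /** float is widened to double (lossless), then bucketed like a double. */
    public static double bucketStartForFloat(float value, double size) {
        return bucketStartForDouble((double) value, size);
    }
}
```

For a latitude of 37.77 with a bucket size of 0.5, the double variant would return 37.5. Floating-point rounding near bucket edges would need more care in a real implementation.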

tomwhite commented 9 years ago

@rdblue Thanks for all the feedback, and for the guidance on implementing the project methods. I've added tests for them now, and addressed all your other points.

Regarding a double fixed-size range: it should be straightforward to add in the future with a "type" option, as you suggest.

rdblue commented 9 years ago

I added a couple of minor comments, but this looks good overall.

rdblue commented 9 years ago

+1

Thanks, Tom!