NICTA / scoobi

A Scala productivity framework for Hadoop.
http://nicta.github.com/scoobi/
482 stars, 97 forks

Use an int instead of a double for shuffling key #244

Closed: espringe closed this issue 11 years ago

espringe commented 11 years ago

We decide the partition by calling .hashCode on the key, which returns an Int. Generating a shuffling key larger than an Int is therefore a waste of space and computation. Calling .hashCode on an Int is also much faster, since it simply returns the value rather than doing any actual hashing.
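A minimal sketch of the reasoning, not Scoobi's actual code: `ShuffleKeySketch` and `partition` are hypothetical names, and the partitioning rule is assumed to mirror Hadoop's default `HashPartitioner` (mask off the sign bit, then take the remainder). It illustrates that only 32 bits of the key ever reach the partitioner, and that an Int key's hash is the key itself.

```scala
// Hypothetical illustration; names and structure are assumptions, not Scoobi internals.
object ShuffleKeySketch {
  // Assumed rule, modeled on Hadoop's default HashPartitioner:
  // clear the sign bit of the 32-bit hash, then take it modulo the partition count.
  def partition(key: Any, numPartitions: Int): Int =
    (key.hashCode & Int.MaxValue) % numPartitions

  def main(args: Array[String]): Unit = {
    val rng = new scala.util.Random(42)

    // Int shuffling key: 4 bytes, and hashCode is just the value.
    val intKey: Int = rng.nextInt()
    assert(intKey.hashCode == intKey)

    // Double shuffling key: 8 bytes, but hashCode still yields only an Int,
    // so the extra 32 bits cannot influence which partition is chosen.
    val doubleKey: Double = rng.nextDouble()
    val doubleHash: Int = doubleKey.hashCode

    println(s"Int key partition:    ${partition(intKey, 8)}")
    println(s"Double key partition: ${partition(doubleKey, 8)} (via 32-bit hash $doubleHash)")
  }
}
```

Either key type ends up as a 32-bit hash before the remainder is taken, so under these assumptions an Int key spreads records over partitions just as well while halving the key size.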