NICTA / scoobi

A Scala productivity framework for Hadoop.
http://nicta.github.com/scoobi/

Fix DList Shuffle #245

Closed. espringe closed this issue 11 years ago

espringe commented 11 years ago

Thinking about it a bit more, my pull request #244 was a bit stupid, and I made it worse (but faster). While it ensures the values are sent to a random partition, it keeps their relative order when two keys collide (which they will do very often), and it pointlessly serializes junk.
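
To illustrate the collision problem described above, here is a minimal sketch in plain Scala collections. It is not Scoobi code and not the code from the pull request. The object `ShuffleSketch`, both method names, and the `numKeys` parameter are hypothetical, chosen only for this example. The first method tags each value with a random key and does a stable sort on that key, so values whose keys collide stay in their input order. The second shows one possible remedy, assumed here for illustration: also shuffling the values inside each key group.

```scala
import scala.util.Random

object ShuffleSketch {
  // Hypothetical helper: tag each value with a random key in [0, numKeys)
  // and order by key only. sortBy is stable, so values with equal keys
  // keep their original relative order.
  def keyOnlyShuffle[A](xs: Seq[A], numKeys: Int, rng: Random): Seq[A] =
    xs.map(x => (rng.nextInt(numKeys), x))
      .sortBy(_._1)
      .map(_._2)

  // Hypothetical variant: same random keys, but the values that share a key
  // are shuffled among themselves before being emitted.
  def keyThenLocalShuffle[A](xs: Seq[A], numKeys: Int, rng: Random): Seq[A] =
    xs.map(x => (rng.nextInt(numKeys), x))
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1)
      .flatMap { case (_, group) => rng.shuffle(group.map(_._2)) }
}
```

With `numKeys = 1` every key collides, so `keyOnlyShuffle` returns its input unchanged, which is the extreme case of the bias. With few keys relative to the number of values, long runs of the input order survive in the same way.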

espringe commented 11 years ago

Closing; the PR can be used for discussion.