lintool / bespin

Reference implementations of data-intensive algorithms in MapReduce and Spark
http://bespin.io/

Bespin scala enhancement - Initial Pass #2

Closed by moore-ryan 8 years ago

moore-ryan commented 8 years ago

This is the first step in porting more of the Bespin Java MapReduce code to Scala. It also includes an initial version of a small DSL (MapReduceSugar.scala) that lets MapReduce jobs be defined more naturally and in a type-safe manner.

This PR includes Scala implementations of the bigram count MapReduce programs. Output from the Scala implementations appears to match that of the Java implementations, modulo some ordering differences in the "Stripes" implementation.
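
For readers unfamiliar with the two bigram count formulations, here is a minimal in-memory sketch of why "Pairs" and "Stripes" output can agree on counts while differing in ordering. It is not the code in this PR: it uses plain Scala collections rather than Hadoop or MapReduceSugar.scala, and `BigramSketch` and its method names are hypothetical, chosen only for illustration.

```scala
// Illustrative sketch only: no Hadoop types and no MapReduceSugar.scala API.
// All names here are hypothetical and do not come from the PR.
object BigramSketch {
  // Consecutive token pairs within a single line of text.
  def bigrams(line: String): Seq[(String, String)] = {
    val tokens = line.trim.split("\\s+").filter(_.nonEmpty).toSeq
    tokens.zip(tokens.drop(1))
  }

  // "Pairs": one key per (left, right) bigram, with occurrences summed
  // the way a reducer would sum them.
  def pairs(lines: Seq[String]): Map[(String, String), Int] =
    lines.flatMap(bigrams).groupBy(identity).map { case (k, v) => k -> v.size }

  // "Stripes": one key per left word, whose value maps each right word
  // to its count.
  def stripes(lines: Seq[String]): Map[String, Map[String, Int]] =
    lines.flatMap(bigrams).groupBy(_._1).map { case (left, bs) =>
      left -> bs.groupBy(_._2).map { case (right, v) => right -> v.size }
    }

  // Flatten stripes into pair form so the two outputs can be compared
  // without depending on the order in which entries are emitted.
  def flatten(s: Map[String, Map[String, Int]]): Map[(String, String), Int] =
    s.flatMap { case (left, stripe) =>
      stripe.map { case (right, n) => (left, right) -> n }
    }
}
```

Comparing `BigramSketch.pairs(lines)` with `BigramSketch.flatten(BigramSketch.stripes(lines))` as maps ignores emission order, which is one way to check that two implementations agree up to ordering.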

moore-ryan commented 8 years ago

This PR is for progress tracking only; I will close this PR, rebase, and open a new PR in order to avoid cluttering commit history later.

moore-ryan commented 8 years ago

Closing this PR in favor of a new PR, which is simply a squashed version of these commits. This should limit growth of the commit history.