amplab / keystone

Simplifying robust end-to-end machine learning on Apache Spark.
http://keystone-ml.org/
Apache License 2.0
470 stars 117 forks source link

Added hashing trick nodes #193

Closed tomerk closed 8 years ago

tomerk commented 8 years ago

issue #184

etrain commented 8 years ago

This looks good to me - if you could clean up the comments and add a direct reference to the Murmur Hash 3 algorithm. Is there a reason we're reimplementing rather than linking to the one in the scala stdlib?

http://www.scala-lang.org/api/2.10.4/#scala.util.hashing.MurmurHash3$

tomerk commented 8 years ago

Yeah the code (and constants) come directly from the scala stdlib, but the scala stdlib marked these rolling hash methods as private[hashing]. I could make a util in the hashing namespace that exposes these methods to us, which may be preferable to the code & constants copying I've done here.

etrain commented 8 years ago

Copy/pasting is fine, then, I guess. Let's just leave a reference to where it came from in the code?

tomerk commented 8 years ago

Okay @etrain I added a reference to the scala stdlib class.

etrain commented 8 years ago

LGTM, merging this!