radanalyticsio / silex

something to help you spark
Apache License 2.0

VectorID - subclasses for MLLib DenseVector and SparseVector that carry ... #2

Closed erikerlandson closed 8 years ago

erikerlandson commented 9 years ago

...an identifier payload

erikerlandson commented 9 years ago

or is it more standard to use '.vectorid'?

erikerlandson commented 9 years ago

I have a more flexible and yet simpler way to do this:

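import org.apache.spark.mllib.linalg.{DenseVector, SparseVector}

// Dense MLlib vector that also carries a tag (e.g. an identifier) of any type T
case class DenseTaggedVector[+T](data: Array[Double], tag: T)
  extends DenseVector(data)

// Sparse MLlib vector (size, indices, values) that also carries a tag of type T
case class SparseTaggedVector[+T](sz: Int, idx: Array[Int], data: Array[Double], tag: T)
  extends SparseVector(sz, idx, data)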
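
A rough usage sketch of the idea above, assuming spark-mllib is on the classpath. The two class definitions are repeated so the snippet stands alone; the `TaggedVectorSketch` object, the `tagOf` helper, and the sample tags are hypothetical names made up for illustration. It shows a tagged vector being passed around as a plain MLlib `Vector` and the tag being recovered afterwards by pattern matching:

```scala
import org.apache.spark.mllib.linalg.{DenseVector, SparseVector, Vector}

case class DenseTaggedVector[+T](data: Array[Double], tag: T)
  extends DenseVector(data)

case class SparseTaggedVector[+T](sz: Int, idx: Array[Int], data: Array[Double], tag: T)
  extends SparseVector(sz, idx, data)

object TaggedVectorSketch {
  // Hypothetical helper: recover the tag from a plain Vector reference, if it has one.
  def tagOf(v: Vector): Option[Any] = v match {
    case DenseTaggedVector(_, tag)        => Some(tag)
    case SparseTaggedVector(_, _, _, tag) => Some(tag)
    case _                                => None
  }

  def main(args: Array[String]): Unit = {
    // The tag type is a free parameter: a String id here, a Long id below.
    val a: Vector = DenseTaggedVector(Array(1.0, 0.0, 3.0), "row-17")
    val b: Vector = SparseTaggedVector(3, Array(0, 2), Array(1.0, 3.0), 42L)

    // Tagged vectors behave as ordinary MLlib vectors.
    println(s"a.size = ${a.size}, b.size = ${b.size}")
    println(s"a(2) = ${a(2)}, b(2) = ${b(2)}")

    // The tag survives the upcast to Vector and can be matched back out.
    println(s"tag of a: ${tagOf(a)}")
    println(s"tag of b: ${tagOf(b)}")
  }
}
```

One caveat worth checking before relying on this: since `DenseVector` and `SparseVector` already define `equals`, `hashCode`, `toString` and `copy`, the case classes should inherit those rather than get generated versions, so the tag would not take part in equality and `copy` would return an untagged vector.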